2010
Farré, Domènec, Albà, M Mar: Heterogeneous patterns of gene-expression diversification in mammalian gene duplicates. (Article) Molecular biology and evolution, 27 (2), pp. 325–35, 2010, ISSN: 1537-1719.

@article{Farre2010,
  title = {Heterogeneous patterns of gene-expression diversification in mammalian gene duplicates.},
  author = {Farré, Domènec and Albà, M Mar},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/19822635},
  issn = {1537-1719},
  year = {2010},
  date = {2010-01-01},
  journal = {Molecular biology and evolution},
  volume = {27},
  number = {2},
  pages = {325--35},
  abstract = {Gene duplication is a major mechanism for molecular evolutionary innovation. Young gene duplicates typically exhibit elevated rates of protein evolution and, according to a number of recent studies, increased expression divergence. However, the nature of these changes is still poorly understood. To gain novel insights into the functional consequences of gene duplication, we have undertaken an in-depth analysis of a large data set of gene families containing primate- and/or rodent-specific gene duplicates. We have found a clear tendency toward an increase in protein, promoter, and expression divergence with increasing number of duplication events undergone by each gene since the human-mouse split. In addition, gene duplication is significantly associated with a reduction in expression breadth and intensity. Interestingly, it is possible to identify three main groups regarding the evolution of gene expression following gene duplication. The first group, which comprises around 25% of the families, shows patterns compatible with tissue-expression partitioning. The second and largest group, comprising 33-53% of the families, shows broad expression of one of the gene copies and reduced, overlapping, expression of the other copy or copies. This can be attributed, in most cases, to loss of expression in several tissues of one or more gene copies. Finally, a substantial number of families, 19-35%, maintain a very high level of tissue-expression overlap (>0.8) after tens of millions of years of evolution. These families may have been subject to selection for increased gene dosage.},
  keywords = {Animals, Evolution, Gene Duplication, Genetic, Humans, Mammals, Mammals: genetics, Models, Molecular}
}
2004
Gibbs, Richard A, et al.: Genome sequence of the Brown Norway rat yields insights into mammalian evolution. (Article) Nature, 428 (6982), pp. 493–521, 2004, ISSN: 1476-4687.

@article{Gibbs2004,
  title = {Genome sequence of the Brown Norway rat yields insights into mammalian evolution.},
  author = {Gibbs, Richard A and others},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/15057822},
  issn = {1476-4687},
  year = {2004},
  date = {2004-01-01},
  journal = {Nature},
  volume = {428},
  number = {6982},
  pages = {493--521},
  abstract = {The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.},
  keywords = {Animals, Base Composition, Centromere, Centromere: genetics, Chromosomes, CpG Islands, CpG Islands: genetics, DNA, DNA Transposable Elements, DNA Transposable Elements: genetics, Evolution, Gene Duplication, Genome, Genomics, Humans, Inbred BN, Inbred BN: genetics, Introns, Introns: genetics, Male, Mammalian, Mammalian: genetics, Mice, Mitochondrial, Mitochondrial: genetics, Models, Molecular, Mutagenesis, Nucleic Acid, Nucleic Acid: genetics, Polymorphism, Rats, Regulatory Sequences, Retroelements, Retroelements: genetics, RNA, RNA Splice Sites, RNA Splice Sites: genetics, Sequence Analysis, Single Nucleotide, Single Nucleotide: genetics, Telomere, Telomere: genetics, Untranslated, Untranslated: genetics}
}
Castresana, Jose, Guigó, Roderic, Albà, M Mar: Clustering of genes coding for DNA binding proteins in a region of atypical evolution of the human genome. (Article) Journal of molecular evolution, 59 (1), pp. 72–9, 2004, ISSN: 0022-2844.

@article{Castresana2004,
  title = {Clustering of genes coding for DNA binding proteins in a region of atypical evolution of the human genome.},
  author = {Castresana, Jose and Guigó, Roderic and Albà, M Mar},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/15383909},
  issn = {0022-2844},
  year = {2004},
  date = {2004-01-01},
  journal = {Journal of molecular evolution},
  volume = {59},
  number = {1},
  pages = {72--9},
  abstract = {Comparison of the human and mouse genomes has revealed that significant variations in evolutionary rates exist among genomic regions and that a large part of this variation is interchromosomal. We confirm in this work, using a large collection of introns, that human chromosome 19 is the one that shows the highest divergence with respect to mouse. To search for other differences among chromosomes, we examine the distribution of gene functions in human and mouse chromosomes using the Gene Ontology definitions. We found by correspondence analysis that among the strongest clusterings of gene functions in human chromosomes is a group of genes coding for DNA binding proteins in chromosome 19. Interestingly, chromosome 19 also has a very high GC content, a feature that has been proposed to promote an opening of the chromatin, thereby facilitating binding of proteins to the DNA helix. In the mouse genome, however, a similar aggregation of genes coding for DNA binding proteins and high GC content cannot be found. This suggests that the distribution of genes coding for DNA binding proteins and the variations of the chromatin accessibility to these proteins are different in the human and mouse genomes. It is likely that the overall high synonymous and intron rates in chromosome 19 are a by-product of the high GC content of this chromosome.},
  keywords = {Base Composition, Base Composition: genetics, Chromatin, Chromatin: metabolism, Chromosomes, Computational Biology, Databases, DNA-Binding Proteins, DNA-Binding Proteins: genetics, DNA-Binding Proteins: metabolism, Evolution, Genetic, Genome, human, Humans, Introns, Introns: genetics, Models, Molecular, Multigene Family, Multigene Family: genetics, Pair 19, Pair 19: genetics, Phylogeny, Zinc Fingers, Zinc Fingers: genetics}
}
2002
Albà, M Mar, Laskowski, Roman A, Hancock, John M: Detecting cryptically simple protein sequences using the SIMPLE algorithm. (Article) Bioinformatics (Oxford, England), 18 (5), pp. 672–8, 2002, ISSN: 1367-4803.

@article{Alba2002,
  title = {Detecting cryptically simple protein sequences using the SIMPLE algorithm.},
  author = {Albà, M Mar and Laskowski, Roman A and Hancock, John M},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/12050063},
  issn = {1367-4803},
  year = {2002},
  date = {2002-01-01},
  journal = {Bioinformatics (Oxford, England)},
  volume = {18},
  number = {5},
  pages = {672--8},
  abstract = {Low-complexity or cryptically simple sequences are widespread in protein sequences but their evolution and function are poorly understood. To date methods for the detection of low complexity in proteins have been directed towards the filtering of such regions prior to sequence homology searches but not to the analysis of the regions per se. However, many of these regions are encoded by non-repetitive DNA sequences and may therefore result from selection acting on protein structure and/or function.},
  keywords = {Algorithms, Amino Acid, Amino Acid Sequence, Amino Acid: genetics, Databases, Genetic, Genetic Variation, Internet, Minisatellite Repeats, Minisatellite Repeats: genetics, Models, Molecular Sequence Data, Protein, Protein: methods, Proteins, Proteins: chemistry, Repetitive Sequences, Saccharomyces cerevisiae, Saccharomyces cerevisiae: genetics, Sensitivity and Specificity, Sequence Analysis, Sequence Homology, Software, Statistical}
}
Publication List