2011
Toll-Riera, Macarena, Laurie, Steve, Albà, M Mar: Lineage-specific variation in intensity of natural selection in mammals. (Article) Molecular Biology and Evolution, 28 (1), pp. 383–98, 2011, ISSN: 1537-1719.

@article{Toll-Riera2011a,
  title = {Lineage-specific variation in intensity of natural selection in mammals.},
  author = {Toll-Riera, Macarena and Laurie, Steve and Albà, M Mar},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/20688808},
  issn = {1537-1719},
  year = {2011},
  date = {2011-01-01},
  journal = {Molecular Biology and Evolution},
  volume = {28},
  number = {1},
  pages = {383--98},
  abstract = {The molecular clock hypothesis states that protein-coding genes evolve at an approximately constant rate. However, this is only expected to be true as long as the function and the tertiary structure of the molecule remain unaltered. An important implication of this statement is that significant deviations in the rate of evolution of a gene with respect to the species clock are likely to reflect functional and/or structural alterations. Here, we present a method to identify such deviations and apply it to a data set of 2,929 high-quality coding sequence alignments corresponding to one-to-one orthologous genes from six mammalian species--human, macaque, mouse, rat, cow, and dog. Deviated branches are defined as those that present significant alterations in both the rate of nonsynonymous substitutions (dN) and the selective pressure (dN/dS). Strikingly, we find that as many as 24.5% of the genes show branch-specific deviations in dN and dN/dS, though this is a relatively well-conserved set of genes. Around half of these genes show branch-specific acceleration of evolutionary rates. Positive selection (PS) tests based on divergence data only identify 17.7% of the accelerated branches. Failure to identify PS in accelerated branches with an excess of radical amino acid replacements suggests that these tests are conservative. Interestingly, genes with accelerated branches are significantly enriched in neural proteins, indicating that this type of protein might play a more important role than previously thought in species diversification, although they are generally not detected by PS tests. We discuss in detail several examples of genes that show lineage-specific evolutionary rate acceleration and are involved in synaptic transmission, chemosensory perception, and ubiquitination.},
  keywords = {Amino Acid Sequence, Amino Acid Substitution, Animals, Evolution, F-Box Proteins, F-Box Proteins: genetics, G-Protein-Coupled, G-Protein-Coupled: genetics, Genetic, Genetic Variation, Humans, Mammals, Mammals: genetics, Molecular, Molecular Sequence Data, N-Methyl-D-Aspartate, N-Methyl-D-Aspartate: genetics, Odorant, Odorant: genetics, Receptors, Selection, Sequence Alignment}
}
2010
Mularoni, Loris, Ledda, Alice, Toll-Riera, Macarena, Albà, M Mar: Natural selection drives the accumulation of amino acid tandem repeats in human proteins. (Article) Genome Research, 20 (6), pp. 745–54, 2010, ISSN: 1549-5469.

@article{Mularoni2010,
  title = {Natural selection drives the accumulation of amino acid tandem repeats in human proteins.},
  author = {Mularoni, Loris and Ledda, Alice and Toll-Riera, Macarena and Albà, M Mar},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2877571&tool=pmcentrez&rendertype=abstract},
  issn = {1549-5469},
  year = {2010},
  date = {2010-01-01},
  journal = {Genome Research},
  volume = {20},
  number = {6},
  pages = {745--54},
  abstract = {Amino acid tandem repeats are found in a large number of eukaryotic proteins. They are often encoded by trinucleotide repeats and exhibit high intra- and interspecies size variability due to the high mutation rate associated with replication slippage. The extent to which natural selection is important in shaping amino acid repeat evolution is a matter of debate. On one hand, their high frequency may simply reflect their high probability of expansion by slippage, and they could essentially evolve in a neutral manner. On the other hand, there is experimental evidence that changes in repeat size can influence protein-protein interactions, transcriptional activity, or protein subcellular localization, indicating that repeats could be functionally relevant and thus shaped by selection. To gauge the relative contribution of neutral and selective forces in amino acid repeat evolution, we have performed a comparative analysis of amino acid repeat conservation in a large set of orthologous proteins from 12 vertebrate species. As a neutral model of repeat evolution we have used sequences with the same DNA triplet composition as the coding sequences--and thus expected to be subject to the same mutational forces--but located in syntenic noncoding genomic regions. The results strongly indicate that selection has played a more important role than previously suspected in amino acid tandem repeat evolution, by increasing the repeat retention rate and by modulating repeat size. The data obtained in this study have allowed us to identify a set of 92 repeats that are postulated to play important functional roles due to their strong selective signature, including five cases with direct experimental evidence.},
  keywords = {Amino Acid, Amino Acid Sequence, Amino Acids, Amino Acids: chemistry, Amino Acids: genetics, Animals, Genetic, Humans, Molecular Sequence Data, Proteins, Proteins: chemistry, Proteins: genetics, Repetitive Sequences, Selection, Sequence Homology}
}
2007
Mularoni, Loris, Veitia, Reiner A, Albà, M Mar: Highly constrained proteins contain an unexpectedly large number of amino acid tandem repeats. (Article) Genomics, 89 (3), pp. 316–25, 2007, ISSN: 0888-7543.

@article{Mularoni2007,
  title = {Highly constrained proteins contain an unexpectedly large number of amino acid tandem repeats.},
  author = {Mularoni, Loris and Veitia, Reiner A and Albà, M Mar},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/17196365},
  issn = {0888-7543},
  year = {2007},
  date = {2007-01-01},
  journal = {Genomics},
  volume = {89},
  number = {3},
  pages = {316--25},
  abstract = {Single-amino-acid tandem repeats are very common in mammalian proteins but their function and evolution are still poorly understood. Here we investigate how the variability and prevalence of amino acid repeats are related to the evolutionary constraints operating on the proteins. We find a significant positive correlation between repeat size difference and protein nonsynonymous substitution rate in human and mouse orthologous genes. This association is observed for all the common amino acid repeat types and indicates that rapid diversification of repeat structures, involving both trinucleotide slippage and nucleotide substitutions, preferentially occurs in proteins subject to low selective constraints. However, strikingly, we also observe a significant negative correlation between the number of repeats in a protein and the gene nonsynonymous substitution rate, particularly for glutamine, glycine, and alanine repeats. This implies that proteins subject to strong selective constraints tend to contain an unexpectedly high number of repeats, which tend to be well conserved between the two species. This is consistent with a role for selection in the maintenance of a significant number of repeats. Analysis of the codon structure of the sequences encoding the repeats shows that codon purity is associated with high repeat size interspecific variability. Interestingly, polyalanine and polyglutamine repeats associated with disease show very distinctive features regarding the degree of repeat conservation and the protein sequence selective constraints.},
  keywords = {Amino Acid, Amino Acid Sequence, Animals, Complementary, Conserved Sequence, DNA, Evolution, Genetic, Humans, Mice, Molecular, Point Mutation, Proteins, Proteins: chemistry, Proteins: genetics, Repetitive Sequences, Selection, Trinucleotide Repeats}
}
2006
Furney, Simon J, Albà, M Mar, López-Bigas, Núria: Differences in the evolutionary history of disease genes affected by dominant or recessive mutations. (Article) BMC Genomics, 7, pp. 165, 2006, ISSN: 1471-2164.

@article{Furney2006,
  title = {Differences in the evolutionary history of disease genes affected by dominant or recessive mutations.},
  author = {Furney, Simon J and Albà, M Mar and López-Bigas, Núria},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1534034&tool=pmcentrez&rendertype=abstract},
  issn = {1471-2164},
  year = {2006},
  date = {2006-01-01},
  journal = {BMC Genomics},
  volume = {7},
  pages = {165},
  abstract = {Global analyses of human disease genes by computational methods have yielded important advances in the understanding of human diseases. Generally these studies have treated the group of disease genes uniformly, thus ignoring the type of disease-causing mutations (dominant or recessive). In this report we present a comprehensive study of the evolutionary history of autosomal disease genes separated by mode of inheritance.},
  keywords = {Amino Acid, Animals, Caenorhabditis elegans, Caenorhabditis elegans: genetics, Computational Biology, Conserved Sequence, Dominant, Essential, Evolution, Genes, Genetic, Genetic Diseases, Genetic Structures, Humans, Inborn, Inborn: classification, Inborn: genetics, Mice, Molecular, Mutation, Pan troglodytes, Pan troglodytes: genetics, Recessive, Selection, Sequence Homology}
}
2004
Huang, Hui, Winter, Eitan E, Wang, Huajun, Weinstock, Keith G, Xing, Heming, Goodstadt, Leo, Stenson, Peter D, Cooper, David N, Smith, Douglas, Albà, M Mar, Ponting, Chris P, Fechtel, Kim: Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes. (Article) Genome Biology, 5 (7), pp. R47, 2004, ISSN: 1465-6914.

@article{Huang2004,
  title = {Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes.},
  author = {Huang, Hui and Winter, Eitan E and Wang, Huajun and Weinstock, Keith G and Xing, Heming and Goodstadt, Leo and Stenson, Peter D and Cooper, David N and Smith, Douglas and Albà, M Mar and Ponting, Chris P and Fechtel, Kim},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=463309&tool=pmcentrez&rendertype=abstract},
  issn = {1465-6914},
  year = {2004},
  date = {2004-01-01},
  journal = {Genome Biology},
  volume = {5},
  number = {7},
  pages = {R47},
  abstract = {Model organisms have contributed substantially to our understanding of the etiology of human disease as well as having assisted with the development of new treatment modalities. The availability of the human, mouse and, most recently, the rat genome sequences now permit the comprehensive investigation of the rodent orthologs of genes associated with human disease. Here, we investigate whether human disease genes differ significantly from their rodent orthologs with respect to their overall levels of conservation and their rates of evolutionary change.},
  keywords = {Amino Acid, Amino Acid: genetics, Animal, Animals, Chromosome Mapping, Chromosome Mapping: methods, Conserved Sequence, Conserved Sequence: genetics, Disease Models, Evolution, Fishes, Fishes: genetics, Fungal, Fungal: genetics, Genes, Genes: genetics, Genes: physiology, Genetic, Genetic Diseases, Genome, Helminth, Helminth: genetics, human, Humans, Inborn, Inborn: genetics, Inborn: physiopathology, Insect, Insect: genetics, Mice, Molecular, Mutagenesis, Mutagenesis: genetics, Nucleic Acid, Nucleotides, Nucleotides: genetics, Point Mutation, Point Mutation: genetics, Rats, Repetitive Sequences, Selection, Sequence Homology, Trinucleotide Repeat Expansion, Trinucleotide Repeat Expansion: genetics}
}