2004
Albà, M Mar, Guigó, Roderic. Comparative analysis of amino acid repeats in rodents and humans. (Article) Genome research, 14 (4), pp. 549–54, 2004, ISSN: 1088-9051.

@article{Alba2004,
  title = {Comparative analysis of amino acid repeats in rodents and humans.},
  author = {Albà, M Mar and Guigó, Roderic},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=383298&tool=pmcentrez&rendertype=abstract},
  issn = {1088-9051},
  year = {2004},
  date = {2004-01-01},
  journal = {Genome research},
  volume = {14},
  number = {4},
  pages = {549--54},
  abstract = {Amino acid tandem repeats, also called homopolymeric tracts, are extremely abundant in eukaryotic proteins. To gain insight into the genome-wide evolution of these regions in mammals, we analyzed the repeat content in a large data set of rat-mouse-human orthologs. Our results show that human proteins contain more amino acid repeats than rodent proteins and that trinucleotide repeats are also more abundant in human coding sequences. Using the human species as an outgroup, we were able to address differences in repeat loss and repeat gain in the rat and mouse lineages. In this data set, mouse proteins contain substantially more repeats than rat proteins, which can be at least partly attributed to a higher repeat loss in the rat lineage. The data are consistent with a role for trinucleotide slippage in the generation of novel amino acid repeats. We confirm the previously observed functional bias of proteins with repeats, with overrepresentation of transcription factors and DNA-binding proteins. We show that genes encoding amino acid repeats tend to have an unusually high GC content, and that differences in coding GC content among orthologs are directly related to the presence/absence of repeats. We propose that the different GC content isochore structure in rodents and humans may result in an increased amino acid repeat prevalence in the human lineage.},
  keywords = {Amino Acid, Amino Acid: genetics, Amino Acid: physiology, Animals, Chromosome Mapping, Chromosome Mapping: methods, Chromosome Mapping: statistics & numerical data, Computational Biology, Computational Biology: methods, Computational Biology: statistics & numerical data, GC Rich Sequence, GC Rich Sequence: genetics, Humans, Mice, Proteins, Proteins: chemistry, Proteins: genetics, Proteins: physiology, Rats, Repetitive Sequences, Trinucleotide Repeats, Trinucleotide Repeats: genetics}
}