2018
Jorge Ruiz-Orera, Pol Verdaguer-Grau, José Luis Villanueva-Cañas, Xavier Messeguer, M. Mar Albà
Translation of neutrally evolving peptides provides a basis for de novo gene evolution (Article) Nature Ecology and Evolution, 2, pp. 890–896, 2018.
@article{Ruiz-Orera2018,
  title = {Translation of neutrally evolving peptides provides a basis for de novo gene evolution},
  author = {Jorge Ruiz-Orera and Pol Verdaguer-Grau and José Luis Villanueva-Cañas and Xavier Messeguer and M. Mar Albà},
  url = {https://www.nature.com/articles/s41559-018-0506-6},
  year = {2018},
  date = {2018-03-19},
  journal = {Nature Ecology and Evolution},
  volume = {2},
  pages = {890--896},
  abstract = {Accumulating evidence indicates that some protein-coding genes have originated de novo from previously non-coding genomic sequences. However, the processes underlying de novo gene birth are still enigmatic. In particular, the appearance of a new functional protein seems highly improbable unless there is already a pool of neutrally evolving peptides that are translated at significant levels and that can at some point acquire new functions. Here, we use deep ribosome-profiling sequencing data, together with proteomics and single nucleotide polymorphism information, to search for these peptides. We find hundreds of open reading frames that are translated and that show no evolutionary conservation or selective constraints. These data suggest that the translation of these neutrally evolving peptides may be facilitated by the chance occurrence of open reading frames with a favourable codon composition. We conclude that the pervasive translation of the transcriptome provides plenty of material for the evolution of new functional proteins.},
  keywords = {codon usage bias, de novo gene, natural selection, ribosome profiling}
}
2017
Jorge Ruiz-Orera, José Luis Villanueva-Cañas, William Blevins, M. Mar Albà
De novo gene evolution: How do we transition from non-coding to coding? (Conference) PeerJ Preprints, 5 (e3031v2), 2017 (The SMBE 2017 Collection).
@conference{Ruiz-Orera2017,
  title = {De novo gene evolution: How do we transition from non-coding to coding?},
  author = {Jorge Ruiz-Orera and José Luis Villanueva-Cañas and William Blevins and M. Mar Albà},
  url = {https://doi.org/10.7287/peerj.preprints.3031v2},
  year = {2017},
  date = {2017-06-28},
  journal = {PeerJ Preprints},
  volume = {5},
  number = {e3031v2},
  abstract = {Recent years have witnessed the discovery of protein-coding genes which appear to have evolved de novo from previously non-coding sequences. This has changed the long-standing view that coding sequences can only evolve from other coding sequences. However, there are still many open questions regarding how new protein-coding sequences can arise from non-genic DNA. Two prerequisites for the birth of a new functional protein-coding gene are that the corresponding DNA fragment is transcribed and that it is also translated. Transcription is known to be pervasive in the genome, producing a large number of transcripts that do not correspond to conserved protein-coding genes, and which are usually annotated as long non-coding RNAs (lncRNAs). Recently, sequencing of ribosome-protected fragments (Ribo-Seq) has provided evidence that many of these transcripts actually translate small proteins. We have used mouse non-synonymous and synonymous variation data to estimate the strength of purifying selection acting on the translated open reading frames (ORFs). Whereas a subset of the lncRNAs are likely to actually be true protein-coding genes (and thus previously misclassified), the bulk of lncRNAs code for proteins which show variation patterns consistent with neutral evolution. We also show that the ORFs that have a more favorable, coding-like sequence composition are more likely to be translated than other ORFs in lncRNAs. This study provides strong evidence that there is a large and ever-changing reservoir of lowly abundant proteins; some of these peptides may become useful and act as seeds for de novo gene evolution.},
  note = {The SMBE 2017 Collection},
  keywords = {de novo gene, long non-coding RNA, Ribo-Seq, ribosome profiling}
}
2016
Jorge Ruiz-Orera, Pol Verdaguer-Grau, José Luis Villanueva-Cañas, Xavier Messeguer, M. Mar Albà
Functional and non-functional classes of peptides produced by long non-coding RNAs (Article) bioRxiv, 2016, DOI: http://dx.doi.org/10.1101/064915.
@article{Ruiz-Orera2016,
  title = {Functional and non-functional classes of peptides produced by long non-coding RNAs},
  author = {Jorge Ruiz-Orera and Pol Verdaguer-Grau and José Luis Villanueva-Cañas and Xavier Messeguer and M. Mar Albà},
  url = {http://biorxiv.org/content/early/2016/07/21/064915},
  doi = {10.1101/064915},
  year = {2016},
  date = {2016-07-21},
  journal = {bioRxiv},
  abstract = {Cells express thousands of transcripts that show weak coding potential. Known as long non-coding RNAs (lncRNAs), they typically contain short open reading frames (ORFs) having no homology with known proteins. Recent studies have reported that a significant proportion of lncRNAs are translated, challenging the view that they are essentially non-coding. These results are based on the selective sequencing of ribosome-protected fragments, or ribosome profiling. The present study used ribosome profiling data from eight mouse tissues and cell types, combined with ~330,000 synonymous and non-synonymous single nucleotide variants, to dissect the biological implications of lncRNA translation. Using the three-nucleotide read periodicity that characterizes actively translated regions, we found that about 23% of the transcribed lncRNAs were translated (1,365 out of 6,390). About one fourth of the translated sequences (350 lncRNAs) showed conservation in humans; these are likely to produce functional micropeptides, including the recently discovered myoregulin. For other lncRNAs, the ORF codon usage bias distinguishes between two classes. The first has significant coding scores and contains functional proteins which are not conserved in humans. The second large class, comprising >500 lncRNAs, produces proteins that show no significant purifying selection signatures. We showed that the neutral translation of these lncRNAs depends on the transcript expression level and the chance occurrence of ORFs with a favorable codon composition. This provides the first evidence to date that many lncRNAs produce non-functional proteins.},
  keywords = {long non-coding RNA, micropeptide, mouse, ribosome profiling, smORF, translation}
}
2015
Jorge Ruiz-Orera, Jessica Hernandez-Rodriguez, Cristina Chiva, Eduard Sabidó, Ivanela Kondova, Ronald Bontrop, Tomàs Marqués-Bonet, M. Mar Albà
Origins of de novo genes in human and chimpanzee (Article) PLOS Genetics, 11 (12), pp. e1005721, 2015.
@article{Ruiz-Orera2015b,
  title = {Origins of de novo genes in human and chimpanzee},
  author = {Ruiz-Orera, Jorge and Hernandez-Rodriguez, Jessica and Chiva, Cristina and Sabidó, Eduard and Kondova, Ivanela and Bontrop, Ronald and Marqués-Bonet, Tomàs and Albà, M. Mar},
  url = {http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005721},
  year = {2015},
  date = {2015-12-31},
  journal = {PLOS Genetics},
  volume = {11},
  number = {12},
  pages = {e1005721},
  keywords = {chimpanzee, de novo gene, Evolution, Humans, lncRNA, Promoter, proteomics, ribosome profiling, RNA-Seq, transcription factor binding site, transcriptomics}
}