2020
Jorge Ruiz-Orera, José Luis Villanueva-Cañas, M. Mar Albà
Evolution of New Proteins From Translated sORFs in Long Non-Coding RNAs (Article)
Experimental Cell Research, 391 (1), pp. 111940, 2020.
@article{Ruiz-Orera2020,
  title = {Evolution of New Proteins From Translated sORFs in Long Non-Coding RNAs},
  author = {Jorge Ruiz-Orera and José Luis Villanueva-Cañas and M. Mar Albà},
  url = {https://www.sciencedirect.com/science/article/abs/pii/S0014482720301452?via%3Dihub},
  year = {2020},
  date = {2020-03-07},
  journal = {Experimental Cell Research},
  volume = {391},
  number = {1},
  pages = {111940},
  abstract = {High throughput RNA sequencing techniques have revealed that a large fraction of the genome is transcribed into long non-coding RNAs (lncRNAs). Unlike canonical protein-coding genes, lncRNAs do not contain long open reading frames (ORFs) and tend to be poorly conserved across species. However, many of them contain small ORFs (sORFs) that exhibit translation signatures according to ribosome profiling or proteomics data. These sORFs are a source of putative novel proteins; some of them may confer a selective advantage and be maintained over time, a process known as de novo gene birth. Here we review the mechanisms by which randomly occurring sORFs in lncRNAs can become new functional proteins.},
  keywords = {de novo gene, lncRNA, microprotein, ribosome profiling, sORF}
}
2019
William R. Blevins, Teresa Tavella, Simone G. Moro, Bernat Blasco-Moreno, Adrià Closa-Mosquera, Juana Díez, Lucas B. Carey, M. Mar Albà
Extensive post-transcriptional buffering of gene expression in the response to severe oxidative stress in baker's yeast (Article)
Scientific Reports, 9, pp. 11005, 2019.
@article{Blevins2019_2,
  title = {Extensive post-transcriptional buffering of gene expression in the response to severe oxidative stress in baker's yeast},
  author = {William R. Blevins and Teresa Tavella and Simone G. Moro and Bernat Blasco-Moreno and Adrià Closa-Mosquera and Juana Díez and Lucas B. Carey and M. Mar Albà},
  url = {https://www.nature.com/articles/s41598-019-47424-w},
  year = {2019},
  date = {2019-07-29},
  journal = {Scientific Reports},
  volume = {9},
  pages = {11005},
  keywords = {oxidative stress, proteomics, ribosome profiling, RNA-Seq, translation regulation, yeast}
}
Marina Reixachs-Sole, Jorge Ruiz-Orera, M. Mar Albà, Eduardo Eyras
Ribosome profiling at isoform level reveals an evolutionary conserved impact of differential splicing on the proteome (Article)
bioRxiv, March 19, 2019.
@article{Reixachs-Sole2019,
  title = {Ribosome profiling at isoform level reveals an evolutionary conserved impact of differential splicing on the proteome},
  author = {Marina Reixachs-Sole and Jorge Ruiz-Orera and M. Mar Albà and Eduardo Eyras},
  url = {https://doi.org/10.1101/582031},
  year = {2019},
  date = {2019-03-19},
  journal = {bioRxiv},
  abstract = {The differential production of transcript isoforms from gene loci is a key mechanism in multiple biological processes and pathologies. Although this has been exhaustively shown at RNA level, it remains elusive at protein level. Here, we describe a new pipeline ORQAS (ORF quantification pipeline for alternative splicing) for the translation quantification of individual transcript isoforms using ribosome-protected mRNA fragments (Ribosome profiling). We found evidence of translation for 40-50% of the expressed transcript isoforms in human and 50% in mouse, with 53% of the expressed genes having more than one translated isoform in human, and 33% in mouse. Differential analysis revealed that about 40% of the splicing changes measured at RNA level in human were concordant with changes in translation; and that 21.7% of changes measured at RNA level, and 17.8% at translation level, were conserved between human and mouse. Furthermore, orthologous cassette exons preserving the directionality of the change were found enriched in microexons in a comparison between glia and glioma in both, and were conserved between human and mouse. In summary, we established a moderate but widespread impact of differential splicing in the translation of isoforms and found evidence of an impact on the translation of microexons as a consequence of differential splicing. ORQAS is available at https://github.com/comprna/orqas.},
  keywords = {human, isoform, mouse, nervous system, ribosome profiling}
}
2018
Jorge Ruiz-Orera, Pol Verdaguer-Grau, José Luis Villanueva-Cañas, Xavier Messeguer, M. Mar Albà
Translation of neutrally evolving peptides provides a basis for de novo gene evolution (Article)
Nature Ecology and Evolution, 2, pp. 890–896, 2018.
@article{Ruiz-Orera2018,
  title = {Translation of neutrally evolving peptides provides a basis for de novo gene evolution},
  author = {Jorge Ruiz-Orera and Pol Verdaguer-Grau and José Luis Villanueva-Cañas and Xavier Messeguer and M. Mar Albà},
  url = {https://www.nature.com/articles/s41559-018-0506-6},
  year = {2018},
  date = {2018-03-19},
  journal = {Nature Ecology and Evolution},
  volume = {2},
  pages = {890--896},
  abstract = {Accumulating evidence indicates that some protein-coding genes have originated de novo from previously non-coding genomic sequences. However, the processes underlying de novo gene birth are still enigmatic. In particular, the appearance of a new functional protein seems highly improbable unless there is already a pool of neutrally evolving peptides that are translated at significant levels and that can at some point acquire new functions. Here, we use deep ribosome-profiling sequencing data, together with proteomics and single nucleotide polymorphism information, to search for these peptides. We find hundreds of open reading frames that are translated and that show no evolutionary conservation or selective constraints. These data suggest that the translation of these neutrally evolving peptides may be facilitated by the chance occurrence of open reading frames with a favourable codon composition. We conclude that the pervasive translation of the transcriptome provides plenty of material for the evolution of new functional proteins.},
  keywords = {codon usage bias, de novo gene, natural selection, ribosome profiling}
}
2017
Jorge Ruiz-Orera, José Luis Villanueva-Cañas, William Blevins, M. Mar Albà
De novo gene evolution: How do we transition from non-coding to coding? (Conference)
PeerJ Preprints, 5 (e3031v2), 2017, (The SMBE 2017 Collection).
@conference{Ruiz-Orera2017,
  title = {De novo gene evolution: How do we transition from non-coding to coding?},
  author = {Jorge Ruiz-Orera and José Luis Villanueva-Cañas and William Blevins and M. Mar Albà},
  url = {https://doi.org/10.7287/peerj.preprints.3031v2},
  year = {2017},
  date = {2017-06-28},
  journal = {PeerJ Preprints},
  volume = {5},
  number = {e3031v2},
  abstract = {Recent years have witnessed the discovery of protein-coding genes which appear to have evolved de novo from previously non-coding sequences. This has changed the long-standing view that coding sequences can only evolve from other coding sequences. However, there are still many open questions regarding how new protein-coding sequences can arise from non-genic DNA. Two prerequisites for the birth of a new functional protein-coding gene are that the corresponding DNA fragment is transcribed and that it is also translated. Transcription is known to be pervasive in the genome, producing a large number of transcripts that do not correspond to conserved protein-coding genes, and which are usually annotated as long non-coding RNAs (lncRNA). Recently, sequencing of ribosome protected fragments (Ribo-Seq) has provided evidence that many of these transcripts actually translate small proteins. We have used mouse non-synonymous and synonymous variation data to estimate the strength of purifying selection acting on the translated open reading frames (ORFs). Whereas a subset of the lncRNAs are likely to actually be true protein-coding genes (and thus previously misclassified), the bulk of lncRNAs code for proteins which show variation patterns consistent with neutral evolution. We also show that the ORFs that have a more favorable, coding-like, sequence composition are more likely to be translated than other ORFs in lncRNAs. This study provides strong evidence that there is a large and ever-changing reservoir of lowly abundant proteins; some of these peptides may become useful and act as seeds for de novo gene evolution.},
  note = {The SMBE 2017 Collection},
  keywords = {de novo gene, long non-coding RNA, Ribo-Seq, ribosome profiling}
}
2016
Jorge Ruiz-Orera, Pol Verdaguer-Grau, José Luis Villanueva-Cañas, Xavier Messeguer, M. Mar Albà
Functional and non-functional classes of peptides produced by long non-coding RNAs (Article)
bioRxiv, 2016, DOI: 10.1101/064915.
@article{Ruiz-Orera2016,
  title = {Functional and non-functional classes of peptides produced by long non-coding RNAs},
  author = {Jorge Ruiz-Orera and Pol Verdaguer-Grau and José Luis Villanueva-Cañas and Xavier Messeguer and M. Mar Albà},
  url = {http://biorxiv.org/content/early/2016/07/21/064915},
  doi = {10.1101/064915},
  year = {2016},
  date = {2016-07-21},
  journal = {bioRxiv},
  abstract = {Cells express thousands of transcripts that show weak coding potential. Known as long non-coding RNAs (lncRNAs), they typically contain short open reading frames (ORFs) having no homology with known proteins. Recent studies have reported that a significant proportion of lncRNAs are translated, challenging the view that they are essentially non-coding. These results are based on the selective sequencing of ribosome-protected fragments, or ribosome profiling. The present study used ribosome profiling data from eight mouse tissues and cell types, combined with ~330,000 synonymous and non-synonymous single nucleotide variants, to dissect the biological implications of lncRNA translation. Using the three-nucleotide read periodicity that characterizes actively translated regions, we found that about 23% of the transcribed lncRNAs were translated (1,365 out of 6,390). About one fourth of the translated sequences (350 lncRNAs) showed conservation in humans; this is likely to produce functional micropeptides, including the recently discovered myoregulin. For other lncRNAs, the ORF codon usage bias distinguishes between two classes. The first has significant coding scores and contains functional proteins which are not conserved in humans. The second large class, comprising >500 lncRNAs, produces proteins that show no significant purifying selection signatures. We showed that the neutral translation of these lncRNAs depends on the transcript expression level and the chance occurrence of ORFs with a favorable codon composition. This provides the first evidence to date that many lncRNAs produce non-functional proteins.},
  keywords = {long non-coding RNA, micropeptide, mouse, ribosome profiling, smORF, translation}
}
2015
Jorge Ruiz-Orera, Jessica Hernandez-Rodriguez, Cristina Chiva, Eduard Sabidó, Ivanela Kondova, Ronald Bontrop, Tomàs Marqués-Bonet, M. Mar Albà
Origins of de novo genes in human and chimpanzee (Article)
PLOS Genetics, 11 (12), pp. e1005721, 2015.
@article{Ruiz-Orera2015b,
  title = {Origins of de novo genes in human and chimpanzee},
  author = {Ruiz-Orera, Jorge and Hernandez-Rodriguez, Jessica and Chiva, Cristina and Sabidó, Eduard and Kondova, Ivanela and Bontrop, Ronald and Marqués-Bonet, Tomàs and Albà, M. Mar},
  url = {http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005721},
  year = {2015},
  date = {2015-12-31},
  journal = {PLOS Genetics},
  volume = {11},
  number = {12},
  pages = {e1005721},
  keywords = {chimpanzee, de novo gene, Evolution, Humans, lncRNA, Promoter, proteomics, ribosome profiling, RNA-Seq, transcription factor binding site, transcriptomics}
}