Large scale annotation of small proteins by ribosome profiling

We participate in a new worldwide initiative for the large-scale annotation of small ORF translation events detected by ribosome profiling in the human genome. The initiative, led by researchers at Ensembl, Max Delbrück Center and Broad Institute, among others, provides a first list of 7,264 new translated ORFs, including many ORFs in long non-coding RNAs, as well as upstream ORFs (uORFs) in coding transcripts.

You can read the complete post here.

Preprint: Mudge, Ruiz-Orera, Prensner et al. A community-driven roadmap to advance research on translated open reading frames detected by Ribo-seq. bioRxiv June 10, 2021.

Filed under Uncategorized

Article in Ellipse by Will on his PhD journey in the lab

Will Blevins explains in an article in Ellipse his experience doing a PhD in the lab. The first challenge was culturing different yeast species, and isolating the RNA, in Lucas Carey’s lab. Then Will had to build a de novo transcript assembly pipeline that would allow us to recover novel transcripts in a reliable manner; the use of spike-ins (a set of RNAs of known concentration) was key for this. Also important was being able to do ribosome profiling experiments in the same conditions as the RNA-Seq, thanks to a collaboration with Juana Díez’s lab. This was followed by a multitude of analyses to make sense of the data and, finally, the paper in Nature Communications!

Article in Ellipse: Uncovering de novo gene birth in yeast using deep transcriptomics.


Students from the Computational Genomics labs at the Research Program; Will is at the far end.

Filed under Uncategorized

Protein translation from uORFs: roles in stress

The current view of an mRNA is that of a central coding sequence (CDS) flanked by 5′ and 3′ untranslated regions (UTRs). But often UTRs contain open reading frames which, as revealed by ribosome profiling, can also be translated. The effect of these upstream and downstream ORFs (uORFs and dORFs) on the translation of the CDS, or on the production of micropeptides, is still largely unknown.

We have investigated uORF translation in yeast using ribosome profiling data from three different studies in which oxidative stress or starvation conditions were induced. During stress there is a general arrest of CDS translation. But surprisingly, we observe that uORF translation is much less affected, with the vast majority of genes showing an increase in the uORF to CDS translation ratio. Only in a specific subset of mRNAs does this go in the other direction; such regulatory uORFs decrease their translation during stress, permitting the efficient translation of the downstream CDS. The question remains as to the consequences of the increased translation of uORFs during stress, which potentially generates hundreds of as yet uncharacterized micropeptides. Are these small proteins of functional significance? And if so, how do they protect the cells from stress? These are new questions that will stimulate more research.
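
To make the uORF to CDS translation ratio concrete, here is a minimal sketch in Python of how such a ratio, and its shift under stress, could be computed from Ribo-Seq read counts. The function names, the pseudocount and the example counts are illustrative assumptions, not the pipeline used in the paper.

```python
import math


def uorf_cds_log2_ratio(uorf_reads, cds_reads, uorf_len, cds_len, pseudocount=0.5):
    """log2 of the uORF read density over the CDS read density.

    Reads are divided by ORF length (in nucleotides) so that long and
    short ORFs are comparable. The pseudocount (an assumption) avoids
    division by zero for ORFs without reads.
    """
    uorf_density = (uorf_reads + pseudocount) / uorf_len
    cds_density = (cds_reads + pseudocount) / cds_len
    return math.log2(uorf_density / cds_density)


def ratio_shift(control, stress, uorf_len, cds_len):
    """Change in the log2 uORF/CDS ratio from control to stress.

    `control` and `stress` are (uorf_reads, cds_reads) tuples.
    A positive value means the uORF is relatively more translated
    than the CDS under stress.
    """
    return (uorf_cds_log2_ratio(*stress, uorf_len, cds_len)
            - uorf_cds_log2_ratio(*control, uorf_len, cds_len))


# Hypothetical gene: CDS reads drop sharply in stress, uORF reads barely change.
control_counts = (20, 2000)  # (uORF reads, CDS reads), made-up numbers
stress_counts = (18, 400)
shift = ratio_shift(control_counts, stress_counts, uorf_len=60, cds_len=1200)
print(f"log2 shift in uORF/CDS ratio: {shift:.2f}")
```

Because both counts in a ratio come from the same library, sequencing depth largely cancels out within each sample, and the ORF lengths cancel out in the shift between conditions.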

The article has been published in BMC Molecular Cell Biology:
Simone G. Moro, Cedric Hermans, Jorge Ruiz-Orera, M.Mar Albà. Impact of uORFs in mediating regulation of translation in stress conditions. BMC Mol Cell Biol 22, Article number: 29 (2021).

Filed under ribosome profiling, yeast

Uncovering de novo gene birth in yeast using deep transcriptomics

We have investigated de novo gene evolution in baker’s yeast by comparing the transcriptomes of 11 different yeast species and using ribosome profiling data to identify putatively translated ORFs. The results have now been published in Nature Communications. You can read the paper here:
Uncovering de novo gene birth in yeast using deep transcriptomics.

We have also written a Behind the Paper post in Nature Research Ecology and Evolution Community: Identifying recently evolved genes in yeast.
You can also read a comment about this article at Faculty Opinions.

A great collaboration between different labs at the PRBB!

Filed under Uncategorized

The unexplored world of non-canonical peptides – a new Issue in Experimental Cell Research

A new issue in Experimental Cell Research is dedicated to different aspects of an extensive group of proteins known as micropeptides, many of which still remain hidden in genomes. These peptides are translated from open reading frames shorter than 100 amino acids that are present in transcripts that are currently annotated as non-coding or in the 5′UTR of protein-coding genes. We have written a short piece about our favorite subject, how small ORFs in lncRNAs can evolve into new functional proteins.

You can read it all here: The hidden world of non-canonical ORFs

Filed under Uncategorized

Our research on de novo genes featured in Nature News

Research on de novo genes has been the subject of a News Feature in Nature, written by Adam Levy. The article presents the case of the arctic cod; comparison of genomic sequences from closely related fish species has shown that a new antifreeze protein has evolved from a non-coding genomic region. The birth of new genes de novo is a recent addition to the field of evolutionary biology. Until not long ago virtually all new genes were believed to originate from previously existing genes. However, studies performed in the past 15 years have accumulated growing evidence that a substantial fraction of genes is likely to have arisen by other mechanisms. The article covers some of the first studies on de novo gene birth and explains what we have learnt along the journey.

Read the article: How evolution builds genes from scratch, Nature Oct 16 2019.

Filed under de novo gene evolution, science, transcriptomics

Women in Computational Biology Conference

We are organizing the first Advances in Computational Biology conference in Barcelona on Nov 28-29, 2019. One of the main purposes of the conference is to give visibility to and promote the research done by women scientists. For this reason, all presenters will be women, although the conference is open to everyone.

The programme will include poster and oral presentations, as well as keynotes from leading scientists in the computational biology and high-performance computing fields. The keynote speakers of the conference are: Christine Orengo from University College London, Natasa Przulj from the Barcelona Supercomputing Center and Marie-Christine Sawley, director of the Exascale Lab at Intel.

The chairs of the conference are: Alison Kennedy, director of the STFC Hartree Centre, Janet Kelso, group leader of the Minerva Research Group for Bioinformatics at the Max Planck Institute for Evolutionary Anthropology, and Nuria Lopez-Bigas, leader of the Biomedical Genomics Research Group at the Institute for Research in Biomedicine Barcelona.

Furthermore, the participants will have the opportunity to interact personally with female leaders in the fields of IT, academic research and politics that support the conference.

The conference is organised by the Bioinfo4Women programme from the Barcelona Supercomputing Center (BSC-CNS) with the collaboration of IMIM-UPF Research Programme on Biomedical Informatics (GRIB), the Spanish National Bioinformatics Institute (INB/ELIXIR-ES) and the Universitat Politècnica de Catalunya (UPC). It is an affiliate conference of the International Society for Computational Biology (ISCB).

Filed under Meetings

Thousands of small ORFs are translated, what are they doing?

The high throughput sequencing of ribosome-protected RNA fragments, or ribosome profiling (Ribo-Seq), has uncovered the translation of thousands of novel small ORFs (< 100 amino acids) that were not annotated. These ORFs had remained hidden from annotation pipelines because of their small size, similar to that of randomly occurring ORFs in the genome. Some of these peptides show strong evolutionary conservation and have been found to play roles in development or other cellular processes. Others are located upstream of a main ORF and are translated in specific circumstances, inhibiting the translation of the main protein product.
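
To illustrate why size alone makes these ORFs hard to annotate, here is a minimal sketch in Python that scans a nucleotide sequence for ATG-initiated ORFs encoding fewer than 100 amino acids. The function name, the `min_aa` cutoff and the restriction to the forward strand are simplifying assumptions made for this example; this is not an annotation pipeline, and it says nothing about whether an ORF is actually translated, which is what Ribo-Seq adds.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(seq, max_aa=100, min_aa=10):
    """Find small ORFs on the forward strand of `seq`.

    Returns a list of (start, end, n_amino_acids) tuples, where `start`
    is the 0-based position of the ATG and `end` is the position just
    after the stop codon. Only ORFs with min_aa <= length < max_aa
    (in amino acids, stop codon excluded) are reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa < max_aa:
                    orfs.append((start, i + 3, n_aa))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence (made up): one ORF of 12 codons followed by a stop codon.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
print(find_small_orfs(toy))
```

Running a scan like this on random or non-coding sequence returns many hits, which is the reason short ORFs cannot be annotated from sequence alone.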

The translation of small ORFs can also be a step towards the birth of novel protein-coding transcripts. In recent years evidence has accumulated that some protein-coding genes have originated de novo from previously non-coding genomic sequences. This requires some degree of indiscriminate transcription and translation to generate precursors. In line with this we have found that many of the mouse-specific translated small ORFs appear to evolve under no selection (Ruiz-Orera et al., 2018). This finding defies the long-held notion that any protein that is produced must be functional.

In Translation of Small Open Reading Frames: Roles in Regulation and Evolutionary Innovation (Ruiz-Orera & Albà, Trends in Genetics 2019) we review what is currently known about the small ORFome.

Filed under de novo gene evolution, proteomics, ribosome profiling, science

Using ribosome profiling to improve our understanding of gene regulation

To measure changes in gene expression we normally compare mRNA abundances using high throughput RNA sequencing data (RNA-Seq), as a proxy for the changes in the encoded proteins. However, it is well known that the correlation between mRNA and protein levels is far from perfect. Ideally, we would like to measure changes in the proteins themselves. The problem is that proteomics-based techniques are less sensitive, and less reproducible, than high throughput RNA sequencing. How can we get closer to the protein world while maintaining high sensitivity and specificity?

In a new study we propose to use Ribo-Seq instead of RNA-Seq to perform differential gene expression analysis. Ribo-Seq is an RNA sequencing approach that specifically targets ribosome-protected RNA fragments and which shows higher correlation with proteomics data than RNA-Seq. We use this novel approach to study the response to oxidative stress in baker’s yeast. We show that the majority of genes that appear to be differentially expressed using RNA-Seq are not recovered with the Ribo-Seq-based analysis, strongly suggesting that many of these changes are not linked to changes in protein expression. This study highlights the advantages of using Ribo-Seq to understand not only translational regulation but also gene expression changes in general.
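
The comparison between the two data types can be pictured with a small Python sketch. It computes per-gene log2 fold changes from raw counts and lists the genes that change in RNA-Seq but not in Ribo-Seq. The function names, the fold-change threshold, the pseudocount and the counts are all assumptions made for illustration; the published analysis relies on replicates and proper statistical testing, which this sketch leaves out.

```python
import math


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Per-gene log2(treated / control) after scaling to counts per million.

    `control` and `treated` map gene names to raw read counts.
    """
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    lfc = {}
    for gene in control:
        control_cpm = control[gene] / control_total * 1e6
        treated_cpm = treated.get(gene, 0) / treated_total * 1e6
        lfc[gene] = math.log2((treated_cpm + pseudocount) / (control_cpm + pseudocount))
    return lfc


def rna_only_changes(rna_lfc, ribo_lfc, threshold=1.0):
    """Genes whose |log2FC| reaches the threshold in RNA-Seq but not in Ribo-Seq."""
    return sorted(
        gene for gene in rna_lfc
        if abs(rna_lfc[gene]) >= threshold
        and abs(ribo_lfc.get(gene, 0.0)) < threshold
    )


# Made-up counts: gene A goes up fourfold in mRNA but hardly at all in ribosome footprints.
rna = log2_fold_changes({"A": 100, "B": 100, "C": 800}, {"A": 400, "B": 100, "C": 500})
ribo = log2_fold_changes({"A": 100, "B": 100, "C": 800}, {"A": 110, "B": 100, "C": 790})
print(rna_only_changes(rna, ribo))
```

Genes returned by `rna_only_changes` are the ones for which an mRNA-level change would not be read as a change in translation output.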

Please note that this study is now published in Scientific Reports. Follow this link to see the final version!

Filed under differential gene expression, proteomics, ribosome profiling, transcriptomics, yeast

Translation of neutrally evolving peptides a basis for de novo gene evolution – a short history

Our new paper “Translation of neutrally evolving peptides provides a basis for de novo gene evolution” has been published in Nature Ecology and Evolution on March 19 2018.

During the course of evolution, some genes are gained and others are lost. A well-established mechanism for the emergence of new genes is gene duplication. However, there is increasing evidence that some genes have not originated by gene duplication but de novo from previously non-coding regions of the genome. 

The two processes can be distinguished using sequence comparisons of closely related species. In gene duplication, the new gene retains sequence similarity to the other gene copy. In contrast, genes evolved de novo show no sequence similarity to other genes. In both cases, new genes initially appear by accident. A fraction of these genes will turn out to be beneficial and be subsequently maintained by natural selection.

My interest in new genes started more than fifteen years ago. At that time, I was building a database of herpesvirus protein families at University College London. When I tried to cluster the proteins into families, some would just not cluster. These proteins had unique sequences; they did not resemble any other viral or host protein, yet they performed essential functions. Improbable as it seemed, they had to have originated from DNA sequences other than genes.

Back in Barcelona I teamed up with Jose Castresana to study gene evolution in mammals. In a paper published in 2005 we described many human and mouse proteins that lacked homologues in non-mammalian species. Following the current thinking at the time we proposed that many of them could have been generated by very rapid evolution after gene duplication. However, we also argued that it was possible that some of them had evolved de novo. The reason was that the coding sequences of the young genes were unusually small and this is something one expects for randomly occurring open reading frames but not for functional gene duplicates. Then, Macarena Toll-Riera joined the lab as a PhD student and we decided to revisit this question. With more genomes at hand, the hypothesis of de novo gene birth gained strength. The results were published in 2009 in a paper entitled Origin of primate orphan genes: a comparative genomics approach.

Things became exciting again when Nicholas Ingolia and co-workers reported, in 2011, widespread translation of the mouse transcriptome, including many transcripts previously believed to be non-coding. Jorge Ruiz-Orera, a new PhD student in the lab, examined ribosome profiling data from different species and found clear support for the pervasive translation of the transcriptome.

In the present study we have found that an important fraction of the translated peptides show no evolutionary conservation and evolve under no constraints. These peptides can be “tested” for new functions and eventually become new functional proteins, providing a basis for de novo gene evolution. More details of this study can be found here and in the Nature Ecology and Evolution community blog.

Mar Albà

Filed under Uncategorized