Will Blevins and José Luis Villanueva from the Evolutionary Genomics group tried to convey the concept of “orphan genes” using wooden pieces representing genes. Below, Will with the Prezi presentation.
De novo genes are genes that arise not from gene duplication but from previously non-genic regions of the genome. These genes started to be detected about 10 years ago and have gained increasing recognition as an important component of evolutionary innovation. A historical account of the discovery of de novo genes, How new genes arise from scratch, was published in Quanta Magazine on August 18, 2015. The author, Emily Singer, was at the SMBE 2015 meeting in Vienna and attended the symposium Origins and Evolution of Molecular Innovation, in which some of the latest developments in the field were presented. After interviewing some of the key players, she has produced a report that summarizes the main turning points and the challenges ahead. Exciting to see all this taking shape after not always easy beginnings!
Other links: Scientific American; Quanta Podcast.
Some days ago we presented the results of our most recent work on the evolution of new genes at SMBE 2015. In the talk “The link between pervasive transcription and de novo gene evolution” we discussed the possible mechanisms of formation of new genes from regions in the genome which did not previously express any gene. The study was based on transcriptomics data from several mammalian species and focused on genes detected only in the human and chimp lineages. We found that genomic regions that express new genes have gained new regulatory motifs with respect to the corresponding regions in species that do not express the genes. This is consistent with the idea that the formation of new promoters can drive the birth of new genes.
See Haldane’s sieve post for the abstract and link to the preprint Origins of de novo genes in human and chimpanzee. The abstract has been among the Most viewed on Haldane’s sieve July 2015.
The issue of the under-representation of women in permanent academic positions continues to be the subject of many public and informal discussions. This week, the sexist comments made by a Nobel Laureate at a public conference have made it to the newspapers and spurred the debate. To make matters worse, it appears that female academics are not immune to gender bias and are still more likely to hire a John than a Jessica with an identical CV, according to a recent study. This is depressing and it clearly shows that there is still a long way to go.
I have to admit that I would probably not be writing about this were it not for a seemingly irrelevant event that happened to me recently. We published a study on the evolution of gene duplicates that was signed by six female scientists from two collaborating groups. I felt kind of proud to see the long list of women authors but at the same time I knew something was wrong. It was too unusual.
I have been a researcher in molecular biology for more than 20 years, mostly working in Barcelona and London. When I started there were clearly fewer female Principal Investigators (PIs) than male PIs, and this is exactly how it continues to be. In contrast, the number of PhD students was, and is, much more balanced. We all know this; we only need to look at our surroundings. What it means is that women leave the scientific career at intermediate stages more often than their male counterparts.
The reasons why women are under-represented in top research or academic positions are probably very similar to the reasons why they are under-represented in positions of power or prestige in other fields. There is a historical trend that is proving very hard to erode. Besides, and this is a problem that affects us all, the current evaluation system is strongly based on the number of publications, as discussed here. This “more is better” system is detrimental to quality, causing a decrease in the percentage of influential papers and penalizing more strongly the individuals who wish or need to take career breaks, including maternity leave. Quotas in conferences, committees, etc. can help make women more visible, but in my view the changes needed are more profound.
How do duplicated transcription factors, which are initially identical in sequence, specialize to perform diverse functions? We have teamed up with the group of Susana de la Luna, at the Center for Genomic Regulation (CRG), to try to answer this question from a new perspective.
We have focused on low-complexity regions (LCRs), such as alanine or glutamine repeat expansions, because they can accumulate rapidly during evolution and may result in changes in protein-protein interactions. By studying 237 gene families that originated during the two rounds of whole genome duplication at the base of the vertebrate lineage, we have found that duplicated gene copies have acquired many more LCRs than single-copy genes that evolved during the same period of time. By performing experiments in two different gene families (PHOX2A/B and LHX2/9), we have shown that the gain of novel alanine-rich LCRs can increase the capacity of the protein to activate transcription by 3-4 fold. The study highlights the importance of LCRs in mediating the functional diversification of duplicated transcription factors.
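To give a feel for what an alanine repeat expansion looks like at the sequence level, here is a minimal Python sketch that finds runs of a single amino acid in a protein sequence. It is only an illustration: the function name, the minimum run length and the two toy sequences are made up for this example, and it is not the LCR detection procedure used in the study.

```python
import re

def find_homopolymer_runs(protein, residue="A", min_len=4):
    """Return (start, end, length) for each run of `residue` that is
    at least `min_len` residues long (0-based, end exclusive)."""
    # Builds a pattern such as "A{4,}" for residue="A", min_len=4.
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start(), m.end(), m.end() - m.start())
            for m in pattern.finditer(protein)]

# Hypothetical toy sequences, NOT real PHOX2 or LHX proteins.
copy_without_repeat = "MSTPAAGLKQ"
copy_with_repeat = "MSTPAAAAAAGLKQ"

for name, seq in [("copy 1", copy_without_repeat), ("copy 2", copy_with_repeat)]:
    print(name, find_homopolymer_runs(seq))
```

Comparing the runs found in the two copies of a duplicated gene is the simplest way to spot a repeat that is present in one copy and absent in the other.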
Reference: Núria Radó-Trilla, Krisztina Arató, Cinta Pegueroles, Alicia Raya, Susana de la Luna, M. Mar Albà. Key role of amino acid repeat expansions in the functional diversification of duplicated transcription factors. bioRxiv http://dx.doi.org/10.1101/014910
Update: published in MBE here.
On October 4, 2014 the PRBB held its annual Open Day. The doors opened to visitors and scientists explained their work. Will and José Luis from our group volunteered to talk about genomics and the research they do in a way that could be easily understood by everyone. Curious? Here is their presentation.
Our post on the paper “Long non-coding RNAs as a source of new peptides” was among the most viewed on Haldane’s sieve during 2014. Following publication of the preprint on arXiv and the related post in May 2014, it became very popular on Twitter and other social networks. The work was subsequently published in eLife. We are going to present this work and our most recent research at several universities and conferences this year, including the Ghent University N2N seminar series.
This is the second year our group has participated in the Science Week organized by the Catalan Government (Nov 14-23). The event consists of talking about our experience as researchers to a large group of students in a secondary school. Last year Mar visited a school in Badia del Vallès, one of the most densely populated neighborhoods on the periphery of Barcelona (“Badia City”), and this year José Luis has been to a school in Granollers (in the picture), a city located 25 km from Barcelona.
This month, a paper investigating the power of BLAST sequence similarity searches to classify genes into different age classes (phylostratigraphy), Phylostratigraphic bias creates spurious patterns of genome evolution (Moyers and Zhang, University of Michigan), states that the method substantially underestimates gene age for a considerable fraction of genes and creates spurious and unpredictable patterns. Hmm... how does this affect previous studies?
I am not new at this. The study here is very similar to one we conducted in 2007, On homology searches by protein Blast and the characterization of the age of genes. We found that the lack of sensitivity of BLAST only affected a small percentage of proteins (4.7%) and that it did not invalidate the previously reported finding that recently emerged genes evolve more rapidly than older ones (Alba and Castresana, 2005).
So, are the results of this study different from those back then? Well, not much really. The authors of the present paper find that in 13.85% of the cases a homolog of the (Drosophila) protein was not detected in the most distant taxon (Bacteria). Since our study considered Eukaryota (Fungi, Plants) rather than Bacteria as the most distant taxa, the equivalent figure here is about 9% (from Figure 5). The underestimation of age mainly affects distant comparisons (>500 Mya). And again, the patterns obtained with the simulated data do not recapitulate the observations with real data.
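For readers unfamiliar with phylostratigraphy, the age-assignment logic can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the list of strata and the function name are hypothetical, and the homology calls are passed in as a plain dictionary rather than coming from real BLAST output.

```python
# Hypothetical strata for a Drosophila gene, ordered from youngest to oldest.
STRATA = ["Drosophila", "Insecta", "Metazoa", "Eukaryota", "Cellular organisms"]

def assign_age(hits_by_stratum):
    """Return the oldest stratum in which a homolog was detected.

    `hits_by_stratum` maps a stratum name to True when a similarity
    search found a homolog in that stratum. A gene with no hits outside
    its own species is assigned to the youngest stratum."""
    oldest = STRATA[0]
    for stratum in STRATA:
        if hits_by_stratum.get(stratum, False):
            oldest = stratum
    return oldest

# A gene with homologs in every stratum is classified as ancient.
all_found = {s: True for s in STRATA}
print(assign_age(all_found))

# If the search misses only the most distant homolog,
# the same gene is assigned to a younger class.
distant_missed = dict(all_found, **{"Cellular organisms": False})
print(assign_age(distant_missed))
```

The second call shows the kind of error under discussion: a single missed hit in the most distant taxon is enough to shift a gene into a younger age class.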
After this publication it is even clearer that the large number of recently originated genes being detected in many species cannot be explained by the limitations of BLAST; it is a genuine pattern. What is the role of these genes in the generation of intra-specific variability and the evolution of new biological traits? Back to work.
Abstract: Deep transcriptome sequencing has revealed the existence of many transcripts that lack long or conserved open reading frames (ORFs) and which have been termed long non-coding RNAs (lncRNAs). The vast majority of lncRNAs are lineage-specific and do not yet have a known function. In this study, we test the hypothesis that they may act as a repository for the synthesis of new peptides. We find that a large fraction of the lncRNAs expressed in cells from six different species is associated with ribosomes. The patterns of ribosome protection are consistent with the translation of short peptides. lncRNAs show coding potential and sequence constraints similar to those of evolutionarily young protein coding sequences, indicating that they play an important role in de novo protein evolution.
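As a rough illustration of what a short ORF in a transcript means, the sketch below scans the three forward reading frames of a nucleotide sequence for ATG-to-stop ORFs. It is an assumption-laden toy, not the analysis in the paper, which relies on ribosome profiling data; the function name, the `min_codons` threshold and the example sequence are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=10):
    """Return (start, end, n_codons) for each ATG-to-stop ORF found in the
    three forward frames. Coordinates are 0-based, `end` is exclusive and
    includes the stop codon; `n_codons` excludes the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return orfs

# Hypothetical toy transcript containing one very short ORF.
toy_transcript = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
print(find_orfs(toy_transcript, min_codons=3))
```

With a high `min_codons` threshold a transcript like this would be reported as having no ORF at all, which is one reason why short translated ORFs are easy to overlook.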
To get a quick idea: Haldane’s Sieve post on the preprint.
For non-experts: eLife summary.
For the details, see the complete paper: Ruiz-Orera J, Messeguer X, Subirana JA, Albà MM. Long non-coding RNAs as a source of new peptides. eLife 2014;3:e03523.