How can new genes arise de novo?

A few days ago we presented the results of our most recent work on the evolution of new genes at SMBE 2015. In the talk “The link between pervasive transcription and de novo gene evolution” we discussed the possible mechanisms by which new genes form from regions of the genome that did not previously express any gene. The study was based on transcriptomics data from several mammalian species and focused on genes detected only in the human and chimpanzee lineages. We found that genomic regions that express new genes have gained new regulatory motifs with respect to the corresponding regions in species that do not express the genes. This is consistent with the idea that the formation of new promoters can drive the birth of new genes.

See the Haldane’s Sieve post for the abstract and a link to the preprint “Origins of de novo genes in human and chimpanzee”. The abstract was among the most viewed on Haldane’s Sieve in July 2015.
Press: Biotech-Spain.

Filed under de novo gene evolution, Uncategorized

Why are we so few?

The under-representation of women in permanent academic positions continues to be the subject of many public and informal discussions. This week, the sexist comments made by a Nobel Laureate at a public conference made it to the newspapers and spurred the debate. To make matters worse, it appears that female academics are not immune to gender bias and are still more likely to hire a John than a Jessica with an identical CV, according to a recent study. This is depressing and it clearly shows that there is still a long way to go.

I have to admit that I would probably not be writing about this were it not for a seemingly irrelevant event that happened to me recently. We published a study on the evolution of gene duplicates that was signed by six female scientists from two collaborating groups. I felt kind of proud to see the long list of women authors but at the same time I knew something was wrong. It was too unusual.

I have been a researcher in molecular biology for more than 20 years, mostly working in Barcelona and London. When I started there were clearly fewer female Principal Investigators (PIs) than male PIs, and this is exactly how it continues to be. In contrast, the number of PhD students was, and is, much more balanced. We all know this; we only need to look at our surroundings. What it means is that women leave the scientific career at intermediate stages more often than their male counterparts.

The reasons why women are under-represented in top research or academic positions are probably very similar to the reasons why they are under-represented in positions of power or prestige in other fields. There is a historical trend that is proving very hard to erode. Besides, and this is a problem that affects us all, the current evaluation system is strongly based on the number of publications, as discussed here. This “more is better” system is detrimental to quality, causing a decrease in the percentage of influential papers and penalizing more strongly the individuals who wish or need to take career breaks, including maternity leave. Quotas in conferences, committees, etc. can help make women more visible, but in my view the changes needed are more profound.

Filed under science, society

Increased gain of “low complexity” regulatory domains in duplicated transcription factors

How do duplicated transcription factors, which are initially identical in sequence, specialize to perform diverse functions? We have teamed up with the group of Susana de la Luna, at the Center for Genomic Regulation (CRG), to try to answer this question from a new perspective.

We have focused on low-complexity regions (LCRs), such as alanine or glutamine repeat expansions, because they can accumulate rapidly during evolution and may result in changes in protein-protein interactions. By studying 237 gene families that originated during the two rounds of whole genome duplication at the base of the vertebrate lineage, we have found that duplicated gene copies have acquired many more LCRs than single-copy genes that evolved during the same period of time. By performing experiments in two different gene families (PHOX2A/B and LHX2/9) we have shown that the gain of novel alanine-rich LCRs can increase the capacity of the protein to activate transcription 3-4 fold. The study highlights the importance of LCRs in mediating the functional diversification of duplicated transcription factors.
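To give a concrete idea of what an LCR such as an alanine repeat expansion looks like in a protein sequence, here is a minimal Python sketch. It is only an illustration and not the method used in the study: the function name `find_lcrs`, the window size, the 80% threshold and the example sequence are all assumptions made up for this example.

```python
from collections import Counter


def find_lcrs(seq, window=10, min_fraction=0.8):
    """Naive sliding-window scan for regions dominated by one amino acid.

    Returns a list of (start, end, residue) tuples, 0-based with end
    exclusive. Overlapping or adjacent windows dominated by the same
    residue are merged into one region. The window size and threshold
    are illustrative defaults, not values taken from the study.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        # Most frequent residue in the current window and its count
        residue, count = Counter(seq[i:i + window]).most_common(1)[0]
        if count / window < min_fraction:
            continue
        if regions and regions[-1][2] == residue and i <= regions[-1][1]:
            # Same residue and touching the previous region: extend it
            regions[-1] = (regions[-1][0], i + window, residue)
        else:
            regions.append((i, i + window, residue))
    return regions


# Hypothetical sequence containing a run of twelve alanines
example = "MKT" + "A" * 12 + "GHLKDEWRTS"
for start, end, residue in find_lcrs(example):
    print(f"{residue}-rich region at positions {start}-{end}")
```

Comparing the regions found in two duplicated copies of a gene with those found in a single-copy relative would be one simple way to spot a repeat gained after duplication.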

Reference: Núria Radó-Trilla, Krisztina Arató, Cinta Pegueroles, Alicia Raya, Susana de la Luna, M. Mar Albà. Key role of amino acid repeat expansions in the functional diversification of duplicated transcription factors. bioRxiv. http://dx.doi.org/10.1101/014910

Update: published in MBE here.

Filed under amino acid repeat, gene duplication

PRBB open day: the link to society

On October 4, 2014 the PRBB held its annual Open Day. The doors opened to visitors and scientists explained their work. Will and José Luis from our group volunteered to talk about genomics and the research they do in a way that could be easily understood by everyone. Curious? Here is their presentation.


Filed under science, society

“lncRNAs as a source of new peptides” among the most viewed posts in Haldane’s sieve 2014

Our post on the paper “Long non-coding RNAs as a source of new peptides” was among the most viewed in Haldane’s sieve during 2014. Following publication of the preprint on arXiv and the related post in May 2014, it became very popular on Twitter and other social networks. The work was subsequently published in eLife. We are going to present this work and our most recent research at several universities and conferences this year, including the Ghent University N2N seminar series.

Filed under de novo gene evolution, lncRNA

Talking science at schools

This is the second year our group has participated in the Science Week organized by the Catalan Local Government (Nov 14-23). The event consists of talking about our experience as researchers to a large group of students in a secondary school. Last year Mar visited a school in Badia del Vallès, one of the most densely populated neighborhoods in the periphery of Barcelona (“Badia City”), and this year José Luis has been to a school in Granollers (in the picture), a city located 25 km away from Barcelona.

Filed under education, science, society

On homology searches by protein Blast and the characterization of the age of genes (revisited)

This month, a paper investigating the power of BLAST sequence similarity searches to classify genes into different age classes (phylostratigraphy), “Phylostratigraphic bias creates spurious patterns of genome evolution” (Moyers and Zhang, University of Michigan), states that the method substantially underestimates gene age for a considerable fraction of genes and creates spurious and unpredictable patterns. Hmm... how does this affect previous studies?

I am not new to this. The study is very similar to one we conducted in 2007, “On homology searches by protein Blast and the characterization of the age of genes”. We found that the lack of sensitivity of BLAST only affected a small percentage of proteins (4.7%) and that it did not invalidate the previously reported finding that recently emerged genes evolve more rapidly than older ones (Alba and Castresana, 2005).

So? Are the results of this study different from those back then? Well, not much really. The authors of the present paper find that in 13.85% of the cases a homolog of the protein (Drosophila) was not detected in the most distant taxa (Bacteria). Because our study considered Eukaryota (Fungi, Plants) rather than Bacteria as the most distant taxa, the equivalent figure here is about 9% (from Figure 5). The underestimation of the age mainly affects distant comparisons (>500 Mya). And again, the patterns obtained with the simulated data do not recapitulate the observations with real data.

After this publication it is even clearer that the large number of recently originated genes being detected in many species cannot be explained by limitations of BLAST; it is a genuine pattern. What is the role of these genes in the generation of intra-specific variability and the evolution of new biological traits? Back to work.

Mar Albà

Filed under de novo gene evolution, Papers

“Long non-coding RNAs as a source of new peptides” published in eLife

Abstract: Deep transcriptome sequencing has revealed the existence of many transcripts that lack long or conserved open reading frames (ORFs) and which have been termed long non-coding RNAs (lncRNAs). The vast majority of lncRNAs are lineage-specific and do not yet have a known function. In this study, we test the hypothesis that they may act as a repository for the synthesis of new peptides. We find that a large fraction of the lncRNAs expressed in cells from six different species is associated with ribosomes. The patterns of ribosome protection are consistent with the translation of short peptides. lncRNAs show coding potential and sequence constraints similar to those of evolutionarily young protein coding sequences, indicating that they play an important role in de novo protein evolution.

Read more..

To get a quick idea: Haldane’s Sieve post on the preprint.
For non-experts: eLife summary.
For the details, see the complete paper: Ruiz-Orera J, Messeguer X, Subirana JA, Albà MM. Long non-coding RNAs as a source of new peptides. eLife 2014;3:e03523.

17 September 2014 Institution Press release
Related:
G+ Biology
TheScientist Sep 18 Picks
El.lipse Oct 2014

Filed under de novo gene evolution, lncRNA, Papers

Interview with Magda Gayà-Vidal about her work on primate comparative genomics

On August 29 Magda Gayà-Vidal was interviewed on El Punt Avui TV to talk about the results obtained in her master’s thesis, carried out in the Evolutionary Genomics group. She explained the main findings in the paper, including the identification of ~200 genes that have evolved more rapidly in humans than in other primates, and discussed other more general aspects of scientific research.

Article ref.: Gayà-Vidal M & Albà MM (2014). Uncovering adaptive evolution in the human lineage. BMC Genomics 15:599.

The TV interview (in Catalan)

Filed under Papers, Video

Using human molecular data to dig into the distant past

A recent study by Magdalena Gayà-Vidal and Mar Albà, at the Biomedical Informatics Research Program (GRIB, IMIM-UPF), has investigated how we can use human genetic data to learn about mutations that might have conferred a selective advantage to humans in the past 5 million years of evolution. The results have been published in BMC Genomics (July 16, 2014): Uncovering adaptive evolution in the human lineage.

The availability of genetic variants from a large number of individuals, through initiatives such as the 1000 Genomes Project, is not only useful to understand the genetic basis of disease but also to gain new insight into human evolution. Variation data provides us with a measure of the proportion of amino acid changes that a protein can tolerate whilst conserving its function. This is important because we can compare this value to the number of changes in the same protein during the evolution of humans away from the common primate ancestor. If we observe more changes than expected, we can predict that there has been a rapid fixation of advantageous mutations by positive selection. Using protein coding sequences from human, chimpanzee, macaque and mouse, the study has identified nearly 200 genes that have evolved more rapidly in humans than in other primates and which are enriched in positively selected sites. The list includes several genes encoding neural proteins.
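The logic of comparing tolerated variation with changes fixed along the human lineage can be illustrated with a small Python sketch in the spirit of the McDonald-Kreitman test. This is a simplified illustration under stated assumptions, not the procedure used in the paper: the function name `excess_nonsynonymous` and the counts in the example are hypothetical.

```python
def excess_nonsynonymous(pn, ps, dn, ds):
    """Compare observed and expected amino acid changes for one gene.

    pn, ps: nonsynonymous and synonymous polymorphisms within humans
    dn, ds: nonsynonymous and synonymous changes fixed in the human lineage

    Under neutrality the nonsynonymous/synonymous ratio is expected to be
    the same for polymorphism and divergence, so the expected number of
    fixed amino acid changes is ds * pn / ps. Returns (expected_dn, alpha),
    where alpha is the estimated fraction of fixed amino acid changes in
    excess of that expectation (a simple McDonald-Kreitman style estimate).
    """
    if ps == 0 or dn == 0:
        raise ValueError("ps and dn must be greater than zero")
    expected_dn = ds * pn / ps
    alpha = 1 - expected_dn / dn
    return expected_dn, alpha


# Hypothetical counts for a single gene
expected, alpha = excess_nonsynonymous(pn=2, ps=8, dn=10, ds=8)
print(f"expected amino acid changes: {expected:.1f}, observed: 10")
print(f"estimated fraction of adaptive changes: {alpha:.2f}")
```

A positive alpha means more amino acid changes were fixed than the variation data predicts. With counts this small a significance test would be needed before drawing any conclusion.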

News at IMIM

Filed under Papers