The increasing number of genomes available has made it possible to compare genes and determine in which branch of the phylogenetic tree they are likely to have originated. This has led to the identification of many genes that are species- or lineage-specific. As they have no homologues in other species, they must have originated from previously non-genic parts of the genome, that is, de novo. However, some researchers have claimed that failures to detect homologues by sequence similarity search methods, such as BLAST, may largely explain these apparent absences. One way to assess how many genes are missed in these searches is to perform sequence evolution simulations along a phylogenetic tree and then use BLAST to recover the homologues (Albà and Castresana, 2007). If we fail to detect them, we have a sensitivity problem, and a percentage of the genes will be misclassified into younger age classes.
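To make the simulate-then-search idea concrete, here is a minimal Python sketch. It is not the pipeline used in the cited studies: the substitution model (each site changes with a fixed probability, no indels, no tree), the identity cut-off standing in for a BLAST hit, and all function names and parameter values are assumptions chosen only for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def evolve(seq, p_change, rng):
    """Return a copy of seq in which each site is replaced by a
    different amino acid with probability p_change (toy model)."""
    out = []
    for aa in seq:
        if rng.random() < p_change:
            out.append(rng.choice([a for a in AMINO_ACIDS if a != aa]))
        else:
            out.append(aa)
    return "".join(out)


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def fraction_missed(n_genes, length, p_change, min_identity=0.3, seed=0):
    """Simulate n_genes ancestor/descendant pairs and return the fraction
    whose identity falls below min_identity. The cut-off is a hypothetical
    stand-in for 'no significant BLAST hit', not a real BLAST criterion."""
    rng = random.Random(seed)
    missed = 0
    for _ in range(n_genes):
        ancestor = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        descendant = evolve(ancestor, p_change, rng)
        if identity(ancestor, descendant) < min_identity:
            missed += 1
    return missed / n_genes


if __name__ == "__main__":
    for p in (0.06, 0.5, 0.95):
        print(p, fraction_missed(n_genes=200, length=100, p_change=p))
```

In a real analysis the descendant sequences would be searched with BLAST itself, and the fraction of simulated genes without a significant hit would estimate how many genes could be wrongly assigned to a younger class.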
The simulations performed to date have all indicated that the percentage of error for proteins is relatively small (4.7% to 13.85%) even at long distances (from mammals to fungi or plants). As expected, the problem is worse for distant comparisons than for closer ones. For example, human and macaque, which diverged some 24 million years ago, display only 6 substitutions per 100 nucleotides. Lack of BLAST sensitivity is not going to be a problem for these species even when comparing neutrally evolving sequences. For more distant comparisons, it depends on whether the sequence is under selection or not. Proteins tend to contain motifs that are highly conserved, and for this reason BLAST works reasonably well even at long distances. The results of the simulations support the idea that many genes are likely to have originated recently. For example, only 14 S. cerevisiae proteins would fail to find homologues in S. paradoxus or S. mikatae due to BLAST errors (Moyers and Zhang, 2016). Although the authors of the paper interpret this as problematic, the strong contrast with the observed data (445 genes restricted to these species in Carvunis et al., 2012) supports the notion that new genes are continuously emerging.
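A quick back-of-the-envelope calculation shows why 6 substitutions per 100 nucleotides is unproblematic. The sketch below assumes the Jukes-Cantor (JC69) model, which is our choice for illustration and not necessarily the model behind the figure quoted above; the function name is likewise hypothetical. Under JC69, a distance of d substitutions per site gives an expected proportion of differing sites p = 3/4 (1 - exp(-4d/3)).

```python
import math


def jc69_observed_difference(d):
    """Expected proportion of differing sites after d substitutions
    per site under the Jukes-Cantor model (an assumed model here)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


if __name__ == "__main__":
    p = jc69_observed_difference(0.06)
    print(f"expected differences: {p:.4f}")   # about 0.058
    print(f"expected identity:    {1 - p:.4f}")  # about 0.942
```

With roughly 94% of nucleotide positions still identical, even a neutrally evolving sequence remains easy to align at this distance.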
Update: A reply to Moyers & Zhang has been published on bioRxiv: No evidence for phylostratigraphic bias impacting inferences on patterns of gene emergence and evolution.
Novel genes are continuously emerging during evolution, but what drives this process? We have published a study in PLOS Genetics in which we find that the fortuitous appearance of certain combinations of elements in the genome can lead to the generation of new genes. The work, Origins of de novo genes in human and chimpanzee, is very similar to the one we published in arXiv some months ago. It includes some improvements resulting from the peer-review process and from having had more time to think about the paper.
In every genome, there are sets of genes that are unique to that particular species. In this study, we first identified thousands of genes that were specific to human or chimpanzee. Then, we searched the macaque genome and discovered that this species had significantly fewer element motifs in the corresponding genomic sequences. These motifs are recognized by proteins that activate gene expression, a necessary step in the formation of a new gene.
The formation of genes de novo from previously non-active parts of the genome was, until recently, considered highly improbable. This study has shown that the mutations that occur normally in our genetic material may be sufficient to explain how this happens. Once expressed, the genes can act as a substrate for the evolution of new molecular functions. This study identified several candidate human proteins that bear no resemblance to any other known protein but which contain signatures of purifying selection.
Jorge Ruiz-Orera, Jessica Hernandez-Rodriguez, Cristina Chiva, Eduard Sabidó, Ivanela Kondova, Ronald Bontrop, Tomàs Marqués-Bonet, M.Mar Albà. Origins of De Novo Genes in Human and Chimpanzee. PLOS Genetics, 2015; 11 (12): e1005721.
Several volunteers from GRIB explained what bioinformatics is to non-experts during the 2015 PRBB open day on Oct 18 2015.
Will Blevins and José Luis Villanueva from the Evolutionary Genomics group tried to convey the concept of “orphan genes” using wooden pieces representing genes. Below, Will with the Prezi presentation.
De novo genes are genes that do not arise from gene duplication but from previously non-genic regions in the genome. These genes started to be detected about 10 years ago and have gained increased recognition as an important component of evolutionary innovation. A historical account of the discovery of de novo genes was published in Quanta Magazine on August 18 2015: How new genes arise from scratch. The author, Emily Singer, was at the SMBE 2015 meeting in Vienna and attended the Symposium Origins and Evolution of Molecular Innovation, in which some of the latest developments in the field were presented. After interviewing some of the key players, she has produced this report that summarizes the main turning points and the challenges ahead. Exciting to see all this taking shape after not always easy beginnings!
Other links: Scientific American; Quanta Podcast.
Some days ago we presented the results of our most recent work on the evolution of new genes at SMBE 2015. In the talk “The link between pervasive transcription and de novo gene evolution” we discussed the possible mechanisms of formation of new genes from regions in the genome which did not previously express any gene. The study was based on transcriptomics data from several mammalian species and focused on genes detected only in the human and chimp lineages. We found that genomic regions that express new genes have gained new regulatory motifs with respect to the corresponding regions in species that do not express the genes. This is consistent with the idea that the formation of new promoters can drive the birth of new genes.
See Haldane’s sieve post for the abstract and link to the preprint Origins of de novo genes in human and chimpanzee. The abstract has been among the Most viewed on Haldane’s sieve July 2015.
The issue of the under-representation of women in permanent academic positions continues to be the subject of many public and informal discussions. This week, the sexist comments made by a Nobel Laureate at a public conference have made it to the newspapers and spurred the debate. To make matters worse, it appears that female academics are not immune to gender bias and are still more likely to hire a John than a Jessica with an identical CV, according to a recent study. This is depressing and it clearly shows that there is still a long way to go.
I have to admit that I would probably not be writing about this were it not for a seemingly irrelevant event that happened to me recently. We published a study on the evolution of gene duplicates that was signed by six female scientists from two collaborating groups. I felt kind of proud to see the long list of women authors but at the same time I knew something was wrong. It was too unusual.
I have been a researcher in molecular biology for more than 20 years, mostly working in Barcelona and London. When I started there were clearly fewer female Principal Investigators (PIs) than male PIs and this is exactly how it continues to be. In contrast, the number of PhD students was, and is, much more balanced. We all know this; we only need to look at our surroundings. What it means is that women leave the scientific career at intermediate stages more often than their male counterparts.
The reasons why women are under-represented in top research or academic positions are probably very similar to the reasons why they are under-represented in positions of power or prestige in other fields. There is a historical trend that is proving very hard to erode. Besides, and this is a problem that affects us all, the current evaluation system is strongly based on the number of publications, as discussed here. This “more is better” system is detrimental to quality, causing a decrease in the percentage of influential papers and penalizing more strongly the individuals who wish or need to take career breaks, including maternity leave. Quotas in conferences, committees, etc. can help make women more visible, but in my view the changes needed are more profound.
Filed under science, society
How do duplicated transcription factors, which are initially identical in sequence, specialize to perform diverse functions? We have teamed up with the group of Susana de la Luna, at the Center for Genomic Regulation (CRG), to try to answer this question from a new perspective.
We have focused on low-complexity regions (LCRs), such as alanine or glutamine repeat expansions, because they can accumulate rapidly during evolution and may result in changes in protein-protein interactions. By studying 237 gene families that originated during the two rounds of whole genome duplication at the base of the vertebrates, we have found that duplicated gene copies have acquired many more LCRs than single-copy genes that evolved during the same period of time. By performing experiments in two different gene families (PHOX2A/B and LHX2/9), we have shown that the gain of novel alanine-rich LCRs can increase the capacity of the protein to activate transcription 3- to 4-fold. The study highlights the importance of LCRs in mediating the functional diversification of duplicated transcription factors.
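For readers who want a feel for what a repeat expansion looks like in a sequence, here is a small Python sketch that reports uninterrupted runs of a single amino acid. This is a deliberate simplification and not the method used in the study: real LCRs are usually compositionally biased rather than perfect runs, and dedicated tools exist for detecting them. The function name, the minimum run length and the example sequence are all hypothetical.

```python
import re


def find_repeat_runs(protein, residue="A", min_len=5):
    """Return (start, end) positions (0-based, end exclusive) of
    uninterrupted runs of `residue` that are at least `min_len` long."""
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(protein)]


if __name__ == "__main__":
    toy_protein = "MKAAAAAAGQQQQQL"  # made-up sequence for illustration
    print(find_repeat_runs(toy_protein, "A"))  # alanine runs
    print(find_repeat_runs(toy_protein, "Q"))  # glutamine runs
```

Comparing the runs found in two duplicated copies of a gene with those in a single-copy outgroup sequence is one simple way to see which copy has gained a repeat.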
Reference: Núria Radó-Trilla, Krisztina Arató, Cinta Pegueroles, Alicia Raya, Susana de la Luna, M.Mar Albà. Key role of amino acid repeat expansions in the functional diversification of duplicated transcription factors. bioRxiv http://dx.doi.org/10.1101/014910
Update: published in MBE here.
On October 4 2014 the PRBB held its annual Open Day. The doors opened to visitors and scientists explained their work. Will and José Luis from our group volunteered to talk about genomics and the research they do in a way that could be easily understood by everyone. Curious? Here is their presentation.
Our post on the paper “Long non-coding RNAs as a source of new peptides” was among the most viewed on Haldane’s sieve during 2014. Following publication of the preprint on arXiv and the related post in May 2014, it became very popular on Twitter and other social networks. The work was subsequently published in eLife. We are going to present this work and our most recent research at several universities and conferences this year, including the Ghent University N2N seminar series.
This is the second year our group has participated in the Science Week organized by the Catalan Government (Nov 14-23). The event consists of talking about our experience as researchers to a large group of students in a secondary school. Last year Mar visited a school in Badia del Vallès, one of the most densely populated neighborhoods in the periphery of Barcelona (“Badia City”), and this year José Luis has been to a school in Granollers (in the picture), a city located 25 km from Barcelona.