New genes and functional innovation in mammals

Many human genes have counterparts in distant species such as plants or bacteria. This is because they share a common origin: they were invented a long time ago in a primitive cell. However, some genes have no counterparts in other species, or only in a few of them. These genes were born much more recently. Although they may have appeared by accident, some have acquired useful functions and been preserved by natural selection. We have recently compiled thousands of mammalian-specific gene families and asked which functions they perform. We have found an enrichment in proteins related to the immune system, milk, skin and germ cells. The most recent genes, however, are rarely functionally characterized. The results of this work provide new insights into how new genes originate and what they are selected for.
Read our paper at bioRxiv and tell us what you think!
See the final published paper in Genome Biology and Evolution. News at IMIM here.

Gene families restricted to mammals

The numbers in the nodes of the tree indicate the number of gene families identified.

Filed under de novo gene evolution, gene duplication, mammal, Papers

Our group portrayed at El.lipse

Nov 2016

Filed under de novo gene evolution, differential gene expression, science, society

Pervasive translation of lncRNAs

Ribosome profiling is a sequencing technique that detects the regions of mRNAs that are being translated. Using this technique, researchers have observed mysterious patterns of translation in many transcripts believed to be non-coding (lncRNAs, or long non-coding RNAs). The patterns are very similar to those observed in protein-coding genes, but the translated proteins are generally smaller. Aside from their sequence, we know nothing about these peptides. Are they functional? Or do they reflect background noise of the translation machinery?

In a recent study posted on bioRxiv, we have investigated the signatures of selection in proteins translated from lncRNAs, using phylogenetic conservation and single nucleotide polymorphism (SNP) data. We have found that hundreds of mouse lncRNAs produce short functional proteins and thus should be considered protein-coding genes. However, most translated lncRNAs appear to produce non-functional peptides. We conclude that translation, like transcription, is pervasive. Due to this activity, many peptides can be tested for new functions, facilitating the birth of new genes de novo.

This preprint was selected by the NODE (July 2016). It has also appeared on the redcedar PRBB blog. The work was presented at the XXI Evolution and Population Genetics Seminar, Oct 3-5 2016, Sitges (Barcelona).

Filed under Uncategorized

Gene regulation in a hibernating primate

We have published the first study on the molecular processes underlying primate hibernation. The study is the result of a collaboration between researchers at IMIM (Hospital del Mar Medical Research Institute, Barcelona) and at Duke University and the Duke Lemur Center (Durham, USA). The work is based on the fat-tailed dwarf lemur (Cheirogaleus medius), an extraordinary primate that is capable of enduring torpor (hibernation) for several months, subsisting only on the lipids stored in its tail. The project used high-throughput RNA sequencing (RNAseq) data to learn about the changes in gene expression in white adipose tissue during hibernation.

Reference: Faherty, S., Villanueva-Cañas, J.L. et al. Genome Biology and Evolution 2016

Related links:
IMIM press release
Duke Lemur Center
Sheena’s web page
Scientific American
El Periódico

Filed under differential gene expression, hibernation, Papers, RNA-Seq

Our group at Saló de l’Ensenyament (Education Fair)

How can we analyze genomes? What is junk DNA? Why is bioinformatics useful? Today, members of our group have been trying to answer these questions for the visitors of the Education Fair. The stand included a very realistic piece of “recycled” DNA and 3D printed protein structures.

IMIM-Bioinformatics at GRIB
Saló de L’Ensenyament, Barcelona

Filed under science, society

When we fail to detect homologues in other species, is it because they are too divergent or because they do not exist?

The increasing number of available genomes has made it possible to compare genes across species and determine in which branch of the phylogenetic tree they are likely to have originated. This has led to the identification of many genes that are species- or lineage-specific. As they have no homologues in other species, they must have originated from previously non-genic parts of the genome, that is, de novo. However, some researchers have claimed that errors in the detection of homologues by sequence similarity search methods, such as BLAST, may largely explain these observations. One way to assess how many genes are missed in these searches is to perform sequence evolution simulations along a phylogenetic tree and then use BLAST to recover the homologues (Albà and Castresana, 2007). If we fail to detect them, we have a sensitivity problem. As a result, a percentage of the genes will be misclassified as younger than they really are.
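The logic of such a simulation can be sketched in a few lines of Python. This is only a toy illustration and not the procedure used in the cited studies: it evolves a single pair of sequences rather than a whole tree, ignores insertions and deletions, and replaces BLAST with a simple identity threshold. The function names, the threshold and the parameter values are all assumptions made for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def evolve(seq, changed_fraction, rng):
    """Replace each site, with probability changed_fraction, by a different residue."""
    out = []
    for aa in seq:
        if rng.random() < changed_fraction:
            out.append(rng.choice([x for x in AMINO_ACIDS if x != aa]))
        else:
            out.append(aa)
    return "".join(out)


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def sensitivity(n_genes, length, changed_fraction, min_identity, seed=0):
    """Fraction of simulated homologues still 'detected' after divergence.

    Detection here means identity >= min_identity, a crude stand-in for a
    BLAST hit (assumption of this sketch, not a real BLAST criterion).
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_genes):
        ancestor = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        descendant = evolve(ancestor, changed_fraction, rng)
        if identity(ancestor, descendant) >= min_identity:
            detected += 1
    return detected / n_genes


# Homologues that go undetected would be misclassified as lineage-specific.
for fraction in (0.1, 0.5, 0.9):
    missed = 1.0 - sensitivity(200, 150, fraction, min_identity=0.3)
    print(f"changed fraction {fraction}: {missed:.1%} of homologues missed")
```

In a real analysis the detection step would be an actual BLAST search, and the sequences would evolve under a realistic substitution model with site-specific rates, which is what allows conserved motifs to keep distant homologues detectable.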

The simulations performed to date have all indicated that the percentage of error for proteins is relatively small (4.7% to 13.85%) even at long distances (from mammals to fungi or plants). As expected, the problem is worse for distant comparisons than for closer ones. For example, human and macaque, which diverged some 24 million years ago, differ by only 6 substitutions per 100 nucleotides. Lack of BLAST sensitivity is not going to be a problem for these species, even when comparing neutrally evolving sequences. For more distant comparisons, it depends on whether the sequence is under selection or not. Proteins tend to contain motifs that are highly conserved, and for this reason BLAST works reasonably well even at long distances. The results of the simulations support the idea that many genes are likely to have originated recently. For example, only 14 S. cerevisiae proteins would fail to find homologues in S. paradoxus or S. mikatae due to BLAST errors (Moyers and Zhang, 2016). Although the authors of the paper interpret this as problematic, the strong contrast with the observed data (445 genes restricted to these species in Carvunis et al., 2012) supports the notion that new genes are continuously emerging.
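To make the contrast concrete, the numbers quoted above can be put side by side with a little arithmetic. The snippet below is just a back-of-the-envelope check; the function names are made up for the example, and the identity estimate ignores multiple substitutions at the same site.

```python
def fraction_explained(expected_missed, observed_specific):
    """Share of observed lineage-specific genes that detection errors could explain."""
    return expected_missed / observed_specific


def approx_identity(substitutions, sites):
    """Approximate sequence identity, ignoring repeated hits at the same site."""
    return 1.0 - substitutions / sites


# 14 proteins expected to be missed by BLAST vs 445 genes observed as restricted
print(f"{fraction_explained(14, 445):.1%}")  # 3.1%

# 6 substitutions per 100 nucleotides between human and macaque
print(f"{approx_identity(6, 100):.0%}")  # 94%
```

In other words, if these figures are taken at face value, detection errors would account for only a small share of the genes observed to be restricted to these yeast species.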

Mar Albà
Update: A reply to Moyers & Zhang has been posted on bioRxiv: No evidence for phylostratigraphic bias impacting inferences on patterns of gene emergence and evolution

Filed under de novo gene evolution, science, society

“Origins of de novo genes in human and chimpanzee” published in PLOS Genetics

Novel genes are continuously emerging during evolution, but what drives this process? We have published a study in PLOS Genetics in which we find that the fortuitous appearance of certain combinations of elements in the genome can lead to the generation of new genes. The work, Origins of de novo genes in human and chimpanzee, is very similar to the one we published in arXiv some months ago. It includes some improvements resulting from the peer-review process and from having had more time to think about the paper.

In every genome, there are sets of genes that are unique to that particular species. In this study, we first identified thousands of genes that were specific to human or chimpanzee. Then, we searched the macaque genome and discovered that this species had significantly fewer regulatory motifs in the corresponding genomic sequences. These motifs are recognized by proteins that activate gene expression, a necessary step in the formation of a new gene.

The formation of genes de novo from previously non-active parts of the genome was, until recently, considered highly improbable. This study has shown that the mutations that occur normally in our genetic material may be sufficient to explain how this happens. Once expressed, the genes can act as a substrate for the evolution of new molecular functions. This study identified several candidate human proteins that bear no resemblance to any other known protein but which contain signatures of purifying selection.


Jorge Ruiz-Orera, Jessica Hernandez-Rodriguez, Cristina Chiva, Eduard Sabidó, Ivanela Kondova, Ronald Bontrop, Tomàs Marqués-Bonet, M. Mar Albà. Origins of De Novo Genes in Human and Chimpanzee. PLOS Genetics, 2015; 11 (12): e1005721.

Filed under de novo gene evolution, lncRNA, Papers, science

Bioinformatics for all

Several volunteers from GRIB explained what bioinformatics is to non-experts during the 2015 PRBB open day on Oct 18 2015.

Will Blevins and José Luis Villanueva from the Evolutionary Genomics group tried to convey the concept of “orphan genes” using wooden pieces representing genes. Below, Will with the Prezi presentation.

Filed under de novo gene evolution, education, science, society

Quanta Magazine article on de novo genes

De novo genes are genes that do not arise from gene duplication but from previously non-genic regions in the genome. These genes started to be detected about 10 years ago and have gained increased recognition as an important component of evolutionary innovation. A historical account of the discovery of de novo genes was published in Quanta Magazine on August 18 2015: How new genes arise from scratch. The author, Emily Singer, was at the SMBE 2015 meeting in Vienna and attended the Symposium Origins and Evolution of Molecular Innovation, in which some of the latest developments in the field were presented. After interviewing some of the key players, she has produced this report, which summarizes the main turning points and the challenges ahead. Exciting to see all this taking shape after not always easy beginnings!

Other links: Scientific American; Quanta Podcast.

Filed under de novo gene evolution

How can new genes arise de novo?

Some days ago we presented the results of our most recent work on the evolution of new genes at SMBE 2015. In the talk “The link between pervasive transcription and de novo gene evolution” we discussed the possible mechanisms of formation of new genes from regions in the genome which did not previously express any gene. The study was based on transcriptomics data from several mammalian species and focused on genes detected only in the human and chimp lineages. We found that genomic regions that express new genes have gained new regulatory motifs with respect to the corresponding regions in species that do not express the genes. This is consistent with the idea that the formation of new promoters can drive the birth of new genes.

See Haldane’s sieve post for the abstract and link to the preprint Origins of de novo genes in human and chimpanzee. The abstract has been among the Most viewed on Haldane’s sieve July 2015.
Press: Biotech-Spain.

Filed under de novo gene evolution, Uncategorized