Research lines

Temporal aspects of gene evolution

Genes evolve at different rates. We are investigating the causes of this variability, focusing on the effect of gene age and function on evolutionary rate. We have observed that genes of recent origin, such as primate- or mammalian-specific genes, evolve much more rapidly than genes of ancient origin. This is consistent with weak functional constraints near the time of a gene's birth. Intrigued by the mechanisms that enable the formation of new genes, we have analysed the features of primate-specific genes (genes lacking homologues in non-primate species). We have found that these genes are highly enriched in sequences derived from transposable elements, indicating that these elements might play an important and previously unsuspected role in novel gene formation.

Selected Publications
- Toll-Riera, M., Bosch, N., Bellora, N., Castelo, R., Armengol, Ll., Estivill, X., Albà, M.M. (2009). Origin of primate orphan genes: a comparative genomics approach. Mol. Biol. Evol., Vol. 26: 603-612.
- Albà, M.M., Castresana, J. (2007). On homology searches by protein Blast and the characterization of the age of genes. BMC Evol. Biol., Vol. 7: 53.
- Albà, M.M., Castresana, J. (2005). Inverse Relationship between Evolutionary Rate and Age of Mammalian Genes. Mol. Biol. Evol., Vol. 22: 598-606.
- Castresana, J., Guigó, R., Albà, M.M. (2004). Clustering of genes coding for DNA binding proteins in a region of atypical evolution of the human genome. J. Mol. Evol., Vol. 59: 72-79.

Role of low-complexity sequences in protein evolution

Low-complexity sequences, including homopolymeric tracts and other short amino acid tandem repeats, are extremely abundant in eukaryotic proteins. These sequences may expand or contract rapidly through replication slippage and/or recombination. We are performing several analyses to understand the evolutionary dynamics of low-complexity sequences and their contribution to protein function. To address these questions, we are using sets of repeats showing different degrees of conservation in vertebrate orthologous proteins. We are also investigating the impact of short insertions and deletions, and their relationship to repeats, in the evolution of mammalian proteins.
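As a simple illustration of what a homopolymeric tract is, the following Python sketch scans a protein sequence for runs of a single amino acid. It is not the method used in the publications below: the function name, the minimum run length of 5 residues and the example sequence are all assumptions chosen for illustration.

```python
import re

def find_homopolymer_tracts(protein_seq, min_length=5):
    """Return (residue, start, length) for every run of one amino acid
    that is at least min_length residues long (0-based start).

    min_length=5 is an illustrative threshold, not a published cutoff.
    """
    # One uppercase letter followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein_seq)]

# Hypothetical sequence with a glutamine tract and a histidine tract.
example = "MST" + "Q" * 8 + "APL" + "H" * 6 + "GK"
tracts = find_homopolymer_tracts(example)
for residue, start, length in tracts:
    print(residue, start, length)
```

Comparing the tracts found at equivalent positions in orthologous proteins from several species would be one way to see whether a given repeat is conserved or variable in length.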

Selected Publications
- Mularoni, L., Ledda, A., Toll-Riera, M., Albà, M.M. (2010). Natural selection drives the accumulation of amino acid tandem repeats in human proteins. Genome Res., Advanced Online 24th March.
- Salichs, E., Ledda, A., Mularoni, L., Albà, M.M., de la Luna, S. (2009). Genome-wide analysis of histidine repeats reveals their role in the localization of human proteins to the nuclear speckles compartment. PLoS Genet., Vol. 5: e1000397.
- Mularoni, L., Veitia, R.A., Albà, M.M. (2007). Highly constrained proteins contain an unexpectedly large number of amino acid tandem repeats. Genomics, Vol. 89: 316-325.
- Mularoni, L., Guigó, R., Albà, M.M. (2006). Mutation patterns of amino acid tandem repeats in the human proteome. Genome Biol., Vol. 7: R33.
- Albà, M.M., Guigó, R. (2004). Comparative analysis of amino acid repeats in rodents and humans. Genome Res., Vol. 14: 549-554.

Analysis of gene expression regulatory sequences

Transcription regulatory sequences contain complex arrangements of motifs that are recognized by transcription factors. Many of these motifs are located upstream of transcription start sites, in promoters. We have shown that housekeeping genes have less conserved promoters than genes with tissue-specific expression, particularly upstream of position -500. This suggests that genes with constitutive expression require shorter functional promoters. We have also identified subsets of regulatory motifs that are over-represented in housekeeping or tissue-specific promoters. We have analysed motif positional bias in yeast and mammalian promoters with the in-house software PEAKS.

Selected Publications
- Farré, D., Albà, M.M. (2010). Heterogeneous patterns of gene expression diversification in mammalian gene duplicates. Mol. Biol. Evol., Vol. 27: 325-335.
- Farré, D., Bellora, N., Mularoni, L., Messeguer, X., Albà, M.M. (2007). Housekeeping genes tend to show reduced upstream sequence conservation. Genome Biol., Vol. 8: R140.
- Bellora, N., Farré, D., Albà, M.M. (2007). PEAKS: Identification of regulatory motifs by their position in DNA sequences. Bioinformatics, Vol. 23: 243-244.
- Farré, D., Roset, R., Huerta, M., Adsuara, J.E., Roselló, L., Albà, M.M., Messeguer, X. (2003). Identification of patterns in biological sequences at the ALGGEN server: PROMO and MALGEN. Nucl. Acids Res., Vol. 31: 3651-3653.
- Messeguer, X., Escudero, R., Farré, D., Núñez, O., Martínez, J., Albà, M.M. (2002). PROMO: detection of known transcription regulatory elements using species-tailored searches. Bioinformatics, Vol. 18: 333-334.