2006
Blanco, Enrique; Farré, Domènec; Albà, M Mar; Messeguer, Xavier; Guigó, Roderic. ABS: a database of Annotated regulatory Binding Sites from orthologous promoters. (Article) Nucleic acids research, 34 (Database issue), pp. D63–7, 2006, ISSN: 1362-4962.

@article{Blanco2006,
  title = {ABS: a database of Annotated regulatory Binding Sites from orthologous promoters.},
  author = {Blanco, Enrique and Farré, Domènec and Albà, M Mar and Messeguer, Xavier and Guigó, Roderic},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1347478&tool=pmcentrez&rendertype=abstract},
  issn = {1362-4962},
  year = {2006},
  date = {2006-01-01},
  journal = {Nucleic acids research},
  volume = {34},
  number = {Database issue},
  pages = {D63--7},
  abstract = {Information about the genomic coordinates and the sequence of experimentally identified transcription factor binding sites is found scattered under a variety of diverse formats. The availability of standard collections of such high-quality data is important to design, evaluate and improve novel computational approaches to identify binding motifs on promoter sequences from related genes. ABS (http://genome.imim.es/datasets/abs2005/index.html) is a public database of known binding sites identified in promoters of orthologous vertebrate genes that have been manually curated from bibliography. We have annotated 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple and easy-to-use web interface facilitates data retrieval allowing different views of the information. In addition, the release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites contained in the collection and an evaluation tool to aid during the training and the assessment of motif-finding programs.},
  keywords = {Animals, Binding Sites, Chickens, Chickens: genetics, Databases, Genetic, Genomics, Humans, Internet, Mice, Nucleic Acid, Promoter Regions, Rats, Transcription Factors, Transcription Factors: metabolism, User-Computer Interface}
}
2004
Gibbs, Richard A, et al. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. (Article) Nature, 428 (6982), pp. 493–521, 2004, ISSN: 1476-4687.

@article{Gibbs2004,
  title = {Genome sequence of the Brown Norway rat yields insights into mammalian evolution.},
  author = {Gibbs, Richard A and others},
  url = {http://www.ncbi.nlm.nih.gov/pubmed/15057822},
  issn = {1476-4687},
  year = {2004},
  date = {2004-01-01},
  journal = {Nature},
  volume = {428},
  number = {6982},
  pages = {493--521},
  abstract = {The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.},
  keywords = {Animals, Base Composition, Centromere, Centromere: genetics, Chromosomes, CpG Islands, CpG Islands: genetics, DNA, DNA Transposable Elements, DNA Transposable Elements: genetics, Evolution, Gene Duplication, Genome, Genomics, Humans, Inbred BN, Inbred BN: genetics, Introns, Introns: genetics, Male, Mammalian, Mammalian: genetics, Mice, Mitochondrial, Mitochondrial: genetics, Models, Molecular, Mutagenesis, Nucleic Acid, Nucleic Acid: genetics, Polymorphism, Rats, Regulatory Sequences, Retroelements, Retroelements: genetics, RNA, RNA Splice Sites, RNA Splice Sites: genetics, Sequence Analysis, Single Nucleotide, Single Nucleotide: genetics, Telomere, Telomere: genetics, Untranslated, Untranslated: genetics}
}
Huang, Hui; Winter, Eitan E; Wang, Huajun; Weinstock, Keith G; Xing, Heming; Goodstadt, Leo; Stenson, Peter D; Cooper, David N; Smith, Douglas; Albà, M Mar; Ponting, Chris P; Fechtel, Kim. Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes. (Article) Genome biology, 5 (7), pp. R47, 2004, ISSN: 1465-6914.

@article{Huang2004,
  title = {Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes.},
  author = {Huang, Hui and Winter, Eitan E and Wang, Huajun and Weinstock, Keith G and Xing, Heming and Goodstadt, Leo and Stenson, Peter D and Cooper, David N and Smith, Douglas and Albà, M Mar and Ponting, Chris P and Fechtel, Kim},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=463309&tool=pmcentrez&rendertype=abstract},
  issn = {1465-6914},
  year = {2004},
  date = {2004-01-01},
  journal = {Genome biology},
  volume = {5},
  number = {7},
  pages = {R47},
  abstract = {Model organisms have contributed substantially to our understanding of the etiology of human disease as well as having assisted with the development of new treatment modalities. The availability of the human, mouse and, most recently, the rat genome sequences now permit the comprehensive investigation of the rodent orthologs of genes associated with human disease. Here, we investigate whether human disease genes differ significantly from their rodent orthologs with respect to their overall levels of conservation and their rates of evolutionary change.},
  keywords = {Amino Acid, Amino Acid: genetics, Animal, Animals, Chromosome Mapping, Chromosome Mapping: methods, Conserved Sequence, Conserved Sequence: genetics, Disease Models, Evolution, Fishes, Fishes: genetics, Fungal, Fungal: genetics, Genes, Genes: genetics, Genes: physiology, Genetic, Genetic Diseases, Genome, Helminth, Helminth: genetics, human, Humans, Inborn, Inborn: genetics, Inborn: physiopathology, Insect, Insect: genetics, Mice, Molecular, Mutagenesis, Mutagenesis: genetics, Nucleic Acid, Nucleotides, Nucleotides: genetics, Point Mutation, Point Mutation: genetics, Rats, Repetitive Sequences, Selection, Sequence Homology, Trinucleotide Repeat Expansion, Trinucleotide Repeat Expansion: genetics}
}
Albà, M Mar; Guigó, Roderic. Comparative analysis of amino acid repeats in rodents and humans. (Article) Genome research, 14 (4), pp. 549–54, 2004, ISSN: 1088-9051.

@article{Alba2004,
  title = {Comparative analysis of amino acid repeats in rodents and humans.},
  author = {Albà, M Mar and Guigó, Roderic},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=383298&tool=pmcentrez&rendertype=abstract},
  issn = {1088-9051},
  year = {2004},
  date = {2004-01-01},
  journal = {Genome research},
  volume = {14},
  number = {4},
  pages = {549--54},
  abstract = {Amino acid tandem repeats, also called homopolymeric tracts, are extremely abundant in eukaryotic proteins. To gain insight into the genome-wide evolution of these regions in mammals, we analyzed the repeat content in a large data set of rat-mouse-human orthologs. Our results show that human proteins contain more amino acid repeats than rodent proteins and that trinucleotide repeats are also more abundant in human coding sequences. Using the human species as an outgroup, we were able to address differences in repeat loss and repeat gain in the rat and mouse lineages. In this data set, mouse proteins contain substantially more repeats than rat proteins, which can be at least partly attributed to a higher repeat loss in the rat lineage. The data are consistent with a role for trinucleotide slippage in the generation of novel amino acid repeats. We confirm the previously observed functional bias of proteins with repeats, with overrepresentation of transcription factors and DNA-binding proteins. We show that genes encoding amino acid repeats tend to have an unusually high GC content, and that differences in coding GC content among orthologs are directly related to the presence/absence of repeats. We propose that the different GC content isochore structure in rodents and humans may result in an increased amino acid repeat prevalence in the human lineage.},
  keywords = {Amino Acid, Amino Acid: genetics, Amino Acid: physiology, Animals, Chromosome Mapping, Chromosome Mapping: methods, Chromosome Mapping: statistics & numerical data, Computational Biology, Computational Biology: methods, Computational Biology: statistics & numerical data, GC Rich Sequence, GC Rich Sequence: genetics, Humans, Mice, Proteins, Proteins: chemistry, Proteins: genetics, Proteins: physiology, Rats, Repetitive Sequences, Trinucleotide Repeats, Trinucleotide Repeats: genetics}
}