Gene / Genome Annotation


Genome annotation is the process of attaching pertinent information to regions of the nucleotide sequence in a genome database. The first step is to analyze the sequence and decide which regions correspond to genes and other biological units; this step is called 'gene prediction'. The second step is to attach biological information to those units.
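As a rough illustration of the two steps described above, the following Python sketch first finds open reading frames (a deliberately naive stand-in for gene prediction) and then attaches descriptive information to each one. The `Feature` class, the `predict_orfs` and `annotate` functions, the example sequence, and the 'hypothetical protein' label are all invented for this illustration; real gene predictors use far more sophisticated models.

```python
from dataclasses import dataclass, field

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Feature:
    """A region of the sequence plus whatever is known about it."""
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    kind: str
    notes: dict = field(default_factory=dict)


def predict_orfs(seq, min_codons=3):
    """Step 1 (toy gene prediction): forward-strand ORFs, ATG through stop.

    min_codons counts every codon in the ORF, including the stop codon.
    """
    seq = seq.upper()
    features = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    features.append(Feature(start, i + 3, "ORF"))
                start = None  # look for the next ORF in this frame
    return sorted(features, key=lambda f: f.start)


def annotate(feature, **info):
    """Step 2: attach biological information to a predicted feature."""
    feature.notes.update(info)
    return feature


if __name__ == "__main__":
    sequence = "CCATGAAATTTTAGGG"  # invented example sequence
    for orf in predict_orfs(sequence):
        annotate(orf, product="hypothetical protein")
        print(orf.start, orf.end, sequence[orf.start:orf.end], orf.notes)
```

Keeping the prediction and annotation steps as separate functions mirrors the description above: coordinates are decided once, and information can be added to them later from any source.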

Types of Gene Annotation

There are many types of gene annotation. In general, any time you add data to a gene sequence, you are 'annotating' that gene.

Gene Annotation Tools

IMG/MER - IMG Data Management and Analysis Systems

The Integrated Microbial Genomes with Microbiome Samples - Expert Review (IMG/MER) system supports individual scientists or groups of scientists in the annotation, analysis, and review of their microbial community metagenome datasets.

MG-RAST (the Metagenomics RAST) server is an automated analysis platform for metagenomes providing quantitative insights into microbial populations based on sequence data.



GMOD includes MAKER and several other annotation options, including predictions from SNAP, Augustus, FGENESH, and GeneMark, as well as final gene models from MAKER, EST alignments from both Exonerate and BLASTN, protein alignments from Exonerate and BLASTX, and repeats from RepeatMasker and MAKER's internal RepeatRunner.


Reference Genome Annotation Pipeline

Here is a reference for how to structure your annotation pipeline. It also contains a fantastic introduction to gene annotation.

Entrez - Global Query Cross-Database Search System

Entrez is a very powerful search system that allows many databases to be searched simultaneously for information.
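Entrez databases can also be queried programmatically through the NCBI E-utilities. The sketch below only builds ESearch URLs for the same term against several databases and does not send any requests. The `build_search_urls` helper, the query term, and the choice of databases are illustrative assumptions, and running one query per database is only an approximation of the cross-database search offered on the website.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_urls(term, databases):
    """Return one ESearch URL per database for the same query term."""
    return {
        db: ESEARCH + "?" + urlencode({"db": db, "term": term, "retmode": "json"})
        for db in databases
    }


if __name__ == "__main__":
    # Example term and database names chosen for illustration only.
    urls = build_search_urls("BRCA1", ["gene", "protein", "nuccore", "pubmed"])
    for db, url in urls.items():
        print(db, url)
```

Each URL can then be fetched with any HTTP client; NCBI's usage guidelines on request rates apply.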

This software package focuses mainly on vertebrates.

Gene Ontology (GO)

GO seeks to provide a unified toolset for creating and accessing gene annotation information, along with a unified vocabulary and representation for gene attributes.
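To show why a unified vocabulary is useful, here is a minimal Python sketch in which several genes are annotated with shared GO term identifiers, so that one identifier retrieves every gene with that attribute. The three GO terms are real, but the gene names (`geneA`, `geneB`, `geneC`), their assignments, and the helper functions are hypothetical and exist only for this example.

```python
import re

# GO identifiers have the form "GO:" followed by seven digits.
GO_ID = re.compile(r"^GO:\d{7}$")

# A tiny excerpt of the vocabulary: id -> (term name, ontology aspect).
GO_TERMS = {
    "GO:0005524": ("ATP binding", "molecular_function"),
    "GO:0006412": ("translation", "biological_process"),
    "GO:0005634": ("nucleus", "cellular_component"),
}

# Hypothetical genes and annotations, invented for illustration.
ANNOTATIONS = {
    "geneA": ["GO:0005524", "GO:0005634"],
    "geneB": ["GO:0006412"],
    "geneC": ["GO:0005524"],
}


def genes_with_term(go_id, annotations=ANNOTATIONS):
    """Return the sorted names of all genes annotated with go_id."""
    if not GO_ID.match(go_id):
        raise ValueError(f"not a valid GO identifier: {go_id!r}")
    return sorted(g for g, terms in annotations.items() if go_id in terms)


def describe(gene, annotations=ANNOTATIONS):
    """Translate a gene's GO identifiers into (name, aspect) pairs."""
    return [GO_TERMS[term] for term in annotations[gene]]


if __name__ == "__main__":
    print(genes_with_term("GO:0005524"))
    print(describe("geneA"))
```

Because every gene refers to the same identifiers, a query does not depend on how individual annotators happened to word a description.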

