ANNOVAR: Genetic Variant Annotation

BIOBASE ANNOVAR is an efficient command-line Perl program for the functional annotation of genetic variants from high-throughput sequencing data. It supports diverse genomes, including human (hg18, hg19), mouse, worm, fly, yeast and many others.

High-throughput sequencing platforms generate massive amounts of genetic variation data, and pinpointing the small subset of functionally important variants remains a challenge. ANNOVAR was developed to fill this need. It shortlists single nucleotide variants and insertions/deletions using up-to-date annotation: examining their functional consequences on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, and identifying variants reported in the 1000 Genomes Project and dbSNP.

BIOBASE Genome Trax™, which includes data from HGMD®, PROTEOME™ and TRANSFAC®, is fully compatible with ANNOVAR for the most comprehensive analysis. With ANNOVAR and Genome Trax combined, you can identify and annotate known disease-causing inherited mutations in your data set. It is also possible to identify known transcription factor binding sites and pathway, drug and disease relations. ANNOVAR has been cited by over 200 scientific publications.

Benefits

  • Minimal-setup annotation: instead of spending weeks or months writing your own scripts to add annotation capabilities to your pipeline, deploy ANNOVAR and be ready to annotate immediately
  • Filter and annotate variants from whole genome and exome sequencing in batch, making use of a wide array of prepared and up-to-date annotation sources
  • Retrieves the most up-to-date versions of annotation sources
  • Configurable to enhance your specific pipeline (Perl source code)
  • Retrieve the nucleotide and protein sequence for any user-specified genomic positions in batch, identify a candidate gene list for Mendelian diseases from exome data, and use other utilities
  • Generate Excel-compatible files for manual examination of a diverse range of functional annotations for your exome/genome data

Key Features

  • Annotate a whole genome in about 4 minutes or identify important variants in about 15 minutes. Handle hundreds of human genomes per day on a desktop machine
  • Variant summaries: Excel-compatible files for review, generated from standard variant files, with gene annotation, amino acid change annotation, SIFT, PolyPhen, LRT and MutationTaster scores, PhyloP and GERP++ conservation scores, dbSNP identifiers, 1000 Genomes and ESP exome project allele frequencies, and other information for all variants in a genome
  • Use a configurable ‘variant reduction’ protocol based on annotation filtering and selection steps to identify a small subset of most likely causal variants
  • Ready-to-go annotation: utilize HGMD® and other Genome Trax™ annotation (license required), any annotation track from the UCSC Genome Browser, any annotation conforming to Generic Feature Format version 3 (GFF3) and precompiled up-to-date data sources:
    • variants reported in dbSNP
    • common SNPs (MAF >1%) from the 1000 Genomes Project
    • non-synonymous SNPs
    • variants that affect protein sequence
    • SNPs predicted to be deleterious by tools such as SIFT, PolyPhen2 and MutationTaster
    • intergenic variants with high conservation (GERP++ or PhyloP scores)
    • variants in specific genomic regions, for example, regions conserved in 44 species, transcription factor binding sites, segmental duplication regions, GWAS hits, DNase I hypersensitivity sites
    • many other annotations on genomic intervals and gene function
  • Works with RefSeq, UCSC, ENSEMBL, GENCODE and many other gene definition systems, and with multiple species
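
The 'variant reduction' protocol listed above can be pictured as a chain of filtering steps, each of which keeps only the variants that pass one annotation-based test. The following is a minimal Python sketch of that idea only; it is not ANNOVAR's implementation. The `Variant` fields, the filter functions, the step names and the four example variants are all hypothetical and chosen purely for illustration. The 1% threshold mirrors the "MAF >1%" definition of common SNPs in the list above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical, already-annotated variant record (illustration only)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str                 # e.g. "nonsynonymous", "synonymous", "stopgain"
    maf_1000g: Optional[float]  # allele frequency; None if not observed


def changes_protein(v: Variant) -> bool:
    """Keep variants whose annotated effect alters the protein sequence."""
    return v.effect in {"nonsynonymous", "stopgain", "frameshift"}


def is_rare(v: Variant, max_maf: float = 0.01) -> bool:
    """Keep variants that are absent from, or rare in, the frequency data."""
    return v.maf_1000g is None or v.maf_1000g <= max_maf


Step = Tuple[str, Callable[[Variant], bool]]


def reduce_variants(variants: Iterable[Variant], steps: List[Step]) -> List[Variant]:
    """Apply each filtering step in order and report how many variants remain."""
    remaining = list(variants)
    for name, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        print(f"after '{name}': {len(remaining)} variant(s) remain")
    return remaining


# Made-up example data, not real variants.
variants = [
    Variant("chr1", 1000, "A", "G", "nonsynonymous", None),
    Variant("chr1", 2000, "C", "T", "synonymous", None),
    Variant("chr2", 3000, "G", "A", "nonsynonymous", 0.12),
    Variant("chr3", 4000, "T", "C", "stopgain", 0.002),
]

steps = [
    ("protein-changing", changes_protein),
    ("rare in population", is_rare),
]

shortlist = reduce_variants(variants, steps)
```

Because each step is a named function, the order and choice of filters can be changed without touching the loop, which is the sense in which such a protocol is configurable.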

Application Note: Profiling Breast Cancer Variants
Overview Datasheet (pdf)

What Users Say

ANNOVAR fills a huge gap in any NGS analysis pipeline. I didn’t find any tool even remotely comparable in ease of use and speed.


This is a state-of-the-art, very flexible and useful tool for annotating sequencing variants. The databases are updated frequently by the developer to reflect the most recent progress/changes in the field. We love it and incorporate it into our NGS pipeline for data release.


I have been using ANNOVAR for about a year now and I prefer it over all other SNP effect predictors due to its comprehensiveness and ease of use.


ANNOVAR is an excellent flexible tool for analyzing NGS data. Without it, I would not be able to analyze these data. Please continue with the development of this wonderful program!


ANNOVAR is a great versatile tool and has quickly become one of the most widely used annotation platforms for NGS data.


ANNOVAR is the perfect solution for our SNP detection platform. Highly modular and easy to adapt.


I use ANNOVAR to annotate SNPs detected through RNAseq.


ANNOVAR provides a great resource for straightforward annotation of variants from next-generation sequencing experiments, based on a variety of up-to-date databases and is especially useful due to inclusion of 1000Genomes and ESP databases.

 

Access Options

ANNOVAR is freely available only to users from academic institutions/non-profit organizations.  All commercial users are required to purchase a license.

Download
A download subscription allows for local installation of the complete ANNOVAR Perl program. Download access allows you to query and extract all fields of data for integration into your own, or third-party, analysis pipelines and tools. However, redistribution of, or public access to or display of, the software or data is not allowed without prior written consent.

ANNOVAR is written in Perl and can be run as a standalone command-line application on diverse hardware systems where standard Perl modules are installed. No graphical user interface is provided. Most of the databases that ANNOVAR uses can be retrieved directly from the UCSC Genome Browser Annotation Database.
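
Because ANNOVAR runs as a command-line application, a pipeline typically wraps it by assembling an argument list and handing it to the operating system. The sketch below only builds and prints such a command; it does not run anything. It assumes the `table_annovar.pl` script and the `-buildver`, `-out`, `-remove`, `-protocol`, `-operation`, `-nastring` and `-csvout` options as described in ANNOVAR's public documentation, none of which are named on this page, so check them against your installed version. The file names `sample.avinput`, `humandb/` and `sample_anno`, and the helper function itself, are hypothetical.

```python
import shlex
from typing import List, Sequence


def build_table_annovar_cmd(
    avinput: str,
    db_dir: str,
    out_prefix: str,
    build: str = "hg19",
    protocols: Sequence[str] = ("refGene", "cytoBand"),
    operations: Sequence[str] = ("g", "r"),
) -> List[str]:
    """Return an argument list for table_annovar.pl (assumed script and flags).

    Each protocol (annotation source) needs a matching operation code,
    e.g. "g" for gene-based and "r" for region-based annotation.
    """
    if len(protocols) != len(operations):
        raise ValueError("protocols and operations must have the same length")
    return [
        "perl", "table_annovar.pl", avinput, db_dir,
        "-buildver", build,
        "-out", out_prefix,
        "-remove",                       # delete intermediate files
        "-protocol", ",".join(protocols),
        "-operation", ",".join(operations),
        "-nastring", ".",                # placeholder for missing values
        "-csvout",                       # comma-separated, Excel-compatible output
    ]


# Hypothetical input, database directory and output prefix.
command = build_table_annovar_cmd("sample.avinput", "humandb/", "sample_anno")
print(shlex.join(command))
```

To execute the command from a pipeline, the list could be passed to `subprocess.run(command, check=True)` from a directory where the script and databases are available.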

>> Quick start-up guide