Variant annotation and interpretation | Griffith Lab

Genomic Visualization and Interpretations

Variant annotation and interpretation

When variants are identified in the genome (or transcriptome), some kind of annotation and interpretation invariably follows. There are many tools for annotation and interpretation in different contexts and for different purposes. In this section we explore just a few of these options. First we will learn to use Ensembl’s Variant Effect Predictor (VEP), a widely used variant transcript annotator. VEP has many functions, but it is used first and foremost to annotate variants in the context of a set of known transcripts. The other resources we will use, ClinVar and CIViC, attempt to summarize evidence for the clinical relevance of variants in inherited human diseases and cancer, respectively.

Here are some examples of variant annotation and interpretation contexts:

  • Population frequency/recurrence (is the variant common or rare?)
  • Transcripts (does the variant occur within a transcribed region of a gene? Does it affect the predicted translation of that transcript?)
  • Function (is the variant likely to disrupt the normal function of a gene?). There are many approaches to this:
    • Conservation of the affected region
    • Predicted biochemical significance of amino acid alterations
    • Occurrence in known functional domains (e.g. the binding pocket of a kinase)
    • Hot spots of variation (some patterns can suggest gain-of-function)
      • 2D hotspots
      • 3D hotspots
    • Patterns that suggest loss of function
    • Actual experimental evidence for the specific variant or one very similar
  • What other approaches can you think of?

Module 5 Lecture

Pathway visualization

A common task after pathway analysis is constructing visualizations to represent experimental data for pathways of interest. There are many tools for this. We will focus on the Bioconductor pathview package for this task.


Pathview is used to integrate and display data on KEGG pathway maps that it retrieves through API queries to the KEGG database. Please refer to the pathview vignette and the KEGG website for license information, as there may be restrictions on commercial use of these API queries. Pathview itself is open source and is able to map a wide variety of biological data relevant to pathway views. In this section we will be mapping the overall expression results for a few pathways from the pathway analysis section of this course. Let’s start by installing pathview from Bioconductor and loading the data we created in the previous section.

# Install pathview from bioconductor
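A minimal sketch of the installation step, assuming the standard BiocManager route for Bioconductor packages. It does not load the course data, because the file produced in the previous section is not named here; instead it only reports whether the two objects used later (tumor_v_normal_DE.fc and fc.kegg.sigmet.p.up) are already present in the session.

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install pathview from Bioconductor only if it is missing
if (!requireNamespace("pathview", quietly = TRUE)) {
  BiocManager::install("pathview")
}
library(pathview)

# Assumption: these objects come from the pathway analysis section
needed <- c("tumor_v_normal_DE.fc", "fc.kegg.sigmet.p.up")
is_loaded <- vapply(needed, function(x) exists(x, envir = globalenv()), logical(1))
if (!all(is_loaded)) {
  message("Not yet loaded: ", paste(needed[!is_loaded], collapse = ", "))
}
```

If the message lists either object, load or re-create it from the previous section before continuing.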


Visualizing KEGG pathways

Now that we have our initial data loaded, let’s choose a few pathways to visualize. The “Mismatch repair” pathway is significantly perturbed by up-regulated genes and corresponds to the KEGG ID “hsa03430”. We can view this using the row names of the pathway dataset fc.kegg.sigmet.p.up. Let’s use our experiment’s expression values in tumor_v_normal_DE.fc and view them in the context of this pathway. Two graphs will be written to your current working directory by the pathview() function: one will be the original KEGG pathway view and the second will have the expression values overlaid. You can find your current working directory with the function getwd().

# View the hsa03430 pathway from the pathway analysis
fc.kegg.sigmet.p.up[grepl("hsa03430", rownames(fc.kegg.sigmet.p.up), fixed=TRUE),]

# Overlay the expression data onto this pathway
pathview(gene.data=tumor_v_normal_DE.fc, species="hsa", pathway.id="hsa03430")
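To make the row-name lookup above concrete, here is a small self-contained sketch using toy stand-ins for the two objects. The toy_* names and all numeric values are invented for illustration. The row-name format ("hsa03430 Mismatch repair") and the use of Entrez gene IDs as names on the fold-change values are assumptions about what the pathway analysis section produced; Entrez is the default gene ID type that pathview expects.

```r
# Toy stand-in for fc.kegg.sigmet.p.up (values are invented)
toy_sigmet <- data.frame(
  p.val = c(0.001, 0.004, 0.020),
  q.val = c(0.010, 0.020, 0.080),
  row.names = c("hsa03430 Mismatch repair",
                "hsa03460 Fanconi anemia pathway",
                "hsa03030 DNA replication")
)

# fixed = TRUE treats the pattern as a literal string, not a regular expression
hit <- toy_sigmet[grepl("hsa03430", rownames(toy_sigmet), fixed = TRUE), ]
print(hit)

# Toy stand-in for the expression input: log2 fold changes named by Entrez gene ID
# (4436 = MSH2, 2956 = MSH6, 4292 = MLH1)
toy_fc <- c("4436" = 1.8, "2956" = 1.2, "4292" = -0.3)
print(toy_fc)
```

Matching on the KEGG ID rather than the full row name keeps the lookup working even if the descriptive part of the name differs slightly between KEGG releases.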

It is often nice to see the relationships between genes in the KEGG pathview diagrams. This can be achieved by setting the parameter kegg.native=FALSE, which renders the pathway with a Graphviz layout instead of the native KEGG image. Below we show an example for the Fanconi anemia pathway.

# View the hsa03460 pathway from the pathway analysis
fc.kegg.sigmet.p.up[grepl("hsa03460", rownames(fc.kegg.sigmet.p.up), fixed=TRUE),]

# Overlay the expression data onto this pathway
pathview(gene.data=tumor_v_normal_DE.fc, species="hsa", pathway.id="hsa03460", kegg.native=FALSE)
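After running both calls it can be handy to check which overlay files were written. The sketch below assumes the naming convention described in the pathview documentation as we understand it: the overlay is saved as the pathway ID, then an output suffix (default "pathview"), then .png for the native KEGG view or .pdf for the Graphviz view. The helper expected_pathview_output() is hypothetical and is not part of pathview.

```r
# Hypothetical helper: build the overlay file name we expect pathview to write
expected_pathview_output <- function(pathway.id, kegg.native = TRUE,
                                     out.suffix = "pathview") {
  ext <- if (kegg.native) "png" else "pdf"
  paste0(pathway.id, ".", out.suffix, ".", ext)
}

overlay_files <- c(
  expected_pathview_output("hsa03430"),                      # native KEGG view
  expected_pathview_output("hsa03460", kegg.native = FALSE)  # Graphviz view
)
print(overlay_files)

# Report which of these files are present in the current working directory
print(file.exists(file.path(getwd(), overlay_files)))
```

If a file is reported as missing, confirm the working directory with getwd() and check the messages printed by pathview() for download or rendering errors.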