Plot-hits: plotting blast hits in an annotated contig or chromosome

The work is supported by the Russian Foundation for Basic Research, grant no. 14-04-01693.

The plot-hits utilities facilitate human judgement about gene annotations by visualizing blast hits in contigs and chromosomes.

Two sets of data are displayed simultaneously in the coordinates of a contig or chromosome. The first set contains gene coordinates from the contig or chromosome annotation. The second set contains blast hits from a search of the contig or chromosome against a protein database, for example UniRef90. The result is shown in Fig.1.

Fig.1. Sample output. Part of contig named NODE_4 from 3000 to 11000 bp is shown.

The X-axis is the coordinate in the contig. The Y-axis is hit identity (% of identical amino acids). Annotated genes are named g3, g4, etc. and are shown as colored vertical bars. The color scheme distinguishes:
  • genes on direct (cold shades) and reverse (warm shades) strands;
  • exons (more intense shades) and introns (less intense shades);
  • frame shifts of exons (different colors).

Blastx hits are shown as horizontal segments whose height corresponds to hit identity (%). Hit color corresponds to the reading frame in the contig, and the color intensity is greater than that of the exons.

Inconsistencies in NODE_4:
  • The reading frames of five blast hits in the third exon of gene g4 differ from the exon's reading frame.
  • Blast hits in gene g3, located on the reverse strand, do not match the reading frames of the 1st and 3rd exons and overlap introns 1-2 and 2-3.
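
The kind of frame comparison that the plot makes visible can also be sketched in code. The example below is only an illustration and is not part of plot-hits: the function names, the tuple layout of a hit, and the frame numbering (1..3 on the direct strand, -1..-3 on the reverse strand, counted from the contig ends as blastx does) are assumptions. The exon phase is assumed to follow the GFF convention, that is, the number of bases to skip before the first complete codon.

```python
def exon_frame(start, end, strand, phase, contig_length):
    """Reading frame of an exon given 1-based inclusive coordinates.

    Assumed convention: frames 1..3 are counted from the first base of the
    contig, frames -1..-3 from the last base (as blastx reports them).
    `phase` is the GFF phase: bases to skip before the first full codon.
    """
    if strand == "+":
        first_codon_base = start + phase
        return (first_codon_base - 1) % 3 + 1
    # Reverse strand: the exon is read from `end` towards `start`.
    first_codon_base = end - phase
    return -((contig_length - first_codon_base) % 3 + 1)


def inconsistent_hits(frame, exon_start, exon_end, hits):
    """Return the hits that overlap the exon but lie in a different frame.

    Each hit is a hypothetical (start, end, frame) tuple in contig coordinates.
    """
    return [
        hit for hit in hits
        if hit[0] <= exon_end and hit[1] >= exon_start and hit[2] != frame
    ]


# Made-up exon and hits, not taken from NODE_4.
frame = exon_frame(100, 200, "+", 0, contig_length=1000)
hits = [(110, 150, 1), (120, 180, 2), (300, 400, 3)]
suspicious = inconsistent_hits(frame, 100, 200, hits)
```

Here the second hit overlaps the exon in a different frame and would be reported, while the third hit lies outside the exon and is ignored.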

Testing

Plot-hits was used to check annotations of the de novo genome assembly of the eukaryotic algal parasite Amoeboaphelidium protococcarum. A number of missed gene annotations and misannotations were found after the first round of gene annotation by Augustus, which allowed the final gene annotations to be improved.

How it was done (coming soon...)

Requirements

Alternatively, the Python Anaconda package may be installed. The package includes both libraries.

No installation program is needed.

Input data formats

Three parameters are required when calling the program:

Usage of the program plot-hits-matplotlib

To speed up processing, it is recommended to first select the lines with the target contig names from large input files into smaller files, using grep or a similar program.
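
If grep is not available, the same pre-selection can be done with a few lines of Python. This is a minimal sketch, not part of plot-hits: the function name is hypothetical, and it assumes that the input files are tab-separated text in which the contig name occupies a whole field. Unlike a plain substring search, matching whole fields avoids selecting NODE_40 when NODE_4 is requested.

```python
def select_contig_lines(lines, contig):
    """Yield only the lines in which one tab-separated field equals `contig`.

    Assumption: the input is tab-separated text with the contig name
    stored as a complete field.
    """
    for line in lines:
        if contig in line.rstrip("\n").split("\t"):
            yield line


# Made-up lines standing in for a large annotation or blast output file.
sample = [
    "NODE_4\tgene\t3100\t4200\n",
    "NODE_40\tgene\t10\t900\n",
    "query7\tNODE_4\t95.1\n",
]
selected = list(select_contig_lines(sample, "NODE_4"))
```

For a real file, the same generator can be fed an open file object and its output passed to writelines() on the destination file, so the large input is never loaded into memory as a whole.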

....