
Pipeline that calls structural variants, producing VCF files, then merges the calls and annotates whether each one falls within an exon.






The structural variation pipeline calls structural variants from Illumina paired-end whole-genome mouse reads, relative to GRCm38 (mm10). The pipeline maps reads to the reference and then passes the mapped reads to four structural variant callers: Breakdancer, Lumpy, Delly, and Manta. Calls from the four callers are then merged with Survivor.

Calls are merged by type: calls of the same type are combined when they fall within a +/- 1000 bp buffer around each other. Each merged call records the number of callers that support it.
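The merging rule above can be illustrated with a minimal Python sketch. This is a simplified stand-in, not Survivor's actual algorithm: the `Call` and `MergedCall` classes, the `merge_calls` function, and the choice to compare both breakpoints against the first call in each group are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """One structural variant call from a single caller (hypothetical record)."""
    chrom: str
    start: int
    end: int
    svtype: str
    caller: str


@dataclass
class MergedCall:
    """A group of same-type calls; coordinates come from the first call seen."""
    chrom: str
    start: int
    end: int
    svtype: str
    callers: set = field(default_factory=set)

    @property
    def support(self):
        # Number of distinct callers that reported this variant.
        return len(self.callers)


def merge_calls(calls, buffer_bp=1000):
    """Group calls of the same type whose breakpoints lie within buffer_bp."""
    merged = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start, c.end)):
        for group in merged:
            if (group.chrom == call.chrom
                    and group.svtype == call.svtype
                    and abs(call.start - group.start) <= buffer_bp
                    and abs(call.end - group.end) <= buffer_bp):
                group.callers.add(call.caller)
                break
        else:
            # No existing group matched, so this call starts a new one.
            merged.append(MergedCall(call.chrom, call.start, call.end,
                                     call.svtype, {call.caller}))
    return merged


calls = [
    Call("chr1", 1000, 2000, "DEL", "manta"),
    Call("chr1", 1200, 2100, "DEL", "delly"),
    Call("chr1", 1100, 2050, "DUP", "lumpy"),
    Call("chr1", 5000, 6000, "DEL", "lumpy"),
]
for m in merge_calls(calls):
    print(m.chrom, m.start, m.end, m.svtype, m.support)
```

In this example the two nearby deletions collapse into one merged call supported by two callers, while the duplication stays separate because calls are only merged within a type.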

Finally, calls are annotated to indicate whether they fall within a defined exon. Exon boundaries were extracted with the R package annotatr, which uses the TxDb.Mmusculus.UCSC.mm10.knownGene resource.

Annotations in that package were drawn from UCSC resources on 2019-10-21 20:52:26 +0000 (Mon, 21 Oct 2019) and are based on the knownGene table for the mm10 genome.
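As a rough illustration of the exon annotation step, the Python sketch below flags a call when its interval overlaps an exon. The function names, the exon coordinates, and the choice to treat any overlap between closed intervals as "within an exon" are assumptions; the pipeline's own criterion and data format may differ.

```python
from collections import defaultdict


def build_exon_index(exons):
    """Group (chrom, start, end) exon tuples by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end in exons:
        index[chrom].append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def in_exon(index, chrom, start, end):
    """Return True if the closed interval [start, end] overlaps any exon on chrom."""
    return any(e_start <= end and start <= e_end
               for e_start, e_end in index.get(chrom, []))


# Hypothetical exon coordinates, for illustration only.
exons = [("chr1", 3000, 3500), ("chr1", 9000, 9400)]
index = build_exon_index(exons)

print(in_exon(index, "chr1", 3400, 4200))  # overlaps the first exon
print(in_exon(index, "chr1", 4000, 5000))  # falls between the two exons
```

A linear scan per chromosome keeps the sketch short; for genome-scale exon sets an interval tree or a sorted-start binary search would be a more practical choice.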

Structural variants are classified into the following types:

  • INS – Insertion
  • INV – Inversion
  • DEL – Deletion
  • DUP – Duplication
  • TRA – Translocation
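The five types listed above could be represented in Python as a small enumeration. The sketch assumes the merged VCF carries the type in an `SVTYPE` key of the INFO column, which is conventional for structural variant VCFs but is not confirmed here; `SVType` and `svtype_from_info` are hypothetical names.

```python
from enum import Enum


class SVType(Enum):
    INS = "Insertion"
    INV = "Inversion"
    DEL = "Deletion"
    DUP = "Duplication"
    TRA = "Translocation"


def svtype_from_info(info):
    """Return the SVType named by SVTYPE in a VCF INFO string, or None."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "SVTYPE" and value in SVType.__members__:
            return SVType[value]
    # SVTYPE is missing or is not one of the five types listed above.
    return None


sv = svtype_from_info("SVTYPE=DEL;END=2000")
print(sv.name, sv.value)
```

Returning `None` for unrecognised values leaves it to the caller to decide how to handle types outside the five classes.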









