Young edited this page Mar 13, 2023 · 10 revisions

Welcome to the wiki for Donut Falls!

All "good" bioinformatic tools and workflows attempt to solve a problem. The problem we ran into was that nf-core's genomeassembler workflow was not yet complete, and we needed a simple workflow to assemble Nanopore sequencing reads, with or without corresponding Illumina reads, for downstream analyses. This workflow is adequate for most of our day-to-day assembly needs, but we also needed a workflow that pulled together most of the steps of Trycycler.

Nanopore sequence processing is an actively developing field, so tools were chosen for their acceptance in the field and drawn from the tutorials written by Dr. Ryan Wick in the Trycycler wiki and the Perfect bacterial genome tutorial wiki.

This workflow is intended to be lightweight, so the scripts in 'bin' are used only locally by UPHL to create a sample sheet and are not used by the workflow itself. As a result, Donut Falls can be used as a subworkflow of another Nextflow workflow, which may be useful to those who want to add basecalling with Guppy before this workflow and/or extra rounds of polishing after it. More information and some basic instructions can be found on the Linking wiki page.

The generated consensus files can then be used in multiple applications, including phylogenetic analysis with Grandeur or submission to NCBI via the genome submissions portal.

This wiki will cover the rationale and steps of this workflow.

Basic diagram of the workflow and subworkflows

More detailed subworkflow diagrams and corresponding parameter explanations can be found in the subworkflow wiki pages.

Nanopore-only workflow

The Nanopore-only workflow is the goal for Nanopore sequencing and has been a successful technique for multiple applications.

---
Donut Falls
---
flowchart LR

A[nanopore fastq] --> B[filter]
B[filter] --> C[assembly]
C[assembly] --> D[polish]
D --> consensus

Nanopore sequencing with polishing

This subworkflow is generally recommended for samples that have both Nanopore and Illumina reads. Reads generated from Nanopore sequencing undergo de novo assembly, followed by a "polish" with the higher-quality Illumina reads to create a consensus.

---
Donut Falls
---
flowchart LR

E[nanopore fastq] --> F[filter]
G[illumina fastq] --> H[fastp]
F --> I[assembly]
I --> J[polish]
H --> J
J --> consensus

Hybrid assembly

Nanopore sequencing became popular at a time when it was not as accurate as Illumina sequencing. As such, hybrid assemblers were developed that assemble Illumina data and add Nanopore data to fill in gaps. These tools still work and are still popular, so they are included here.

---
Donut Falls
---
flowchart LR
K[nanopore fastq] --> L[assembly]
M[illumina fastq] --> L
L --> consensus
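
The three diagrams above can be summarized as a small lookup of inputs and steps. The following is only an illustrative sketch in Python: the route names and the `candidate_routes` helper are hypothetical and are not Donut Falls parameters or code. It shows which of the three routes are possible given the read types available for a sample.

```python
# Hypothetical summary of the three routes drawn in the diagrams above.
# The route names are illustrative only; they are NOT Donut Falls parameters.
ROUTES = {
    "nanopore_only": {
        "inputs": ["nanopore fastq"],
        "steps": ["filter", "assembly", "polish"],
    },
    "nanopore_with_illumina_polish": {
        "inputs": ["nanopore fastq", "illumina fastq"],
        # fastp cleans the Illumina reads before they are used for polishing
        "steps": ["filter", "fastp", "assembly", "polish"],
    },
    "hybrid": {
        "inputs": ["nanopore fastq", "illumina fastq"],
        "steps": ["assembly"],
    },
}


def candidate_routes(has_nanopore, has_illumina):
    """Return the names of routes whose required inputs are all available."""
    available = set()
    if has_nanopore:
        available.add("nanopore fastq")
    if has_illumina:
        available.add("illumina fastq")
    return [
        name
        for name, route in ROUTES.items()
        if set(route["inputs"]) <= available
    ]


if __name__ == "__main__":
    for name in candidate_routes(has_nanopore=True, has_illumina=True):
        print(name, "->", " -> ".join(ROUTES[name]["steps"]), "-> consensus")
```

Every route requires Nanopore reads, so a sample with only Illumina reads matches none of them.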