Freyja-SRA

Automated SRA downloading, processing and Freyja analysis pipeline.

Installation

Local Install via Git

git clone https://github.com/dylanpilz/Freyja-SRA.git
cd Freyja-SRA

Usage

nextflow run main.nf -entry [sra|rerun_demix] -profile [docker|singularity] --accession_list [accession_list.csv] --output_dir [output_dir] --num_samples [num_samples]
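The bracketed placeholders in the command above can also be filled in programmatically. The following is a minimal Python sketch, not part of the pipeline, that assembles the argument list for either entry point. The helper name build_command and the example values are hypothetical, and the sketch assumes --variants_dir is passed on the command line like the other parameters. The command is only printed, not executed.

```python
import shlex


def build_command(entry, profile, output_dir="./outputs", **params):
    """Assemble a `nextflow run main.nf` argument list (hypothetical helper)."""
    if entry not in ("sra", "rerun_demix"):
        raise ValueError(f"unknown entry point: {entry}")
    if profile not in ("docker", "singularity"):
        raise ValueError(f"unknown profile: {profile}")
    cmd = ["nextflow", "run", "main.nf", "-entry", entry, "-profile", profile,
           "--output_dir", output_dir]
    # Pipeline parameters take a double dash, e.g. --accession_list, --num_samples.
    for name, value in params.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Example values only; substitute your own file names and sample count.
sra_cmd = build_command("sra", "docker",
                        accession_list="accession_list.csv", num_samples=50)
print(shlex.join(sra_cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so paths containing spaces remain intact if the list is later passed to subprocess.run.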

Parameters

-entry - The pipeline entry point.
- sra will download, process and run Freyja on the provided SRA accessions.
  - --accession_list - A CSV file containing a list of SRA accessions to download and process. The CSV file should have a header row and the first column should be named accession.
- rerun_demix reruns the freyja demix step on previously generated variants output files in the provided variants directory. This is useful if you want to rerun Freyja on existing data with a different barcode set.
  - --variants_dir - The directory of existing variants output. It must contain [base_name].variants.tsv and [base_name].depths.tsv for each sample.
- --output_dir - The final output directory. Creates variants, demix, and covariants subdirectories containing respective output files. (default: ./outputs)
- --num_samples - The number of samples to process. (default: 200)
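The input requirements above (an accession CSV whose first column is named accession, and paired variants/depths files for rerun_demix) can be checked before launching a run. The following is a minimal Python sketch under those stated requirements. The function names are hypothetical and the accession IDs are made-up placeholders. It is not a check the pipeline itself is known to perform.

```python
import csv
import tempfile
from pathlib import Path


def read_accessions(csv_path):
    """Return accessions from a CSV whose first column is named 'accession'."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if not header or header[0].strip() != "accession":
            raise ValueError("first column of the header row must be 'accession'")
        return [row[0].strip() for row in reader if row and row[0].strip()]


def unpaired_samples(variants_dir):
    """Return base names that lack either the variants or the depths file."""
    root = Path(variants_dir)
    variants = {p.name[: -len(".variants.tsv")] for p in root.glob("*.variants.tsv")}
    depths = {p.name[: -len(".depths.tsv")] for p in root.glob("*.depths.tsv")}
    # Symmetric difference: names present in one set but not the other.
    return sorted(variants ^ depths)


# Demonstration on throwaway files; the accession IDs are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "accession_list.csv").write_text("accession\nSRR0000001\nSRR0000002\n")
    for name in ("SRR0000001.variants.tsv", "SRR0000001.depths.tsv",
                 "SRR0000002.variants.tsv"):
        (tmp / name).touch()
    accessions = read_accessions(tmp / "accession_list.csv")
    missing = unpaired_samples(tmp)

print(accessions, missing)
```

In this demonstration the second placeholder sample has a variants file but no depths file, so it is the one reported as unpaired.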

Configuration

Additional configuration options can be found in nextflow.config.

Data Availability

Freyja-SRA is currently downloading and processing all publicly available SRA data that matches the following search terms:

(Wastewater[All Fields] OR wastewater metagenome[All Fields]) AND ("Severe acute respiratory syndrome coronavirus 2"[Organism] OR SARS-CoV-2[All Fields])

In addition to applying the search terms above, we exclude accessions with any of the following metadata problems:

Missing collection date
Missing catchment size (ww_population)
Missing location (geo_loc_name)
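The exclusion rules above amount to a simple row filter. The following is a minimal Python sketch of that filter, not the pipeline's own code. The field names ww_population and geo_loc_name come from the list above, while collection_date is an assumed column name. Treating an empty string as "missing" is also an assumption, and the rows are invented for illustration.

```python
import csv
import io

# `ww_population` and `geo_loc_name` are named in the text above;
# `collection_date` is an assumed column name for the collection date.
REQUIRED_FIELDS = ("collection_date", "ww_population", "geo_loc_name")


def has_required_metadata(row):
    """True if every required field is present and non-empty."""
    return all((row.get(field) or "").strip() for field in REQUIRED_FIELDS)


# Made-up rows for illustration only.
sample = io.StringIO(
    "accession,collection_date,ww_population,geo_loc_name\n"
    "SRR0000001,2023-01-15,250000,USA: California\n"
    "SRR0000002,,120000,USA: Texas\n"
    "SRR0000003,2023-02-01,,USA: Ohio\n"
)
kept = [row["accession"] for row in csv.DictReader(sample)
        if has_required_metadata(row)]
print(kept)
```

Only the first placeholder row passes, since the other two lack a collection date and a catchment size respectively.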

To check the status of each accession, refer to the sample_status column in data/all_metadata.csv. All currently processed Freyja outputs are publicly available via Google Cloud Storage (gs://outbreak-ww-data).
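As a rough illustration of checking the sample_status column, the following Python sketch tallies how many accessions carry each status. The function name is hypothetical, and the status values and accession column in the sample data are invented, since the actual values used in data/all_metadata.csv are not listed here.

```python
import csv
import io
from collections import Counter


def count_statuses(handle):
    """Count the values in the sample_status column of a metadata CSV."""
    return Counter(row["sample_status"] for row in csv.DictReader(handle))


# Stand-in for open("data/all_metadata.csv", newline="").
# The status values and accession IDs below are invented.
sample = io.StringIO(
    "accession,sample_status\n"
    "SRR0000001,done\n"
    "SRR0000002,pending\n"
    "SRR0000003,done\n"
)
status_counts = count_statuses(sample)
print(status_counts.most_common())
```

To run this against the real file, pass an open file handle for data/all_metadata.csv in place of the in-memory sample.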
