Remove gtex #299

Merged
merged 6 commits into from
Feb 10, 2022
16 changes: 10 additions & 6 deletions docs/usage.md
@@ -103,13 +103,13 @@ Input files:
(default: no rmats_pairs specified)
--run_name User specified name used as prefix for output files
(defaut: no prefix, only date and time)
- --download_from Database to download FASTQ/BAMs from (available = 'TCGA', 'GTEX' or 'GEN3-DRS',
- 'SRA', 'FTP') (string)
+ --download_from Database to download FASTQ/BAMs from (available = 'TCGA', 'GTEX', 'SRA', 'FTP')
+ (string)
false should be used to run local files on the HPC (Sumner).
'TCGA' can also be used to download GDC data including HCMI data.
(default: false)
- --key_file For downloading reads, use TCGA authentication token (TCGA) or dbGAP repository
- key (GTEx, path) or credentials.json file in case of 'GEN3-DRS'
+ --key_file For downloading reads, use TCGA authentication token (TCGA) or
+ credentials.json file in case of 'GTEX'.
(default: false)

Main arguments:
@@ -246,7 +246,11 @@ Some useful ones include (specified in main.pbs):
- `-with-trace` eg `-with-trace trace.txt` which gives a [trace report](https://www.nextflow.io/docs/latest/tracing.html?highlight=dag#trace-report) for resource consumption by the pipeline
- `-with-dag` eg `-with-dag flowchart.png` which produces the [DAG visualisation](https://www.nextflow.io/docs/latest/tracing.html?highlight=dag#dag-visualisation) graph showing each of the different processes and the connections between them (the channels)

- ## Run with data from AnviL Gen3-DRS
+ ## Run with GTEX data from AnviL Gen3-DRS
+ You can run pipeline on GTEX data otained directly from Gen3-DRS if you specify input option:
+ ```
+ --download_from 'GTEX'
+ ```

You will be needing two things from - https://gen3.theanvil.io/

@@ -262,7 +266,7 @@ in2csv manifest.json > manifest.csv

NOTE: Make sure the `manifest.csv` file have five columns, Check from [examples](../examples/gen3/)

- Downloaded `credentials.json` file can be provided in `--key` param.
+ Downloaded `credentials.json` file can be provided in `--key_file` param.

NOTE: Make sure `credentials.json` is a latest one. They have expiry dates when you download.

22 changes: 11 additions & 11 deletions main.nf
@@ -37,13 +37,13 @@ def helpMessage() {
(default: no rmats_pairs specified)
--run_name User specified name used as prefix for output files
(defaut: no prefix, only date and time)
- --download_from Database to download FASTQ/BAMs from (available = 'TCGA', 'GTEX' or 'GEN3-DRS',
- 'SRA', 'FTP') (string)
+ --download_from Database to download FASTQ/BAMs from (available = 'TCGA', 'GTEX', 'SRA', 'FTP')
+ (string)
false should be used to run local files on the HPC (Sumner).
'TCGA' can also be used to download GDC data including HCMI data.
(default: false)
- --key_file For downloading reads, use TCGA authentication token (TCGA) or dbGAP repository
- key (GTEx, path) or credentials.json file in case of 'GEN3-DRS'
+ --key_file For downloading reads, use TCGA authentication token (TCGA) or
+ credentials.json file in case of 'GTEX'.
(default: false)

Main arguments:
@@ -268,15 +268,15 @@ log.info "\n"
---------------------------------------------------*/

if (params.download_from) {
- if(download_from('gtex') || download_from('sra') || download_from('tcga') ){
+ if( download_from('sra') || download_from('tcga') ){
Channel
.fromPath(params.reads)
.ifEmpty { exit 1, "Cannot find CSV reads file : ${params.reads}" }
.splitCsv(skip:1)
.map { sample -> sample[0].trim() }
.set { accession_ids }
}
- if(download_from('gen3-drs')){
+ if(download_from('gtex')){
Channel
.fromPath(params.reads)
.ifEmpty { exit 1, "Cannot find CSV reads file : ${params.reads}" }
@@ -354,7 +354,7 @@ if (params.rmats_pairs) {
.set { samples}
}

- if ( download_from('gen3-drs')) {
+ if ( download_from('gtex')) {
// The fasta obligatory requirement below is removed, because for the foreseeable future GTEX transcriptomic data will be only accessed as bam files, which do not require a fasta file, as CRAM files.
//if(!params.genome_fasta){
//exit 1, "A genome fasta file must be provided in order to convert CRAM files in GEN3-DRS download step."
@@ -372,10 +372,10 @@ if ( download_from('sra')) {


/*--------------------------------------------------
- Download FASTQs from GTEx or SRA
+ Download FASTQs from SRA
---------------------------------------------------*/

- if ( download_from('gtex') || download_from('sra') ) {
+ if ( download_from('sra') ) {
process get_accession {
publishDir "${params.outdir}/process-logs/${task.process}/${accession}/", pattern: "command-logs-*", mode: 'copy'

@@ -464,7 +464,7 @@ if ( download_from('ftp') ) {
Download BAMs from GTEx using GEN3_DRS
---------------------------------------------------*/

- if ( download_from('gen3-drs')) {
+ if ( download_from('gtex')) {
process gen3_drs_fasp {
tag "${file_name}"
label 'low_memory'
@@ -559,7 +559,7 @@ if (download_from('tcga')) {
Bedtools to extract FASTQ from BAM
---------------------------------------------------*/

- if (download_from('tcga') || download_from('gen3-drs')) {
+ if (download_from('tcga') || download_from('gtex')) {
process bamtofastq {
tag "${name}"
label 'mid_memory'