Merge pull request #40 from CDCgov/dev
Dev
sateeshperi authored Apr 1, 2022
2 parents 581e7af + f3d5a80 commit 5d4bb50
Showing 7 changed files with 95 additions and 20 deletions.
28 changes: 27 additions & 1 deletion CHANGELOG.md
@@ -3,6 +3,32 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.1 Candid Aura - [04/01/2022]

### `Added`

### `Fixed`

* Fixed bug in `modules/local/lane_merge.nf` that was causing samplesheet CSV file to not recognize R2 (closes #39)
* Formatting of `docs/output.md`
* Changed output file `combined/vcf-to-fasta/combined_vcf-to-fasta.fasta` -> `combined/vcf-to-fasta/vcf-to-fasta.fasta`
* Output file `combined/vcf-to-fasta/vcf-to-fasta.fasta` will now replace stars `*` with dashes `-`
* Output file `combined/phylogeny/rapidnj/rapidnj_phylogeny.tre` -> `combined/phylogeny/rapidnj/rapidnj_phylogeny.nh`
* Output file `combined/phylogeny/iqtree/vcf-to-fasta.fasta.treefile` -> `combined/phylogeny/iqtree/iqtree_phylogeny.nh`
* Output file `combined/phylogeny/raxmlng/output.raxml.bestTree` -> `combined/phylogeny/raxmlng/raxmlng_bestTree.nh`
* Output file `combined/phylogeny/raxmlng/output.raxml.support` -> `combined/phylogeny/raxmlng/raxmlng_support.nh`
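
Downstream scripts that still point at the v1.0 file names may need updating. A minimal sketch of a lookup table built from the renames listed above; `RENAMED_OUTPUTS` and `migrate_output_path` are hypothetical names used only for illustration and are not part of the pipeline:

```python
# Old -> new output paths, copied from the v1.1 changelog entries above.
# The dictionary and the helper below are illustrative, not pipeline code.
RENAMED_OUTPUTS = {
    "combined/vcf-to-fasta/combined_vcf-to-fasta.fasta":
        "combined/vcf-to-fasta/vcf-to-fasta.fasta",
    "combined/phylogeny/rapidnj/rapidnj_phylogeny.tre":
        "combined/phylogeny/rapidnj/rapidnj_phylogeny.nh",
    "combined/phylogeny/iqtree/vcf-to-fasta.fasta.treefile":
        "combined/phylogeny/iqtree/iqtree_phylogeny.nh",
    "combined/phylogeny/raxmlng/output.raxml.bestTree":
        "combined/phylogeny/raxmlng/raxmlng_bestTree.nh",
    "combined/phylogeny/raxmlng/output.raxml.support":
        "combined/phylogeny/raxmlng/raxmlng_support.nh",
}


def migrate_output_path(path: str) -> str:
    """Return the v1.1 location for a v1.0 output path.

    Paths that were not renamed are returned unchanged.
    """
    return RENAMED_OUTPUTS.get(path, path)
```

Paths are relative to the results directory, as in the changelog entries.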

### `Dependencies`

### `Deprecated`

* `/results/qc` output dir removed

### `TODO`

* Continue improving output docs

---
## v1.0 Espresso Myconaut - [03/25/2022]

Initial release of CDCgov/mycosnp-nf, created with the [nf-core](https://nf-co.re/) template.
@@ -32,4 +58,4 @@ Initial release of CDCgov/mycosnp-nf, created with the [nf-core](https://nf-co.r
### `TODO`

* Intermediate file cleanup and management
* Update logo and metro-style workflow
* Update logo and metro-style workflow
3 changes: 1 addition & 2 deletions README.md
@@ -115,15 +115,14 @@ nf-core/mycosnp was originally written by CDC.

We thank the following people for their extensive assistance in the development of this pipeline:

<!-- TODO nf-core: If applicable, make list of people who have also contributed -->

* Michael Cipriano [@mjcipriano](https://github.com/mjcipriano)
* Sateesh Peri [@sateeshperi](https://github.com/sateeshperi)
* Hunter Seabolt [@hseabolt](https://github.com/hseabolt)
* Chris Sandlin [@cssandlin](https://github.com/cssandlin)
* Drewry Morris [@drewry](https://github.com/drewry)
* Lynn Dotrang [@leuthrasp](https://github.com/LeuThrAsp)
* Christopher Jossart [@cjjossart](https://github.com/cjjossart)
* Robert A. Petit III [@rpetit3](https://github.com/rpetit3)
## Contributions and Support

If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
5 changes: 2 additions & 3 deletions bin/qc_report_stats.py
@@ -25,10 +25,9 @@
# Trim newline
line = line.rstrip()
if line.startswith(">"):
# If we captured one before, print it now
# Skip lines with ">"
if header is not None:
print(header, length)
length = 0
continue
header = line[1:]
else:
length += len(line)
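
The added and removed markers are not visible in this hunk, so the exact post-commit code is not certain. A minimal sketch of what the new `# Skip lines with ">"` comment and `continue` suggest: header lines are skipped and the lengths of all sequence lines are summed into one total. The function name `total_sequence_length` and the sample FASTA text are assumptions for illustration:

```python
import io


def total_sequence_length(handle) -> int:
    """Sum the lengths of all sequence lines in a FASTA stream.

    Header lines (starting with ">") are skipped rather than used to
    split the count per record.
    """
    length = 0
    for line in handle:
        # Trim newline, as the original script does
        line = line.rstrip()
        if line.startswith(">"):
            # Skip lines with ">"
            continue
        length += len(line)
    return length


# Two records: 4 + 2 bases in the first, 3 in the second.
example = io.StringIO(">chr1\nACGT\nAC\n>chr2\nGGG\n")
print(total_sequence_length(example))  # prints 9
```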
27 changes: 21 additions & 6 deletions conf/modules.config
@@ -200,6 +200,16 @@ process {
pattern: "*.{fastq.gz,txt}"
]
}
withName: 'QC_REPORT' {
ext.args = { "" }
ext.when = { }
publishDir = [
enabled: "${params.save_alignment}",
mode: "${params.publish_dir_mode}",
path: { "${params.outdir}/samples/${meta.id}/qc_report" },
pattern: "*.{txt}"
]
}
withName: 'BWA_MEM' {
ext.args = { "" }
ext.when = { }
@@ -420,7 +430,7 @@ process {
withName: 'VCF_TO_FASTA' {
ext.when = { }
publishDir = [
enabled: true,
enabled: false,
mode: "${params.publish_dir_mode}",
path: { "${params.outdir}/combined/vcf-to-fasta" },
pattern: "*{fasta}"
@@ -455,12 +465,13 @@ process {
ext.args = { "-s -p '\\*' -r '-'" }
ext.suffix = { "fasta" }
ext.errorStrategy = { "ignore" }
ext.prefix = { "vcf-to-fasta" }
ext.when = { }
publishDir = [
enabled: false,
enabled: true,
mode: "${params.publish_dir_mode}",
path: { "${params.outdir}/combined/phylogeny/fasta" },
pattern: "*"
path: { "${params.outdir}/combined/vcf-to-fasta" },
pattern: "vcf-to-fasta.fasta"
]
}
withName: 'RAPIDNJ' {
@@ -469,8 +480,9 @@ process {
ext.when = { }
publishDir = [
enabled: true,
mode: "${params.publish_dir_mode}",
path: { "${params.outdir}/combined/phylogeny/rapidnj" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.endsWith(".tre") ? "rapidnj_phylogeny.nh" : filename },
path: { "${params.outdir}/combined/phylogeny/rapidnj" },
pattern: "*"
]
}
@@ -481,6 +493,7 @@ process {
publishDir = [
enabled: true,
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.endsWith(".tre") ? "fasttree_phylogeny.nh" : filename },
path: { "${params.outdir}/combined/phylogeny/fasttree" },
pattern: "*"
]
@@ -492,6 +505,7 @@ process {
publishDir = [
enabled: true,
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.endsWith(".treefile") ? "iqtree_phylogeny.nh" : filename },
path: { "${params.outdir}/combined/phylogeny/iqtree" },
pattern: "*"
]
@@ -503,6 +517,7 @@ process {
publishDir = [
enabled: true,
mode: "${params.publish_dir_mode}",
saveAs: { filename -> if( filename.endsWith(".bestTree")) { return "raxmlng_bestTree.nh" } else if ( filename.endsWith(".support") ) { return "raxmlng_support.nh" } else { return filename } },
path: { "${params.outdir}/combined/phylogeny/raxmlng" },
pattern: "*"
]
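
The `saveAs` closures added in these hunks all follow one pattern: if the file name ends with a known tree suffix, publish it under a fixed `.nh` name, otherwise keep the original name. A Python restatement of that logic, for illustration only; the `SAVE_AS_RULES` table and `save_as` function are hypothetical, and the keys are the output directory names from the `path` entries rather than the process names:

```python
# Suffix -> published name, taken from the saveAs closures in the hunks above.
SAVE_AS_RULES = {
    "rapidnj": {".tre": "rapidnj_phylogeny.nh"},
    "fasttree": {".tre": "fasttree_phylogeny.nh"},
    "iqtree": {".treefile": "iqtree_phylogeny.nh"},
    "raxmlng": {
        ".bestTree": "raxmlng_bestTree.nh",
        ".support": "raxmlng_support.nh",
    },
}


def save_as(tool: str, filename: str) -> str:
    """Mirror the Groovy closures: rename matching files, pass others through."""
    for suffix, published_name in SAVE_AS_RULES[tool].items():
        if filename.endswith(suffix):
            return published_name
    return filename
```

Because every block keeps `pattern: "*"`, files that match no suffix are still published under their original names.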
48 changes: 42 additions & 6 deletions docs/output.md
@@ -18,23 +18,59 @@ results/
├── input
├── multiqc
├── pipeline_info
├── qc
├── reference
├── samples
└── stats
```

## Pipeline overview
## Pipeline Overview

The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:

- [CDCgov/mycosnp: Output](#cdcgov-mycosnp-output)
- [Introduction](#introduction)
- [Pipeline Overview](#pipeline-overview)
- [BWA Reference](#bwa-reference)
- [Reference Preparation](#reference-preparation)
- [BWA Pre-process](#bwa-pre-process)
- [Sample QC and Processing](#sample-qc-and-processing)
- [GATK Variants](#gatk-variants)
- [Variant calling and analysis](#variant-calling-and-analysis)
- [Summary Files](#summary-files)
- [FastQC](#fastqc)
- [QC report](#qc-report)
- [MultiQC](#multiqc)
- [Pipeline Information](#pipeline-information)

## BWA Reference

### Reference Preparation

<details markdown="1">
<summary>Output files</summary>

* `reference/bwa/bwa`
* `reference.amb`
* `reference.ann`
* `reference.bwt`
* `reference.pac`
* `reference.sa`
* `reference/dict`
* `reference.dict`
* `reference/fai`
* `reference.fa.fai`
* `reference/masked`
* `reference.fa`

</details>

> **Prepares a reference FASTA file for BWA alignment and GATK variant calling by masking repeats in the reference and generating the BWA index.**
* Genome repeat identification and masking (`nucmer`)
* BWA index generation (`bwa`)
* FAI and DICT file creation (`Picard`, `Samtools`)

## BWA Pre-process

### Sample QC and Processing

> **Prepares samples (paired-end FASTQ files) for GATK variant calling by aligning the samples to a BWA reference index and ensuring that the BAM files are correctly formatted. This step also provides different quality reports for sample evaluation.**
@@ -64,9 +100,11 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
| QC Report | `(stats/qc_report)` |
| MultiQC | `(multiqc/)` |

## GATK Variants

### Variant calling and analysis

> **Calls variants and generates a multi-FASTA file and phylogeny.**
> **Calls variants, generates a multi-FASTA file, and creates phylogeny.**
* Call variants (`GATK HaplotypeCaller`).
* Combine gVCF files from the HaplotypeCaller into a single VCF (`GATK CombineGVCFs`).
Expand All @@ -93,9 +131,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
| Phylogeny files | `(combined/phylogeny/)` |


### Summary Files

* [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
## Summary Files

### FastQC

2 changes: 1 addition & 1 deletion modules/local/lane_merge.nf
@@ -26,7 +26,7 @@ process LANE_MERGE {
gzip -c ${reads[0]} > combined/${meta.id}.fastq.gz
elif [[ $numReads == 2 ]]; then
gzip -c ${reads[0]} > combined/${meta.id}_R1.fastq.gz
gzip -c ${reads[1]} > combined/${meta.id}_R2.$fileEnding
gzip -c ${reads[1]} > combined/${meta.id}_R2.fastq.gz
elif [[ $numReads == 4 ]]; then
gzip -c ${reads[0]} ${reads[2]} > combined/${meta.id}_R1.fastq.gz
gzip -c ${reads[1]} ${reads[3]} > combined/${meta.id}_R2.fastq.gz
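
The branches above map input read files to merged outputs by read count. A minimal Python sketch of that mapping, assuming the first (not shown) branch handles a single read file; `plan_lane_merge` is a hypothetical helper, and raising an error for other counts is an assumption because the fallback branch is outside this hunk:

```python
from typing import Dict, List


def plan_lane_merge(sample_id: str, reads: List[str]) -> Dict[str, List[str]]:
    """Return {output file: [input files to concatenate]} for one sample."""
    prefix = f"combined/{sample_id}"
    if len(reads) == 1:
        # Single-end: one output without an R1/R2 suffix
        return {f"{prefix}.fastq.gz": [reads[0]]}
    if len(reads) == 2:
        # Paired-end, one lane: R2 now always ends in .fastq.gz
        return {
            f"{prefix}_R1.fastq.gz": [reads[0]],
            f"{prefix}_R2.fastq.gz": [reads[1]],
        }
    if len(reads) == 4:
        # Paired-end, two lanes: files are interleaved as R1, R2, R1, R2
        return {
            f"{prefix}_R1.fastq.gz": [reads[0], reads[2]],
            f"{prefix}_R2.fastq.gz": [reads[1], reads[3]],
        }
    # Assumption: the real script's handling of other counts is not shown here
    raise ValueError(f"unsupported number of read files: {len(reads)}")
```

With two read files, both outputs end in `.fastq.gz`, which is the naming the one-line fix in this hunk enforces for R2.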
2 changes: 1 addition & 1 deletion nextflow.config
@@ -197,7 +197,7 @@ manifest {
description = 'MycoSNP is a portable workflow for performing whole genome sequencing analysis of fungal organisms, including Candida auris. This method prepares the reference, performs quality control, and calls variants using a reference. MycoSNP generates several output files that are compatible with downstream analytic tools, such as those for used for phylogenetic tree-building and gene variant annotations.'
mainScript = 'main.nf'
nextflowVersion = '!>=21.10.3'
version = 'v1.0'
version = 'v1.1'
}

// Load modules.config for DSL2 module specific options