v2.3.0

@sage-wright sage-wright released this 19 Dec 20:43
f81fdb1

Public Health Bioinformatics v2.3.0 Minor Release

This minor release adds two new workflows, Fetch_SRR_Accession_PHB and Concatenate_Illumina_Lanes_PHB, and makes significant improvements to the TheiaCoV, TheiaEuk, TheiaProk, and TheiaMeta workflow series. Documentation updates and various bug fixes have also been implemented.

Full release notes can be found here!

Find our documentation here!

🆕 New workflows

  • Concatenate_Illumina_Lanes_PHB

    • Some Illumina sequencing platforms split the FASTQ files for a single sample across multiple lanes. This workflow combines those multi-lane FASTQ files into a single read1 file and a single read2 file per sample, so that downstream workflows such as assembly or variant calling receive consolidated FASTQ files.
    • This workflow is designed to run automatically at the start of the TheiaProk workflow if multi-lane FASTQ files are provided (e.g., read1_lane2.fastq.gz and read2_lane2.fastq.gz)
    • Import this workflow from Dockstore
  • Fetch_SRR_Accession_PHB

    • This workflow will retrieve any Sequence Read Archive (SRA) accessions (SRR) associated with a given sample accession, such as a BioSample ID (e.g., "SAMN00000000") or SRA Experiment ID (e.g., "SRX000000").
      • This process utilizes the fastq-dl tool to fetch metadata from SRA and outputs the corresponding SRR accession(s).
      • If multiple SRR accessions are linked to a single sample, the workflow will output them as a comma-separated list.
    • This workflow is particularly useful for retrieving SRR accessions a few days after running Terra_2_NCBI workflows.
    • Import this workflow from Dockstore
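
To illustrate the idea behind Concatenate_Illumina_Lanes_PHB, here is a minimal Python sketch of lane concatenation. It is not the workflow's actual implementation. It assumes filenames follow the common bcl2fastq-style pattern `<sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz`, and the names `LANE_PATTERN`, `group_lanes`, and `concatenate_lanes` are hypothetical. It relies on the fact that gzip files can be appended byte-for-byte to form a valid multi-member gzip file.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed bcl2fastq-style name: <sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)


def group_lanes(fastq_dir):
    """Map (sample, read) to that sample's lane files, sorted by lane number."""
    groups = defaultdict(list)
    for path in Path(fastq_dir).iterdir():
        match = LANE_PATTERN.match(path.name)
        if match:
            key = (match["sample"], match["read"])
            groups[key].append((int(match["lane"]), path))
    return {key: [p for _, p in sorted(files)] for key, files in groups.items()}


def concatenate_lanes(fastq_dir, out_dir):
    """Write one <sample>_R<read>.fastq.gz per sample and read direction."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), lane_files in group_lanes(fastq_dir).items():
        target = out_dir / f"{sample}_R{read}.fastq.gz"
        with open(target, "wb") as out_handle:
            for lane_file in lane_files:
                # Appending raw gzip bytes yields a valid multi-member gzip file.
                with open(lane_file, "rb") as in_handle:
                    shutil.copyfileobj(in_handle, out_handle)
        merged[(sample, read)] = target
    return merged
```

Sorting by lane number before concatenating keeps read1 and read2 records in the same order, which paired-end tools generally expect.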

🚀 Changes to existing workflows

  • All Genomic Characterization Workflows

    • The read screen is now compatible with Dorado-produced FASTQ files
  • All Illumina Workflows

    • fastq_scan has been updated to the latest version
  • All TheiaCoV Workflows

    • The percentage of mapped reads is now output in all TheiaCoV workflows (except TheiaCoV_FASTA)
    • The default Nextclade dataset tags have been updated for SC2, mpox, flu, RSV-A, and RSV-B
    • The default Pangolin docker is now us-docker.pkg.dev/general-theiagen/staphb/pangolin:4.3.1-pdata-1.31
    • Kraken2 standalone is now used and databases must be provided.
  • TheiaCoV_Illumina_PE and TheiaCoV_ONT

    • Default parameters have been set for H5N1 flu
    • IRMA-assembled flu segments are now output in sorted order, from largest to smallest
  • All TheiaEuk Workflows

    • Additional genes for Candida auris are now examined by default in the Snippy_Gene_Query task
    • The snippy_variants_num_variants output column has been fixed for Cryptococcus neoformans
  • TheiaMeta_Illumina_PE

    • MIDAS is now an optional task in TheiaMeta.
  • All TheiaProk Workflows

    • stxtyper was added to all TheiaProk workflows
  • TheiaProk_Illumina_PE and TheiaProk_Illumina_SE

    • Multi-lane Illumina data can now be used as input natively.
  • TheiaProk_Illumina_PE and TheiaProk_ONT

    • TBProfiler has been updated to v6.4.1
    • tbp-parser has been updated to v2.2.2
  • Augur_PHB

    • Versioning information for the tree-building tools is now available
  • All Freyja Workflows

    • Freyja now supports non-SARS-CoV-2 organisms natively.
  • Mercury_Prep_N_Batch

    • Errors no longer occur when data has been previously transferred
    • The correct information is now being provided for GISAID’s covv_coverage column for ClearLabs data
    • Errors that previously passed silently now cause the task to fail
  • Snippy Workflows

    • A new file with QC metrics has been created
    • Additional QC metrics are now output
  • Terra_2_NCBI_PHB

    • Collection dates will no longer have decimals
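
As an illustration of the Terra_2_NCBI_PHB collection date change, the sketch below shows one way a date value can be forced to a clean string. The release notes do not say how the decimals arose. A plausible cause, stated here as an assumption, is a year-only value being parsed as a floating-point number (2023 becoming 2023.0). The function name `normalize_collection_date` is hypothetical and this is not the workflow's code.

```python
def normalize_collection_date(value):
    """Return a collection date as a string without a spurious decimal part."""
    # A whole-number float such as 2023.0 is most likely a year that was
    # parsed as numeric; render it as "2023".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    text = str(value).strip()
    # Strings that already carry the artifact, e.g. "2023.0".
    if text.endswith(".0") and text[:-2].isdigit():
        return text[:-2]
    return text
```

Full dates such as "2023-05-17" pass through unchanged, since only a trailing ".0" after an all-digit value is stripped.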

📚 Documentation Updates

  • Tables in the documentation can now be searched with table-specific search bars
  • Dead links removed
  • Generally improved documentation

What's Changed

  • [Documentation] Updated Snippy variants output documentation by @fraser-combe in #623
  • [TheiaCoV] iVar Consensus Pipefail fix by @Michal-Babins in #629
  • [TheiaProk] expose sistr optional param inputs to theiaProk wfs by @fraser-combe in #603
  • [Documentation] fix broken links by @sage-wright in #627
  • Snippy_Variants: Calculate % reads aligned by @fraser-combe in #616
  • [Augur +TheiaCoV] Enable H5N1 flu subtype augur & nextclade by @Michal-Babins in #640
  • [TheiaMeta] Midas call in read_QC_trim_pe.wdl workflow and outputs by @fraser-combe in #619
  • [TheiaCoV] Reorder flu segments from largest to smallest in irma task by @Michal-Babins in #635
  • [Mercury] prevent silent failures by @sage-wright in #648
  • Fixed theiacov documentation to specify assembly order by @Michal-Babins in #652
  • [TheiaCov & TheiaProk & TheiaEuk] read screen ONT bugfix and improvements by @kapsakcj in #650
  • [TheiaCoV ONT and Clearlabs] Update consensus task container to artic:1.2.4-1.12.0 by @cimendes in #636
  • [Documentation] Search bar for tables within docs by @fraser-combe in #646
  • [TheiaEuk] Additional genes for Snippy_Gene_Query by @sage-wright in #647
  • [MerlinMagic] Fixed output for crypto snippy_variants_num_variants by @Michal-Babins in #654
  • [Documentation] type error correction theiacov wf by @fraser-combe in #660
  • [TheiaProk] Adds stxtyper to merlin_magic and TheiaProk wfs by @kapsakcj in #525
  • [Mercury] bump mercury docker to 1.0.9: bugfix for GISAID metadata covv_coverage column by @kapsakcj in #661
  • [TheiaCov] wfs add percentage_mapped_reads by @fraser-combe in #641
  • [Documentation] Update MIDAS database documentation in TheiaProk by @fraser-combe in #667
  • Add Snippy_Variants QC outputs to Snippy_Tree and Snippy_Streamline workflow outputs by @jrotieno in #592
  • [TheiaCoV/TheiaProk/TheiaMeta/TheiaEuk/Freyja_FASTQ] fastq-scan updates & improvements. Adding JSON as wf output file by @kapsakcj in #662
  • Prevent Silent Errors by @sage-wright in #666
  • [Augur] Add augur tree iqtree model type to output by @Michal-Babins in #674
  • [Terra2NCBI] Force collection_date to be a string by @cimendes in #658
  • [Documentation] Update code contribution guidelines by @fraser-combe in #675
  • [Retrieve_SRR_Metadata] New wf to retrieve SRR after Terra2NCBI wf by @fraser-combe in #668
  • Documentation Update by @frankambrosio3 in #678
  • [Documentation] Various updates by @sage-wright in #680
  • [TheiaCoV] Update nextclade dataset tags and pangolin docker version by @Michal-Babins in #679
  • [Documentation] update dataset tags by @Michal-Babins in #681
  • [TheiaCoV] Split database from Kraken2_TheiaCoV task by @cimendes in #670
  • [TheiaCoV] Update nextclade dataset tag for H5N1 to the latest version by @Michal-Babins in #683
  • [Freyja] Update freyja to version 1.5.2, expose pathogen flag and minor update to docs by @cimendes in #684
  • [Augur] Expose Augur versions by @Michal-Babins in #686
  • [TheiaProk] Update default versions for TB-Profiler and tbp-parser by @sage-wright in #673
  • v2.3.0 final changes by @sage-wright in #693
  • [Concatenate_Illumina_Lanes] Fix bug when single-end only by @sage-wright in #695
  • Revert "[TheiaCoV ONT and Clearlabs] Update consensus task container to artic:1.2.4-1.12.0" by @sage-wright in #696

Full Changelog: v2.2.1...v2.3.0