diff --git a/docs/workflows/genomic_characterization/theiacov.md b/docs/workflows/genomic_characterization/theiacov.md
index 480bfbf04..d9345db41 100644
--- a/docs/workflows/genomic_characterization/theiacov.md
+++ b/docs/workflows/genomic_characterization/theiacov.md
@@ -128,8 +128,8 @@ All TheiaCoV Workflows (not TheiaCoV_FASTA_Batch)
 | clean_check_reads | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 2 | Optional | ONT, PE, SE | HIV, MPXV, WNV, flu, rsv_a, rsv_b, sars-cov-2 |
 | consensus | **cpu** | Int | Number of CPUs to allocate to the task | 8 | Optional | CL, ONT | sars-cov-2 |
 | consensus | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | CL, ONT | sars-cov-2 |
-| consensus | **docker** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/artic:1.2.4-1.12.0 | Optional | CL, ONT | HIV, MPXV, WNV, flu, rsv_a, rsv_b, sars-cov-2 |
-| consensus | **medaka_model** | String | In order to obtain the best results, the appropriate model must be set to match the sequencer's basecaller model; this string takes the format of {pore}_{device}_{caller variant}_{caller_version}. See the list of available models in the `artic_consensus` documentation section. See also https://github.com/nanoporetech/medaka?tab=readme-ov-file#models. | r941_min_high_g360 | Optional | CL, ONT | sars-cov-2 |
+| consensus | **docker** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/artic-ncov2019-epi2me | Optional | ONT | HIV, MPXV, WNV, flu, rsv_a, rsv_b, sars-cov-2 |
+| consensus | **medaka_model** | String | In order to obtain the best results, the appropriate model must be set to match the sequencer's basecaller model; this string takes the format of {pore}_{device}_{caller variant}_{caller_version}. See also https://github.com/nanoporetech/medaka?tab=readme-ov-file#models. | r941_min_high_g360 | Optional | CL, ONT | sars-cov-2 |
 | consensus | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 16 | Optional | CL, ONT | sars-cov-2 |
 | consensus_qc | **cpu** | Int | Number of CPUs to allocate to the task | 1 | Optional | CL, FASTA, ONT, PE, SE | HIV, MPXV, WNV, rsv_a, rsv_b, sars-cov-2 |
 | consensus_qc | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | CL, FASTA, ONT, PE, SE | HIV, MPXV, WNV, rsv_a, rsv_b, sars-cov-2 |
@@ -832,50 +832,6 @@ All input reads are processed through "core tasks" in the TheiaCoV Illumina, ONT

     !!! info ""
         Read-trimming is performed on raw read data generated on the ClearLabs instrument and thus not a required step in the TheiaCoV_ClearLabs workflow.
-
-    ??? toggle "Available `medaka` models"
-        The medaka models available in the default docker container are as follows:
-
-        ``` bash
-        r103_fast_g507, r103_fast_snp_g507, r103_fast_variant_g507, r103_hac_g507,
-        r103_hac_snp_g507, r103_hac_variant_g507, r103_min_high_g345, r103_min_high_g360,
-        r103_prom_high_g360, r103_prom_snp_g3210, r103_prom_variant_g3210, r103_sup_g507,
-        r103_sup_snp_g507, r103_sup_variant_g507, r1041_e82_260bps_fast_g632,
-        r1041_e82_260bps_fast_variant_g632, r1041_e82_260bps_hac_g632,
-        r1041_e82_260bps_hac_v4.0.0, r1041_e82_260bps_hac_v4.1.0,
-        r1041_e82_260bps_hac_variant_g632, r1041_e82_260bps_hac_variant_v4.1.0,
-        r1041_e82_260bps_joint_apk_ulk_v5.0.0, r1041_e82_260bps_sup_g632,
-        r1041_e82_260bps_sup_v4.0.0, r1041_e82_260bps_sup_v4.1.0,
-        r1041_e82_260bps_sup_variant_g632, r1041_e82_260bps_sup_variant_v4.1.0,
-        r1041_e82_400bps_fast_g615, r1041_e82_400bps_fast_g632,
-        r1041_e82_400bps_fast_variant_g615, r1041_e82_400bps_fast_variant_g632,
-        r1041_e82_400bps_hac_g615, r1041_e82_400bps_hac_g632, r1041_e82_400bps_hac_v4.0.0,
-        r1041_e82_400bps_hac_v4.1.0, r1041_e82_400bps_hac_v4.2.0, r1041_e82_400bps_hac_v4.3.0,
-        r1041_e82_400bps_hac_v5.0.0, r1041_e82_400bps_hac_variant_g615,
-        r1041_e82_400bps_hac_variant_g632, r1041_e82_400bps_hac_variant_v4.1.0,
-        r1041_e82_400bps_hac_variant_v4.2.0, r1041_e82_400bps_hac_variant_v4.3.0,
-        r1041_e82_400bps_hac_variant_v5.0.0, r1041_e82_400bps_sup_g615,
-        r1041_e82_400bps_sup_v4.0.0, r1041_e82_400bps_sup_v4.1.0, r1041_e82_400bps_sup_v4.2.0,
-        r1041_e82_400bps_sup_v4.3.0, r1041_e82_400bps_sup_v5.0.0,
-        r1041_e82_400bps_sup_variant_g615, r1041_e82_400bps_sup_variant_v4.1.0,
-        r1041_e82_400bps_sup_variant_v4.2.0, r1041_e82_400bps_sup_variant_v4.3.0,
-        r1041_e82_400bps_sup_variant_v5.0.0, r104_e81_fast_g5015, r104_e81_fast_variant_g5015,
-        r104_e81_hac_g5015, r104_e81_hac_variant_g5015, r104_e81_sup_g5015, r104_e81_sup_g610,
-        r104_e81_sup_variant_g610, r10_min_high_g303, r10_min_high_g340, r941_e81_fast_g514,
-        r941_e81_fast_variant_g514, r941_e81_hac_g514, r941_e81_hac_variant_g514,
-        r941_e81_sup_g514, r941_e81_sup_variant_g514, r941_min_fast_g303, r941_min_fast_g507,
-        r941_min_fast_snp_g507, r941_min_fast_variant_g507, r941_min_hac_g507,
-        r941_min_hac_snp_g507, r941_min_hac_variant_g507, r941_min_high_g303, r941_min_high_g330,
-        r941_min_high_g340_rle, r941_min_high_g344, r941_min_high_g351, r941_min_high_g360,
-        r941_min_sup_g507, r941_min_sup_snp_g507, r941_min_sup_variant_g507, r941_prom_fast_g303,
-        r941_prom_fast_g507, r941_prom_fast_snp_g507, r941_prom_fast_variant_g507,
-        r941_prom_hac_g507, r941_prom_hac_snp_g507, r941_prom_hac_variant_g507,
-        r941_prom_high_g303, r941_prom_high_g330, r941_prom_high_g344, r941_prom_high_g360,
-        r941_prom_high_g4011, r941_prom_snp_g303, r941_prom_snp_g322, r941_prom_snp_g360,
-        r941_prom_sup_g507, r941_prom_sup_snp_g507, r941_prom_sup_variant_g507,
-        r941_prom_variant_g303, r941_prom_variant_g322, r941_prom_variant_g360,
-        r941_sup_plant_g610, r941_sup_plant_variant_g610
-        ```

 General statistics about the assembly are generated with the `consensus_qc` task ([task_assembly_metrics.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/basic_statistics/task_assembly_metrics.wdl)).

diff --git a/tasks/assembly/task_artic_consensus.wdl b/tasks/assembly/task_artic_consensus.wdl
index 8e2d174db..6e38334a1 100644
--- a/tasks/assembly/task_artic_consensus.wdl
+++ b/tasks/assembly/task_artic_consensus.wdl
@@ -12,7 +12,7 @@ task consensus {
     Int memory = 16
     Int disk_size = 100
     String medaka_model = "r941_min_high_g360"
-    String docker = "us-docker.pkg.dev/general-theiagen/staphb/artic:1.2.4-1.12.0"
+    String docker = "us-docker.pkg.dev/general-theiagen/staphb/artic-ncov2019-epi2me"
   }
   String primer_name = basename(primer_bed)
   command <<<
@@ -61,13 +61,7 @@ task consensus {
     # version control
     echo "Medaka via $(artic -v)" | tee VERSION
     echo "~{primer_name}" | tee PRIMER_NAME
-    artic minion \
-      --medaka \
-      --medaka-model ~{medaka_model} \
-      --normalise ~{normalise} \
-      --threads ~{cpu} \
-      --scheme-directory ./primer-schemes \
-      --read-file ~{read1} ${scheme_name} ~{samplename}
+    artic minion --medaka --medaka-model ~{medaka_model} --normalise ~{normalise} --threads ~{cpu} --scheme-directory ./primer-schemes --read-file ~{read1} ${scheme_name} ~{samplename}
     gunzip -f ~{samplename}.pass.vcf.gz

     # clean up fasta header
diff --git a/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml b/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml
index 83d78611b..e3896de11 100644
--- a/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml
+++ b/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml
@@ -17,7 +17,7 @@
     - wf_theiacov_clearlabs_miniwdl
   files:
     - path: miniwdl_run/call-consensus/command
-      md5sum: b19d5ce485c612036064c07f0a1d6a18
+      md5sum: a8e200703dedf732b45dd92b0af15f1c
     - path: miniwdl_run/call-consensus/inputs.json
       contains: ["read1", "samplename", "fastq"]
     - path: miniwdl_run/call-consensus/outputs.json
diff --git a/tests/workflows/theiacov/test_wf_theiacov_ont.yml b/tests/workflows/theiacov/test_wf_theiacov_ont.yml
index 1348bce94..5d80b1e4f 100644
--- a/tests/workflows/theiacov/test_wf_theiacov_ont.yml
+++ b/tests/workflows/theiacov/test_wf_theiacov_ont.yml
@@ -31,7 +31,7 @@
     - path: miniwdl_run/call-clean_check_reads/work/_miniwdl_inputs/0/artic_ncov2019_ont.fastq
       md5sum: d41d8cd98f00b204e9800998ecf8427e
     - path: miniwdl_run/call-consensus/command
-      md5sum: 362dccda19ecadf377d5cd5872946ddd
+      md5sum: 056563d18294928fef5238bac7213791
     - path: miniwdl_run/call-consensus/inputs.json
       contains: ["read1_clean", "samplename", "fastq"]
     - path: miniwdl_run/call-consensus/outputs.json
@@ -45,7 +45,7 @@
     - path: miniwdl_run/call-consensus/work/REFERENCE_GENOME
       md5sum: 0e6efd549c8773f9a2f7a3e82619ee61
     - path: miniwdl_run/call-consensus/work/VERSION
-      md5sum: 394e07bc6788e025ac35254411db107c
+      md5sum: f3528ff85409c70100063c55ad75612b
     - path: miniwdl_run/call-consensus/work/_miniwdl_inputs/0/artic-v3.primers.bed
       md5sum: d41d8cd98f00b204e9800998ecf8427e
     - path: miniwdl_run/call-consensus/work/_miniwdl_inputs/0/artic_ncov2019_ont.fastq
@@ -64,6 +64,8 @@
     - path: miniwdl_run/call-consensus/work/ont.fastq.gz
     - path: miniwdl_run/call-consensus/work/ont.medaka.consensus.fasta
       md5sum: d36b7c665aa4127f0a6e8dbc562eea3e
+    - path: miniwdl_run/call-consensus/work/ont.merged.gvcf.vcf.gz
+    - path: miniwdl_run/call-consensus/work/ont.merged.gvcf.vcf.gz.tbi
     - path: miniwdl_run/call-consensus/work/ont.merged.vcf.gz
     - path: miniwdl_run/call-consensus/work/ont.merged.vcf.gz.tbi
     - path: miniwdl_run/call-consensus/work/ont.minion.log.txt
@@ -71,15 +73,20 @@
     - path: miniwdl_run/call-consensus/work/ont.pass.vcf.gz.tbi
     - path: miniwdl_run/call-consensus/work/ont.preconsensus.fasta
       md5sum: b68f4ee4abc9fc16215204d0ff754bb8
+    - path: miniwdl_run/call-consensus/work/ont.preconsensus.fasta.fai
+      md5sum: 4ca7d9fd06b9cdf379c2cf02b9fd6d0e
     - path: miniwdl_run/call-consensus/work/ont.primers.vcf
     - path: miniwdl_run/call-consensus/work/ont.primersitereport.txt
-      md5sum: dab514423a8fb7b59ab7870ad8c3b4cf
+      md5sum: cffee67632a262eeb947cea9cee0b4c1
     - path: miniwdl_run/call-consensus/work/ont.primertrimmed.rg.sorted.bam
     - path: miniwdl_run/call-consensus/work/ont.primertrimmed.rg.sorted.bam.bai
     - path: miniwdl_run/call-consensus/work/ont.sorted.bam
     - path: miniwdl_run/call-consensus/work/ont.sorted.bam.bai
     - path: miniwdl_run/call-consensus/work/ont.trimmed.rg.sorted.bam
     - path: miniwdl_run/call-consensus/work/ont.trimmed.rg.sorted.bam.bai
+    - path: miniwdl_run/call-consensus/work/ont.vcfcheck.log
+    - path: miniwdl_run/call-consensus/work/ont.vcfreport.txt
+      md5sum: 69131186223267b3ae6621cb8ef4eecd
     - path: miniwdl_run/call-consensus/work/primer-schemes/SARS-CoV-2/Vuser/SARS-CoV-2.reference.fasta
       md5sum: b9b67235a2d9d0b0d7f531166ffefd41
     - path: miniwdl_run/call-consensus/work/primer-schemes/SARS-CoV-2/Vuser/SARS-CoV-2.reference.fasta.fai