Running C‐BIRD
To run C-BIRD on Terra, please follow the instructions and tutorials on the Terra support pages.
multiBIRD is a wrapper around C-BIRD that makes it easier to run C-BIRD on a local machine. Before the first run, create a JSON file that defines the database and reference paths, and keep it for future runs. You also need a tab-separated text file that lists the FASTQ file paths for each sample. For new runs, you only need to change the sample file.
1. inputs.json:
{
"multibird.kraken2_database": "path_to_kraken_database",
"multibird.plasmidfinder_database": "path_to_plasmidfinder_database",
"multibird.busco_database": "path_to_busco_database",
"multibird.genome_stats_file": "path_to_genome_stats_file",
"multibird.amrfinder_database": "path_to_amrfinder_database",
"multibird.mash_reference": "path_to_reference_mash_sketch",
"multibird.adapters": "path_to_adapters_fasta",
"multibird.target_genes_fasta": "path_to_target_genes_fasta",
"multibird.inputSamplesFile": "path_to_sample_list_file"
}
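Because a single mistyped path in inputs.json only surfaces once the workflow is already running, it can help to check the file beforehand. The following is a minimal sketch, not part of C-BIRD: the function name `missing_input_paths` is hypothetical, and it assumes every value in the JSON file is a local filesystem path, as in the template above.

```python
import json
from pathlib import Path


def missing_input_paths(inputs_json):
    """Return {key: path} for every entry whose path does not exist on disk.

    Assumption: all values in the inputs file are local filesystem paths,
    as in the inputs.json template shown above.
    """
    with open(inputs_json) as handle:
        inputs = json.load(handle)
    return {
        key: value
        for key, value in inputs.items()
        if not Path(str(value)).exists()
    }


# Example use (hypothetical):
#   for key, path in missing_input_paths("inputs.json").items():
#       print(f"missing: {key} -> {path}")
```

An empty result means every configured path was found; any entry returned should be corrected before launching the workflow.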
2. samples.tsv:
sample1 /path_to_sample1_read1.fastq.gz /path_to_sample1_read2.fastq.gz
sample2 /path_to_sample2_read1.fastq.gz /path_to_sample2_read2.fastq.gz
...
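For larger batches, the sample file can be generated rather than typed by hand. The sketch below is one possible approach and makes a naming assumption that is not stated in this page: paired reads are expected to sit in one directory and to differ only by `_R1` versus `_R2` in the file name. The function names are hypothetical; adapt the pattern to your own file naming.

```python
import csv
from pathlib import Path


def build_sample_rows(fastq_dir):
    """Pair *_R1*.fastq.gz files with their *_R2* mates.

    Assumption: mates share a file name except for _R1 / _R2, and the
    sample name is everything before the first "_R1".
    """
    rows = []
    for read1 in sorted(Path(fastq_dir).glob("*_R1*.fastq.gz")):
        prefix, _, suffix = read1.name.partition("_R1")
        read2 = read1.with_name(prefix + "_R2" + suffix)
        if read2.exists():  # skip samples without a mate file
            rows.append((prefix, str(read1.resolve()), str(read2.resolve())))
    return rows


def write_samples_tsv(rows, out_path="samples.tsv"):
    """Write sample name, read 1 path, read 2 path as tab-separated columns."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
```

Calling `write_samples_tsv(build_sample_rows("/path_to/fastq_dir"))` would write one line per complete read pair, in the three-column layout shown above.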
Now you are ready to run multiBIRD.
java -Dconfig.file={path_to_cromwell_config} -jar {path_to_cromwell} run /path_to/C-BIRD/workflows/wf_multi_bird.wdl -i inputs.json |& tee multibird.log
When the run is finished, you can browse the reports under the Cromwell root path. The run id printed in the log file is the name of the target folder inside the Cromwell root. You can collect the summary reports with a bash command.
E.g. find . -name '*_html_report.html' -exec cp {} . \;
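The same collection step can be scripted in Python, which is convenient when it is combined with other post-processing. This is a rough sketch under stated assumptions: the function names are hypothetical, `find_run_ids` only relies on Cromwell run ids being UUID-formatted strings and returns every distinct UUID it sees in the log (you still need to pick the right one), and `collect_reports` mirrors the `find` command above by copying every `*_html_report.html` file it finds.

```python
import re
import shutil
from pathlib import Path

# Cromwell run ids are UUIDs, e.g. xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"
)


def find_run_ids(log_text):
    """Return the distinct UUID-shaped strings in the log, in order of appearance."""
    seen = []
    for match in UUID_RE.findall(log_text):
        if match not in seen:
            seen.append(match)
    return seen


def collect_reports(search_root, dest="."):
    """Copy every *_html_report.html under search_root into dest."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # Materialize the list first so files copied into dest are not re-visited.
    reports = list(Path(search_root).rglob("*_html_report.html"))
    copied = []
    for report in reports:
        if report.parent.resolve() == dest.resolve():
            continue  # already in the destination folder
        copied.append(Path(shutil.copy2(report, dest)))
    return sorted(copied)
```

As with the `find` command, reports that share a file name overwrite each other in the destination folder, so sample names should be unique within a run.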
Note: Batch results can also be compiled from the logs into an Excel file that mimics the Terra tables. An example Python script for this purpose is provided in the scripts folder, along with a bash script that wraps all of these steps.
Alternatively, you can run the workflow with miniwdl instead of Cromwell:
miniwdl run /path_to/C-BIRD/workflows/wf_multi_bird.wdl -i inputs.json
Check the miniwdl documentation for more information.
DISCLAIMER: C-BIRD results should NOT be used directly to diagnose, treat, or assess individual patient health or management. Generated data or summarized reports should NOT be delivered to the patient, their care provider, or placed in the patient’s medical record.