FAQ
If you are having problems with Kraken 2, please refer to the following FAQ and troubleshooting options. If your question is not addressed below, please submit a new GitHub issue.
When submitting a GitHub issue, specify:
- the command line used (kraken2-build or kraken2)
- the database being used
- the kraken2 version
For GitHub issues, it may also be helpful to provide the output of kraken2-inspect:
kraken2-inspect --db MYDB > k2inspect_output.txt
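The details requested above can be gathered into a single file before opening an issue. The following is a minimal sketch and not part of Kraken 2: the function name k2_issue_report and the report layout are hypothetical, and it assumes that kraken2 --version prints the version on its first line.

```shell
# Hypothetical helper: collect the details requested for a GitHub issue.
# Arguments: $1 = database directory, $2 = output file,
#            $3 = the exact command line you ran (quoted as one string).
k2_issue_report() {
    db=$1
    out=$2
    cmdline=$3
    {
        printf 'command: %s\n' "$cmdline"
        printf 'database: %s\n' "$db"
        # Record the version only if kraken2 is on PATH.
        if command -v kraken2 >/dev/null 2>&1; then
            printf 'version: %s\n' "$(kraken2 --version 2>&1 | head -n 1)"
        else
            printf 'version: kraken2 not found on PATH\n'
        fi
    } > "$out"
    # Append the kraken2-inspect output when the tool is available.
    if command -v kraken2-inspect >/dev/null 2>&1; then
        kraken2-inspect --db "$db" >> "$out" 2>&1 ||
            printf 'kraken2-inspect failed for %s\n' "$db" >> "$out"
    fi
}
```

For example, k2_issue_report MYDB k2_issue.txt "kraken2 --db MYDB reads.fq" writes the report to k2_issue.txt (reads.fq is a placeholder file name).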
Note: Older kraken2 versions may contain bugs that have been fixed in the newest version. If you are working with an older version, first download the newest code and test with it.
Updated: 08/07/2023 (page in progress)
What is causing this error message?
- The ncbi website is having problems
- rsync is not available for your server
How do I fix this?
- Please add the --use-ftp argument to your kraken2-build command.
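As an illustration of where the flag goes, the sketch below prints a full download command with --use-ftp appended so it can be reviewed before running. The helper name k2_download_cmd is hypothetical, and the library (bacteria) and database name (MYDB) are placeholders to replace with your own.

```shell
# Hypothetical helper: print (do not run) a kraken2-build download command
# that uses FTP instead of rsync.
# Arguments: $1 = library name, $2 = database directory.
k2_download_cmd() {
    printf 'kraken2-build --download-library %s --db %s --use-ftp\n' "$1" "$2"
}

# Placeholder values; substitute your own library and database.
k2_download_cmd bacteria MYDB
```

The same flag can be appended to a kraken2-build command that downloads the taxonomy.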
Sometimes, NCBI releases assembly summaries that have na in the FTP path column instead of an actual FTP path. As we have yet to modify the download scripts to catch this error, a temporary workaround is to filter out those rows:
awk -v FS='\t' '$20 != "na" {print $0}' assembly_summary.txt > new_assembly_summary.txt
cp new_assembly_summary.txt assembly_summary.txt
(Credit to @alxsimon)
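The workaround above can also be wrapped so that the original file is kept as a backup. This is a minimal sketch under the same assumption as the awk command above, namely that the FTP path is the 20th tab-separated column. The function name drop_na_ftp_rows and the .orig backup suffix are hypothetical.

```shell
# Hypothetical helper: remove rows whose 20th tab-separated column is "na".
# Arguments: $1 = path to assembly_summary.txt.
# The unfiltered file is preserved as <file>.orig.
drop_na_ftp_rows() {
    summary=$1
    # Keep a copy of the original before overwriting it.
    cp "$summary" "$summary.orig" || return 1
    # Same filter as the workaround: keep rows where column 20 is not "na".
    awk -v FS='\t' '$20 != "na" {print $0}' "$summary.orig" > "$summary"
}
```

Running drop_na_ftp_rows assembly_summary.txt in the directory that contains the file filters it in place and leaves assembly_summary.txt.orig for comparison.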
Currently, masking is single-threaded, using NCBI's dustmasker (or, for protein databases, segmasker). While other steps of the build process are multi-threaded, we do not currently provide multi-threaded masking.
Error Message: "build_db: OMP only wants you to use 4 threads"
The kraken2-build script detects how many threads your system can use. If you specify more threads than are available, this error is printed and the build stops.
Fix: Rerun your kraken2-build command using the number of threads OMP detected.
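To avoid requesting too many threads in the first place, the requested count can be capped before calling kraken2-build. The sketch below is an assumption-laden illustration: the function name clamp_threads is hypothetical, and it uses the number of online processors reported by getconf as a stand-in for what OMP detects. The two can differ (for example on shared or containerized systems), in which case the number in the error message is the one to use.

```shell
# Hypothetical helper: print the smaller of the requested thread count and
# the number of online processors.
# Arguments: $1 = requested thread count.
clamp_threads() {
    requested=$1
    # Fall back to 1 if the processor count cannot be determined.
    available=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}
```

A build could then be started with, for example, kraken2-build --build --db MYDB --threads "$(clamp_threads 16)", where MYDB and 16 are placeholders.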
We provide a number of pre-built Kraken2 databases. Users should select a database based on the sample and computational resources available.
Users cannot add genomes to any of the pre-built databases as the whole database would need to be rebuilt.
We (the authors of Kraken2) use the default confidence score of 0 for our own projects, preferring to use the minimizer score to eliminate potential contamination. However, a nonzero confidence score may be useful depending on the use case. Please see Wright et al. for more details:
- Wright, Robyn J., Andrè M. Comeau, and Morgan G. I. Langille. 2023. “From Defaults to Databases: Parameter and Database Choice Dramatically Impact the Performance of Metagenomic Taxonomic Classification Tools.” Microbial Genomics 9 (3). https://doi.org/10.1099/mgen.0.000949.
This paper is an independent study of the accuracy of Kraken2 and MetaPhlAn3 using various databases and parameters.