A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large-scale genomic data.
CATE is a CUDA-based solution for rapidly processing large-scale VCF files and conducting a series of six different tests on evolution.
🔵 Here we provide only a brief overview of CATE's usability.
🟢 Please refer to CATE's wiki to obtain a more detailed understanding of its functionality and usability.
CATE now comes with APOLLO
Apollo is our high-performance viral epidemic simulation platform powered by CATE's architecture.
Apollo is already available in CATE for use. Use the --simulator or -sim command. Documentation on Apollo is available in our wiki, and a user manual has also been provided to get you up and running on our all-new tool.
Apollo's preprint is now available on bioRxiv.
- CUDA-capable hardware
- Linux- or UNIX-based kernel
- NVIDIA's CUDA toolkit (nvcc compiler)
- C++ compiler (gcc compiler)
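The prerequisites above can be checked from the terminal before installing. The following is a minimal sketch and not part of CATE: `check_tool` is a hypothetical helper name, and using `nvidia-smi` (which ships with the NVIDIA driver) as a stand-in for detecting CUDA-capable hardware is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is available on PATH.
# Returns 0 if the command is found and 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
        return 0
    else
        echo "missing: $1"
        return 1
    fi
}

# nvcc and gcc come from the prerequisites list; nvidia-smi is an assumed
# proxy for a working NVIDIA driver and GPU.
missing=0
for tool in nvcc gcc nvidia-smi; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite tool(s) missing"
```

A non-zero count suggests installing the missing tool (or loading the relevant module on a cluster) before attempting to build or run CATE.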
CATE can be used on-device via Anaconda or by downloading and building the GitHub repo. It can also be used online via Google Colab.
For the Google Colab notebook please follow the link to CATE on Colab.
To install CATE via Anaconda:
conda install deshan_cate::cate
To ensure successful installation run the following:
CATE -h
Alternatively, to install CATE on-device from the GitHub repo, you will need to compile the code with the nvcc compiler. To do so, execute the following in the terminal:
Download the repository:
git clone "https://github.com/theLongLab/CATE/"
cd CATE/
Load the CUDA module (CUDA 11.3.0 or higher):
module load cuda/11.3.0
Finally, compile the project:
nvcc -std=c++17 *.cu *.cpp -o "CATE"
To ensure successful installation try running the following:
./CATE -h
CATE is a command-line-based software. Its available functions include six different tests on evolution and a series of tools for editing and processing FASTA and VCF files.
The six tests on evolution are:
- Tajima’s D
- Fu and Li's D, D*, F, and F*
- Fay and Wu’s H and E
- McDonald–Kreitman test
- Fixation Index
- Extended Haplotype Homozygosity
CATE comes equipped with Apollo, our viral simulator that spans from network level to individual virion resolution complete with within-host dynamics. Apollo comes with its main simulation function and five additional utility tools.
- Apollo simulator
- Haplotype retriever
- Pedigree retriever
- Segregating sites matcher
- Base substitution model to JSON
- Recombination hotspots to JSON
Currently, the program's executable is called:
Test_Main
To run the software you need a JSON-style parameters file. An example, parameters.json, is provided in the repository.
The parameters file is used to specify all input and output locations as well as the gene list file locations. Each function's execution can be customized individually using the parameters file.
The typical syntax for program execution is as follows (the example below runs the Tajima's D function):
program_executable --function parameter_file
program_executable -f parameter_file
Example:
./Test_Main -t parameters.json
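Before launching a run like the one above, it can help to confirm that the parameters file exists and parses cleanly. The sketch below is not part of CATE: `validate_params` is a hypothetical helper, it assumes `python3` is installed, and it assumes the parameters file is strict JSON. If CATE's own parser accepts looser JSON-style syntax, this check may be stricter than necessary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the file exists and parses as strict JSON,
# 1 otherwise. Python's standard json.tool module is used as the parser.
validate_params() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "not found: $file" >&2
        return 1
    fi
    if ! python3 -m json.tool "$file" >/dev/null 2>&1; then
        echo "invalid JSON: $file" >&2
        return 1
    fi
    echo "ok: $file"
}
```

The check can be chained in front of a run, for example `validate_params parameters.json && ./Test_Main -t parameters.json`, so that the program only starts when the file passes.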
The HELP menu lists all available functions and how each one is executed. It can be accessed by passing -h as the function, as shown below:
./Test_Main -h
CATE has been successfully published in the journal Methods in Ecology and Evolution (MEE). If you find this framework or the software solution useful in your analyses, please CITE the published article available in MEE, CATE: A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data.
To cite CATE's code please use the Zenodo release:
The details of the citation are listed below:
Perera, D., Reisenhofer, E., Hussein, S., Higgins, E., Huber, C. D., & Long, Q. (2023). CATE: A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data. Methods in Ecology and Evolution, 00, 1–15. https://doi.org/10.1111/2041-210X.14168.
Apollo is currently being submitted for review. Its preprint is available on bioRxiv and the citation details are as follows:
Perera, D., Li, E., van der Meer, F., Lynch, T., Gill, J., Church, D. L., Huber, C. D., van Marle, G., Platt, A., & Long, Q. (2024). Apollo: A comprehensive GPU-powered within-host simulator for viral evolution and infection dynamics across population, tissue, and cell. bioRxiv, 2024.10.07.617101. https://doi.org/10.1101/2024.10.07.617101.
To cite CATE's code with Apollo's integration please cite the Zenodo release:
- For CATE please address your correspondence to:
Deshan Perera ([email protected])
Dr. Quan Long ([email protected])
Dr. Christian D. Huber ([email protected])
- For Apollo please address your correspondence to:
Deshan Perera ([email protected])
Dr. Quan Long ([email protected])
Dr. Alexander Platt ([email protected])
MIT License
Copyright (c) 2022 The Long Lab
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.