
xinhe-lab/singlegroup_ctwas

 
 

ctwas: an R package for integrating molecular QTLs and GWAS for gene discovery

Expression quantitative trait loci (eQTLs) have often been used to nominate candidate genes from genome-wide association studies (GWAS). However, commonly used methods are susceptible to false positives largely due to linkage disequilibrium (LD) of eQTLs with causal variants acting on the phenotype directly.

Our method, "causal-TWAS" (cTWAS), addresses this challenge by borrowing ideas from statistical fine-mapping. It is a generalization of methods for transcriptome-wide association studies (TWAS), but when analyzing any gene, it adjusts for other nearby genes and all nearby genetic variants.

Install ctwas

Use "remotes" to install the latest version of ctwas from GitHub:

install.packages("remotes")
remotes::install_github("xinhe-lab/ctwas", ref = "singlegroup")

Currently, ctwas has only been tested on Linux systems.

We recommend installing and running ctwas on a high-performance computing system.

Running ctwas

Running a cTWAS analysis involves four main steps:

  1. Preparing the input data.

  2. Computing associations of genes with the phenotype (Z-scores).

  3. Estimating the model parameters.

  4. Fine-mapping causal genes.

The outputs of cTWAS are posterior inclusion probabilities (PIPs) for all variants and genes.
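As a rough illustration of step 2 above, the sketch below computes a gene-level Z-score from GWAS summary statistics with the standard summary-statistic TWAS formula, z_gene = sum(w * z) / sqrt(w' R w). It uses base R only and is not the ctwas implementation. The function name `gene_z_score` and the toy inputs are hypothetical, and the formula assumes the prediction weights apply to standardized genotypes.

```r
# Illustrative sketch only; this is NOT a ctwas function.
# Assumption: weights apply to standardized genotypes, so the LD
# (correlation) matrix is the right scale for the variance term.
gene_z_score <- function(weights, z_snp, ld) {
  # weights: expression prediction weights, one per SNP
  # z_snp:   GWAS Z-scores for the same SNPs, in the same order
  # ld:      SNP-by-SNP correlation (LD) matrix
  stopifnot(length(weights) == length(z_snp),
            all(dim(ld) == length(weights)))
  numerator <- sum(weights * z_snp)
  # Standard deviation of predicted expression under the LD matrix
  denominator <- sqrt(drop(t(weights) %*% ld %*% weights))
  numerator / denominator
}

# Hypothetical toy data: three SNPs in one gene's prediction model
weights <- c(0.5, -0.2, 0.1)
z_snp <- c(3.1, -1.4, 0.8)
ld <- matrix(c(1.0, 0.3, 0.1,
               0.3, 1.0, 0.2,
               0.1, 0.2, 1.0), nrow = 3, byrow = TRUE)

z_gene <- gene_z_score(weights, z_snp, ld)
print(z_gene)
```

Because the denominator rescales by the standard deviation of predicted expression, multiplying all weights by a positive constant leaves the gene Z-score unchanged.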

To learn more about the ctwas R package, we recommend starting with this introductory tutorial:

A minimal tutorial of how to run cTWAS without LD

To run the full cTWAS, follow these tutorials:

In addition, we have some useful functions to help run cTWAS, e.g. for creating your own reference LD data:

You can browse the source code and report bugs here.

Citing this work

If you find the ctwas package or any of the source code in this repository useful for your work, please cite:

Zhao S, Crouse W, Qian S, Luo K, Stephens M, He X. Adjusting for genetic confounders in transcriptome-wide association studies improves discovery of risk genes of complex traits. Nature Genetics 56, 336–347 (2024). https://doi.org/10.1038/s41588-023-01648-9

Useful resources

We have pre-computed the LD matrices of European samples from UK Biobank. They can be downloaded here.

We also provide lists of reference variant information for all of the LD matrices across the genome, in both hg38 and hg19.

cTWAS requires expression prediction models, or weights, for genes. Pre-computed weights for GTEx expression and splicing traits can be downloaded from PredictDB.

Acknowledgments

We thank the authors of the susieR package, whose code we use.

Original susieR code obtained by:

git clone git@github.com:stephenslab/susieR.git
cd susieR
git checkout c7934c0

We made minor edits so that it accepts a different prior variance for each variable.
