This R project analyzes the relationship between metabolomics data and drug responses in cancer cell lines using various datasets, including CCLE (Cancer Cell Line Encyclopedia) and CTRP (Cancer Therapeutics Response Portal).
- Data acquisition and preprocessing from multiple sources
- Correlation analysis between metabolite concentrations and drug responses
- Hypergeometric enrichment analysis for protein targets and metabolic pathways
- Visualization of results using various plot types (heatmaps, volcano plots, etc.)
Before running this project, ensure you have R installed (version 4.0.0 or higher) along with the following R packages:
install.packages(c("tidyverse", "data.table", "readxl", "RCurl", "heatmaply", "pals", "scales", "hrbrthemes", "viridis", "forcats", "ggrepel", "ggpubr", "gridExtra", "splitstackshape", "fuzzyjoin", "plyr", "RcppEigen"))
# EnhancedVolcano and ComplexHeatmap are distributed through Bioconductor, not CRAN.
# grid ships with base R and does not need to be installed.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("EnhancedVolcano", "ComplexHeatmap"))
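Before running the script, it can help to confirm that every dependency is actually available. The following is a minimal sketch using base R only; `find_missing_packages` is a hypothetical helper name, not something defined in altogether.R.

```r
# Hypothetical helper: returns the packages from `pkgs` that are not installed
find_missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages named in the installation step
required_pkgs <- c(
  "tidyverse", "data.table", "readxl", "RCurl", "heatmaply", "pals",
  "scales", "hrbrthemes", "viridis", "forcats", "EnhancedVolcano",
  "ComplexHeatmap", "ggrepel", "ggpubr", "grid", "gridExtra",
  "splitstackshape", "fuzzyjoin", "plyr", "RcppEigen"
)

missing_pkgs <- find_missing_packages(required_pkgs)
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Package names are case-sensitive, so a misspelled name is reported as missing even when the package is installed.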
The project consists of a single R script (altogether.R) that performs the following main steps:
- Data acquisition and preprocessing
- Correlation analysis
- Hypergeometric enrichment analysis
- Visualization of results
- Clone this repository:
  git clone https://github.com/yourusername/metabolomics-drug-response-analysis.git
  cd metabolomics-drug-response-analysis
- Open the altogether.R script in your R environment (e.g., RStudio).
- Run the script section by section, following the comments that describe each step.
- Downloads and processes CCLE metabolomics data
- Acquires and processes CTRP drug response data
- Merges datasets and performs initial data cleaning
- Calculates correlations between metabolite concentrations and drug responses
- Filters for significant correlations
- Performs enrichment analysis for protein targets and metabolic pathways
- Identifies significantly enriched targets and pathways
- Generates various plots, including:
  - Volcano plots by TCGA cancer type
  - Heatmaps of correlations between metabolites and drugs
  - Scatter plots of drug responses vs. metabolite concentrations
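The correlation and enrichment steps listed above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code in altogether.R: the data are simulated, the thresholds are placeholder values, Pearson correlation with Benjamini-Hochberg adjustment is one possible choice, and `enrich_p` is a hypothetical helper name.

```r
set.seed(42)  # reproducible toy data

# Hypothetical thresholds; the values used in altogether.R may differ
cor_threshold <- 0.3
fdr_cutoff    <- 0.05

# Toy stand-ins for the merged CCLE/CTRP table: one row per cell line
n_lines <- 60
metabolites <- matrix(rnorm(n_lines * 3), ncol = 3,
                      dimnames = list(NULL, c("met_A", "met_B", "met_C")))
drugs <- matrix(rnorm(n_lines * 4), ncol = 4,
                dimnames = list(NULL, paste0("drug_", 1:4)))
# Plant one real association so the filter has something to find
drugs[, "drug_1"] <- 0.8 * metabolites[, "met_A"] + rnorm(n_lines, sd = 0.5)

# One row per metabolite-drug pair
pairs <- expand.grid(metabolite = colnames(metabolites),
                     drug = colnames(drugs),
                     stringsAsFactors = FALSE)

# Pearson correlation coefficient and p-value for each pair
stats <- mapply(function(m, d) {
  ct <- cor.test(metabolites[, m], drugs[, d], method = "pearson")
  c(r = unname(ct$estimate), p = ct$p.value)
}, pairs$metabolite, pairs$drug)
pairs$r <- unname(stats["r", ])
pairs$p <- unname(stats["p", ])

# Adjust for multiple testing, then keep strong and significant pairs
pairs$fdr <- p.adjust(pairs$p, method = "BH")
significant <- pairs[abs(pairs$r) >= cor_threshold & pairs$fdr <= fdr_cutoff, ]
print(significant)

# Hypergeometric enrichment: probability of seeing at least k annotated
# drugs among n significant drugs, when K of all N tested drugs carry
# the annotation (e.g. a shared protein target or pathway)
enrich_p <- function(k, K, n, N) {
  phyper(k - 1, K, N - K, n, lower.tail = FALSE)
}

# Made-up counts: 6 of 20 significant drugs hit a target that
# 10 of 100 tested drugs are annotated to
enrich_p(k = 6, K = 10, n = 20, N = 100)
```

The `k - 1` in `phyper` matters: with `lower.tail = FALSE` the function returns P(X > q), so subtracting one gives the upper tail that includes the observed overlap.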
The script generates several output files, including:
- Processed data files (saved as .RData files in the Data/ directory)
- Visualization plots (saved as PDF files in the Plots/ directory)
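As a rough illustration of this output layout, the sketch below writes one .RData file and one PDF into Data/ and Plots/ subdirectories. The object, file names, and the use of a temporary root directory are assumptions made for the example; the script's actual file names are not shown here.

```r
# Hypothetical output root; the script writes relative to the project directory
out_root <- tempdir()
data_dir <- file.path(out_root, "Data")
plot_dir <- file.path(out_root, "Plots")

# Create the output directories if they do not exist yet
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
dir.create(plot_dir, showWarnings = FALSE, recursive = TRUE)

# Toy result table standing in for a processed dataset
toy_result <- data.frame(metabolite = c("met_A", "met_B"),
                         r = c(0.61, -0.12))

# Save the processed data as .RData
rdata_path <- file.path(data_dir, "toy_result.RData")
save(toy_result, file = rdata_path)

# Save a plot as PDF
pdf_path <- file.path(plot_dir, "toy_plot.pdf")
pdf(pdf_path, width = 6, height = 4)
barplot(toy_result$r, names.arg = toy_result$metabolite,
        ylab = "Correlation (toy values)")
invisible(dev.off())

# Reload into a separate environment to avoid overwriting workspace objects
loaded <- new.env()
load(rdata_path, envir = loaded)
```

Loading into a separate environment is a cautious habit with .RData files, since `load()` otherwise restores objects under their original names and can silently replace existing ones.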
You can customize the analysis by modifying parameters such as correlation thresholds, p-value cutoffs, and visualization settings within the script.
Contributions to improve the analysis pipeline are welcome. Please fork the repository and submit a pull request with your changes.
This project is licensed under the MIT License - see the LICENSE file for details.
- This project uses data from the Cancer Cell Line Encyclopedia (CCLE) and the Cancer Therapeutics Response Portal (CTRP).
- Various R packages and libraries are used for data manipulation and visualization. Please cite them appropriately in your research.