In the pertbio folder, run:
python3.6 setup.py install
Only Python 3.6 is supported. Anaconda or pipenv is recommended for creating the Python environment.
Now you can check that the installation was successful:
import pertbio
pertbio.VERSION
Easily try PertBio online with Binder
- Go to: https://mybinder.org/v2/gh/dfci/CellBox/master
- From the New dropdown, click Terminal
- Run the following command:
python scripts/main.py -config=configs/example.cfg.json
Alternatively, run the same command from the project folder on your local machine.
- node_index.txt: names of each protein/phenotypic node.
- expr_index.txt: information about each perturbation condition (also see loo_label.csv).
- expr.csv: Protein expression data from RPPA for the protein nodes, plus phenotypic node values. Each row is a condition and each column is a node.
- pert.csv: Perturbation strength and target of all perturbation conditions. Used as input for differential equations.
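To make the layout of these files concrete, here is a minimal sketch of loading paired expression and perturbation tables with pandas. The synthetic in-memory data and the header-less layout are assumptions for illustration; the real expr.csv and pert.csv may differ.

```python
import io
import pandas as pd

# Synthetic stand-ins for expr.csv and pert.csv (illustration only).
# Assumed layout: no header, one row per condition, one column per node.
expr_csv = io.StringIO("0.1,0.2,0.3\n0.4,0.5,0.6\n")
pert_csv = io.StringIO("1.0,0.0,0.0\n0.0,1.0,0.0\n")

expr = pd.read_csv(expr_csv, header=None)
pert = pd.read_csv(pert_csv, header=None)

# Rows (conditions) must align between the two files; columns are nodes.
assert expr.shape == pert.shape
print(expr.shape)  # (2, 3)
```

Checking that the two tables have matching row counts is a quick sanity test before training, since each perturbation row is the model input for the matching expression row.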
- CellBox is defined in model.py
- dataset factory for random partition and leave-one-out tasks
- some training utility functions in TensorFlow
- Make sure to specify the experiment_id and experiment_type
- experiment_id: name of the experiment; used to generate the results folders
- experiment_type: currently available tasks are {"random partition", "leave one out (w/o single)", "leave one out (w/ single)", "full data", "single to combo"}
- Different training stages can be specified using stages and sub_stages
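A minimal configuration fragment might look like the sketch below. Only experiment_id, experiment_type, and stages are named in this guide; the value shown for experiment_id is a placeholder, the stages entry is left empty here, and the full schema (including sub_stages) should be taken from the shipped configs such as configs/example.cfg.json.

```json
{
    "experiment_id": "my_random_partition_run",
    "experiment_type": "random partition",
    "stages": []
}
```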
The experiment configuration file is specified by --experiment_config_path OR -config
python scripts/main.py -config=configs/random_partition.cfg.json
Note: always run the script in the root folder.
A random seed can also be assigned using the argument --working_index OR -i
python scripts/main.py -config=configs/random_partition.cfg.json -i=1234
When training with leave-one-out validation, make sure to specify which drug to leave out of training using --drug_index OR -drug.
- You should see an experiment folder generated under results, named with the date and experiment_id.
- Under the experiment folder, you will see models run with different random seeds.
- Under each model folder, you will find:
- record_eval.csv: log file with loss changes and time used.
- random_pos.csv: how the data was split (only for random partitions)
- best.W, best.alpha, best.eps: model parameter snapshots for each training stage
- predicted training set .csv files: predicted node values for the training set (average prediction over the last 20% of the ODE simulation, time derivative at the end point, and max minus min over the last 20% of the ODE simulation)
- best.test_hat: Prediction on test set, using the best model for each stage
- .ckpt files: the final models in TensorFlow-compatible format.
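After a run finishes, the training log in record_eval.csv can be inspected with pandas. The excerpt and column names below are hypothetical, used only to illustrate finding the epoch with the lowest validation loss; the actual log columns may differ.

```python
import io
import pandas as pd

# Hypothetical excerpt of record_eval.csv -- real column names may differ;
# this only illustrates inspecting the training log with pandas.
log_csv = io.StringIO(
    "epoch,train_loss,valid_loss\n"
    "1,0.90,0.95\n"
    "2,0.40,0.60\n"
    "3,0.25,0.55\n"
)
log = pd.read_csv(log_csv)

# Find the epoch with the lowest validation loss.
best = log.loc[log["valid_loss"].idxmin()]
print(int(best["epoch"]))  # 3
```

The same pattern extends to plotting loss curves or comparing the models run under different random seeds within one experiment folder.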