https://github.com/statgenetics/statgen-courses/wiki
Caution: the docker images listed below are obsolete as of April 2024 but are kept here for backward compatibility. The updated setup is a single image found at https://github.com/cumc/handson-tutorials/tree/main/setup/docker
Available tutorials (via the --tutorial option of the statgen-setup script) are those with docker images available as listed here.
Caution: the instructions for developers are obsolete as of April 2024 but are kept here for backward compatibility. The updated instructions can be found at https://github.com/cumc/handson-tutorials/blob/main/setup/docker/README.md
The docker folder contains files for docker images to run the statgen course tutorials.
The handout folder contains some handouts.
The src folder contains utility scripts, e.g., tools to set up the Jupyter server online.
The software you need to install on your computer is SoS (simply type pip install sos to install it, or check out here for alternative installation methods if you have trouble with that command) and docker.
Additionally, to run the course material on your own computer (not on a cloud VM), you have to put the src/statgen-setup script in your PATH and make it executable, e.g., chmod +x ~/bin/statgen-setup if you put it under ~/bin, which is part of your PATH. To verify your setup, type:
statgen-setup -h
You should see a usage message rather than an error.
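If the check above fails, the following sketch may help narrow down what is missing. The helper names missing_tools and dir_on_path are hypothetical (they are not part of statgen-setup); the snippet only looks names up on PATH and does not run any of the tools.

```shell
#!/bin/sh
# Hypothetical helpers, not part of statgen-setup.

# Print each named tool that cannot be found on PATH
# (prints nothing if every tool is found).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed if directory $1 is one of the colon-separated entries of PATH.
dir_on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report on the three tools this README relies on.
missing=$(missing_tools sos docker statgen-setup)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All tools found."
fi
```

To check the directory itself, something like dir_on_path "$HOME/bin" || echo "add ~/bin to PATH" should work, assuming PATH stores the expanded directory name rather than a literal ~.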
gaow/base-notebook, a minimal JupyterHub / SoS Notebook environment for scientific computing, is used to derive the tutorial-specific images in this folder.
To build tutorial images and push them to dockerhub, e.g., for the tutorial igv found under the docker folder, please execute the command below from the root of this repo (the same folder as this README.md file):
statgen-setup build --tutorial igv
Or, for multiple tutorials:
statgen-setup build --tutorial igv vat pseq regression annovar
You can use the option --tag to add a version tag to a build, e.g., --tag 1.0.0.
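As a rough sketch of how the tutorial list and the version tag might be combined, the hypothetical helper below only prints a build command and does not run it. Placing --tag after the tutorial names is an assumption about how statgen-setup parses its arguments, so check statgen-setup -h before relying on it.

```shell
#!/bin/sh
# Hypothetical dry-run helper, not part of statgen-setup.
# Usage: build_cmd [-t TAG] TUTORIAL...
# Prints (does not execute) the corresponding build command.
build_cmd() {
  tag=""
  if [ "$1" = "-t" ]; then
    tag="$2"
    shift 2
  fi
  cmd="statgen-setup build --tutorial $*"
  # Assumption: --tag may follow the list of tutorial names.
  if [ -n "$tag" ]; then
    cmd="$cmd --tag $tag"
  fi
  printf '%s\n' "$cmd"
}

build_cmd -t 1.0.0 igv vat
# prints: statgen-setup build --tutorial igv vat --tag 1.0.0
```

Printing the command first makes it easy to review what would be built and pushed before actually running it.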
If you run into the error denied: requested access to the resource is denied, please make sure you have push access to the dockerhub account statisticalgenetics. Please contact Gao Wang for the password to that account and use the docker login command to log in from your terminal. Then try the build again.
It is possible to further customize the docker image when it is started from the JupyterLab environment, to download the latest version of the tutorial notes and deploy small data sets in the JupyterLab server launched from the docker image. To configure this, please study this example (which is self-explanatory, so I will not elaborate on it here).
If your tutorial comes with a large data set, using a setup script is not recommended. Instead, you can still install the pull-tutorial.sh script and instruct users to type the command get-data from the JupyterLab terminal when they first log in to the server. See this example for details.
To set it up for selected tutorial(s) on your local computer, for example for the vat and pseq tutorials:
statgen-setup launch --tutorial vat pseq
After all steps are complete, you can check the JupyterHub server on your machine:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4888c67e9774 vat_hub_user "tini -g -- jupyterh…" 8 seconds ago Up 7 seconds 8888/tcp, 0.0.0.0:8847->8000/tcp vat_hub_user
The 0.0.0.0:8847 is the address of the server (your port number may vary). To view it, simply paste that address into your browser.
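With several tutorials running, picking the port out of the docker ps output by eye gets tedious. The sketch below uses a hypothetical helper, hub_urls, that reads docker ps output and prints one URL per line. It assumes the hub listens on container port 8000 as in the sample output above, and it uses localhost because 0.0.0.0 means the port is bound on all interfaces of the host.

```shell
#!/bin/sh
# Hypothetical helper, not part of statgen-setup.
# Reads `docker ps` output on stdin and prints a browser URL for each
# line that maps a host port to container port 8000.
hub_urls() {
  sed -n 's|.*0\.0\.0\.0:\([0-9][0-9]*\)->8000/tcp.*|http://localhost:\1|p'
}

# Example using the sample line (COMMAND and CREATED columns omitted):
echo '4888c67e9774 vat_hub_user Up 7 seconds 8888/tcp, 0.0.0.0:8847->8000/tcp vat_hub_user' | hub_urls
# prints: http://localhost:8847
```

In practice this would be used as docker ps | hub_urls. Lines without such a port mapping, including the header, are skipped.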
The server may take a while to start, depending on whether it runs a setup script, which could take some time. Files used for the tutorials should have been deployed once the server is ready. If not (and you see an empty file list in the left panel of the JupyterLab interface), users should be instructed to manually transfer files to the proper folders for use with JupyterLab. More details are documented here.
Say we purchase a Debian-based VM droplet from a VPS service provider (e.g., vultr.com); Debian 9 is what I use as I document this. In the root terminal of the VM:
curl -fsSL https://raw.githubusercontent.com/statgenetics/statgen-courses/master/src/vm-setup.sh -o /tmp/vm-setup.sh
bash /tmp/vm-setup.sh
I provide a shortcut to create new users on the cloud:
statgen-setup useradd --my-name student --num-users 10
It will generate a password for the user, add the user, and print the new user ID and password.
For maintenance, to shut down all containers and clean up the dangling ones:
statgen-setup clean
This command is only available to the root user. For administrators, I suggest you run statgen-setup clean from time to time to maintain the server.
This will terminate past tutorials that are still running, in order to free up resources for new tutorials.
Otherwise, with too many tutorial containers running at the same time, the VM may run out of memory.
That is, generate https://statgenetics.github.io/statgen-courses/notebooks.html.
To do this you need sos installed on your local computer. If you don't have it already:
pip install sos -U
To generate the website,
./release
To publish the website, simply add the contents of the docs/ folder to the github repository and push to github.
The commands below provide a rough conversion from a docx file (doc files will not work!) to a Notebook file:
pandoc -s exercise.docx -t markdown -o exercise.md
notedown exercise.md > exercise.ipynb
The notedown program can be installed via pip install notedown. Additionally, you need to install the pandoc program.
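When there are several docx files to convert, the same two commands can be generated for each file. The sketch below uses a hypothetical helper, docx_to_ipynb_cmds, that only prints the commands so they can be reviewed first. It assumes file names without quotes or other shell-special characters.

```shell
#!/bin/sh
# Hypothetical helper: print the pandoc and notedown commands for each
# .docx file given as an argument. Nothing is executed here.
docx_to_ipynb_cmds() {
  for f in "$@"; do
    base="${f%.docx}"   # strip the .docx extension
    printf 'pandoc -s "%s" -t markdown -o "%s.md"\n' "$f" "$base"
    printf 'notedown "%s.md" > "%s.ipynb"\n' "$base" "$base"
  done
}

docx_to_ipynb_cmds exercise.docx
# prints:
#   pandoc -s "exercise.docx" -t markdown -o "exercise.md"
#   notedown "exercise.md" > "exercise.ipynb"
```

Once the printed commands look right, piping them to sh (for example docx_to_ipynb_cmds *.docx | sh) would run them, provided pandoc and notedown are installed.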
The conversion is just a starting point. Manual polishing is still needed after the automatic conversion. Specifically, it is important to separate code from text into different Notebook cells, and to assign each cell the appropriate kernel if using the SoS multi-language Notebook. Command output should also be removed from the text because it will be generated automatically, and formatted better, after executing the notebook.