diff --git a/docs/ai-testbed/cerebras/customizing-environment.md b/docs/ai-testbed/cerebras/customizing-environment.md
index c4ff54c30..89eaf2103 100644
--- a/docs/ai-testbed/cerebras/customizing-environment.md
+++ b/docs/ai-testbed/cerebras/customizing-environment.md
@@ -7,16 +7,16 @@
 ```console
 #Make your home directory navigable
 chmod a+xr ~/
-mkdir ~/R_2.0.3
-chmod a+x ~/R_2.0.3/
-cd ~/R_2.0.3
+mkdir ~/R_2.1.1
+chmod a+x ~/R_2.1.1/
+cd ~/R_2.1.1
 # Note: "deactivate" does not actually work in scripts.
 deactivate
 rm -r venv_cerebras_pt
 /software/cerebras/python3.8/bin/python3.8 -m venv venv_cerebras_pt
 source venv_cerebras_pt/bin/activate
 pip install --upgrade pip
-pip install cerebras_pytorch==2.0.2
+pip install cerebras_pytorch==2.1.1
 ```
+
 ```console
-source ~/R_2.0.3/venv_cerebras_pt/bin/activate
-pip install -r ~/R_2.0.3/modelzoo/requirements.txt
+source ~/R_2.1.1/venv_cerebras_pt/bin/activate
+pip install -r ~/R_2.1.1/modelzoo/requirements.txt
 ```
 Then
 ```console
-cd ~/R_2.0.3/modelzoo/modelzoo/transformers/pytorch/bert
+cd ~/R_2.1.1/modelzoo/modelzoo/transformers/pytorch/bert
 cp /software/cerebras/dataset/bert_large/bert_large_MSL128_sampleds.yaml configs/bert_large_MSL128_sampleds.yaml
 export MODEL_DIR=model_dir_bert_large_pytorch
 if [ -d "$MODEL_DIR" ]; then rm -Rf $MODEL_DIR; fi
-python run.py CSX --job_labels name=bert_pt --params configs/bert_large_MSL128_sampleds.yaml --num_workers_per_csx=1 --mode train --model_dir $MODEL_DIR --mount_dirs /home/ /software/ --python_paths /home/$(whoami)/R_2.0.3/modelzoo/ --compile_dir $(whoami) |& tee mytest.log
+python run.py CSX --job_labels name=bert_pt --params configs/bert_large_MSL128_sampleds.yaml --num_workers_per_csx=1 --mode train --model_dir $MODEL_DIR --mount_dirs /home/ /software/ --python_paths /home/$(whoami)/R_2.1.1/modelzoo/ --compile_dir $(whoami) |& tee mytest.log
 ```
-Note: the vocabulary file referenced in `/software/cerebras/dataset/bert_large/bert_large_MSL128_sampleds.yaml` is the same as the one at `/home/$(whoami)/R_2.0.3/modelzoo/modelzoo/transformers/vocab/google_research_uncased_L-12_H-768_A-12.txt`.
+Note: the vocabulary file referenced in `/software/cerebras/dataset/bert_large/bert_large_MSL128_sampleds.yaml` is the same as the one at `/home/$(whoami)/R_2.1.1/modelzoo/modelzoo/transformers/vocab/google_research_uncased_L-12_H-768_A-12.txt`.
 The last parts of the output should resemble the following, with messages about cuda that should be ignored and are not shown.
@@ -102,7 +104,7 @@ The last parts of the output should resemble the following, with messages about
 2023-11-29 20:13:25,691 INFO: Training completed successfully!
 2023-11-29 20:13:25,691 INFO: Processed 1024000 sample(s) in 336.373620536 seconds.
 ```
-
+
\ No newline at end of file
diff --git a/docs/ai-testbed/cerebras/running-a-model-or-program.md b/docs/ai-testbed/cerebras/running-a-model-or-program.md
index 8a3454428..69962817f 100644
--- a/docs/ai-testbed/cerebras/running-a-model-or-program.md
+++ b/docs/ai-testbed/cerebras/running-a-model-or-program.md
@@ -25,29 +25,29 @@ Follow these instructions to compile and train the `fc_mnist` PyTorch sample. Th
 First, make a virtual environment for Cerebras for PyTorch. See [Customizing Environments](./customizing-environment.md) for the procedures for making PyTorch virtual environments for Cerebras.
-If an environment is made in ```~/R_2.0.3/```, it they would be activated as follows:
+If an environment is made in ```~/R_2.1.1/```, it would be activated as follows:
 ```console
-source ~/R_2.0.3/venv_cerebras_pt/bin/activate
+source ~/R_2.1.1/venv_cerebras_pt/bin/activate
 ```
 ### Clone the Cerebras modelzoo
 ```console
-mkdir ~/R_2.0.3
-cd ~/R_2.0.3
+mkdir ~/R_2.1.1
+cd ~/R_2.1.1
 git clone https://github.com/Cerebras/modelzoo.git
 cd modelzoo
 git tag
-git checkout Release_2.0.3
+git checkout Release_2.1.1
 ```
 ## Running a Pytorch sample
 ### Activate your PyTorch virtual environment, install modelzoo requirements, and change to the working directory
 ```console
-source ~/R_2.0.3/venv_cerebras_pt/bin/activate
-pip install -r ~/R_2.0.3/modelzoo/requirements.txt
-cd ~/R_2.0.3/modelzoo/modelzoo/fc_mnist/pytorch
+source ~/R_2.1.1/venv_cerebras_pt/bin/activate
+pip install -r ~/R_2.1.1/modelzoo/requirements.txt
+cd ~/R_2.1.1/modelzoo/modelzoo/fc_mnist/pytorch
 ```
 Next, edit configs/params.yaml, making the following changes:
@@ -76,7 +76,7 @@ To run the sample:
 export MODEL_DIR=model_dir
 # deletion of the model_dir is only needed if sample has been previously run
 if [ -d "$MODEL_DIR" ]; then rm -Rf $MODEL_DIR; fi
-python run.py CSX --job_labels name=pt_smoketest --params configs/params.yaml --num_csx=1 --mode train --model_dir $MODEL_DIR --mount_dirs /home/ /software --python_paths /home/$(whoami)/R_2.0.3/modelzoo --compile_dir /$(whoami) |& tee mytest.log
+python run.py CSX --job_labels name=pt_smoketest --params configs/params.yaml --num_csx=1 --mode train --model_dir $MODEL_DIR --mount_dirs /home/ /software --python_paths /home/$(whoami)/R_2.1.1/modelzoo --compile_dir /$(whoami) |& tee mytest.log
 ```
 A successful fc_mnist PyTorch training run should finish with output resembling the following: