Commit

Deployed 2afacf8 with MkDocs version: 1.5.3
Unknown committed Feb 12, 2024
0 parents commit b34d044
Showing 678 changed files with 2,094,756 additions and 0 deletions.
Empty file added .nojekyll
Empty file.
6,604 changes: 6,604 additions & 0 deletions 404.html

Large diffs are not rendered by default.

7,062 changes: 7,062 additions & 0 deletions account-project-management/accounts-and-access/alcf-passcode-tokens/index.html

Large diffs are not rendered by default.

6,778 changes: 6,778 additions & 0 deletions account-project-management/accounts-and-access/user-account-overview/index.html

Large diffs are not rendered by default.

6,852 changes: 6,852 additions & 0 deletions account-project-management/allocation-management/overview/index.html

Large diffs are not rendered by default.

6,908 changes: 6,908 additions & 0 deletions account-project-management/project-management/project-reports/index.html

Large diffs are not rendered by default.

7,209 changes: 7,209 additions & 0 deletions account-project-management/project-management/starting-alcf-award/index.html

Large diffs are not rendered by default.

6,769 changes: 6,769 additions & 0 deletions account-project-management/project-management/team-management/index.html

Large diffs are not rendered by default.

6,779 changes: 6,779 additions & 0 deletions ai-testbed/cerebras/customizing-environment/index.html

Large diffs are not rendered by default.

6,864 changes: 6,864 additions & 0 deletions ai-testbed/cerebras/example-programs/index.html

Large diffs are not rendered by default.

Binary file added ai-testbed/cerebras/files/Trust_ctl.png
Binary file added ai-testbed/cerebras/files/compile-vs-run.png
Binary file added ai-testbed/cerebras/files/cs-getting-started.png
Binary file added ai-testbed/cerebras/files/grafana_ctl.png
6,728 changes: 6,728 additions & 0 deletions ai-testbed/cerebras/getting-started/index.html

Large diffs are not rendered by default.

6,702 changes: 6,702 additions & 0 deletions ai-testbed/cerebras/job-queuing-and-submission/index.html

Large diffs are not rendered by default.

6,821 changes: 6,821 additions & 0 deletions ai-testbed/cerebras/miscellaneous/index.html

Large diffs are not rendered by default.

6,944 changes: 6,944 additions & 0 deletions ai-testbed/cerebras/running-a-model-or-program/index.html

Large diffs are not rendered by default.

6,672 changes: 6,672 additions & 0 deletions ai-testbed/cerebras/system-overview/index.html

Large diffs are not rendered by default.

6,652 changes: 6,652 additions & 0 deletions ai-testbed/cerebras/tunneling-and-forwarding-ports/index.html

Large diffs are not rendered by default.

6,791 changes: 6,791 additions & 0 deletions ai-testbed/data-management/data-management-overview/index.html

Large diffs are not rendered by default.

84 changes: 84 additions & 0 deletions ai-testbed/files/dictionary.txt
@@ -0,0 +1,84 @@
aitestbed
ALCFUserID
analyser
ANL
arnoldw
AUTOTUNE
Cerebras
conv
cosmictagger
cpus
cuda
cudart
DATADIR
dlerror
dlopen
elif
finetune
flos
gbps
GEMM
Graphcore
graphcore_login
gres
inet
inplace
jsons
kaggle
keras
keygen
keyscan
layernorm
lenet
libcudart
libnvinfer
LOGDIR
logreg
mgmt
mnist
MNIST
modelzoo
nodelist
ntasks
OUTDIR
passcode
petaFLOPS
POPART
POPLIBS
poptorch
popvision
pretrain
pretraining
PYTHONPATH
relu
RELU
resnet
run_unet_256_256_single_4
SambaFlow
sambanova
sbatch
scancel
Slurm
snconfig
snpath
snthreads
sntilestat
snvenv
softmax
squeue
SRAM
srun
tensorrt
tf2tensorrt
TFLOPs
unet
UNet
unet
unet_compile_run_all
Venkat
venv
venvs
vipu
virtualenv
wilsonb
XRDU
64 changes: 64 additions & 0 deletions ai-testbed/files/example-multi-node-programs.sh
@@ -0,0 +1,64 @@
#! /bin/bash -x
set -e
#
# Usage: ./unet_all.sh 256 256
#
SECONDS=0

# Image size
IM=${1}
# Batch size
BS=${2}
NUM_WORKERS=1
export OMP_NUM_THREADS=16

source /opt/sambaflow/venv/bin/activate
UNET=$(pwd)/unet

echo "Model: UNET"
echo "Date: " $(date +%m/%d/%y)
echo "Time: " $(date +%H:%M)

echo "COMPILE"

# Compile for parallel RDUs
if [ ! -e out/unet_train_${BS}_${IM}_NN/unet_train_${BS}_${IM}_NN.pef ] ; then
python ${UNET}/unet.py compile -b ${BS} --in-channels=3 --in-width=${IM} --in-height=${IM} --enable-conv-tiling --mac-v2 --compiler-configs-file ${UNET}/jsons/compiler_configs/unet_compiler_configs_no_inst.json --pef-name="unet_train_${BS}_${IM}_NN" --data-parallel -ws 2 > compile_${BS}_${IM}_NN.log 2>&1
fi

# Run Multi-Node, Data Parallel
NN=2
echo "RUN"
echo "NN=${NN}"
sbatch --gres=rdu:1 --tasks-per-node 8 --nodes 2 --nodelist sm-02,sm-01 --cpus-per-task=16 ./unet_batch.sh ${NN} ${NUM_WORKERS}
echo "Duration: " $SECONDS

#! /bin/bash -x
set -e
#
# Usage: ./unet_batch.sh 2 1
#
SECONDS=0

# Batch Size
BS=256

# Image size
IM=256
NN=${1}
NUM_WORKERS=${2}
export OMP_NUM_THREADS=16
DATADIR=/software/sambanova/dataset/kaggle_3m
UNET=$(pwd)/unet
export SAMBA_CCL_USE_PCIE_TRANSPORT=0

# TODO: Update this.
source /opt/sambaflow/venv/bin/activate

echo "Model: UNET_TRAIN"
echo "Date: " $(date +%m/%d/%y)
echo "Time: " $(date +%H:%M)

srun --mpi=pmi2 python ${UNET}/unet_hook.py run --do-train --in-channels=3 --in-width=${IM} --in-height=${IM} --init-features 32 --batch-size=${BS} --epochs 2 --data-dir ${DATADIR} --log-dir log_dir_unet_${NN}_train_kaggle --pef=$(pwd)/out/unet_train_${BS}_${IM}_NN/unet_train_${BS}_${IM}_NN.pef --data-parallel --reduce-on-rdu --num-workers=${NUM_WORKERS}

echo "Duration: " $SECONDS
Binary file added ai-testbed/files/home-cerebras-sambanova.png