Commit

Deployed 4be8afb with MkDocs version: 1.6.1
Unknown committed Sep 11, 2024
0 parents commit beb10a2
Showing 545 changed files with 1,618,711 additions and 0 deletions.
Empty file added .nojekyll
Empty file.
6,666 changes: 6,666 additions & 0 deletions 404.html

Large diffs are not rendered by default.

7,123 changes: 7,123 additions & 0 deletions account-project-management/accounts-and-access/alcf-passcode-tokens/index.html

Large diffs are not rendered by default.

6,842 changes: 6,842 additions & 0 deletions account-project-management/accounts-and-access/user-account-overview/index.html

Large diffs are not rendered by default.

6,912 changes: 6,912 additions & 0 deletions account-project-management/allocation-management/overview/index.html

Large diffs are not rendered by default.

6,972 changes: 6,972 additions & 0 deletions account-project-management/project-management/project-reports/index.html

Large diffs are not rendered by default.

7,272 changes: 7,272 additions & 0 deletions account-project-management/project-management/starting-alcf-award/index.html

Large diffs are not rendered by default.

6,831 changes: 6,831 additions & 0 deletions account-project-management/project-management/team-management/index.html

Large diffs are not rendered by default.

6,840 changes: 6,840 additions & 0 deletions ai-testbed/cerebras/customizing-environment/index.html

Large diffs are not rendered by default.

7,150 changes: 7,150 additions & 0 deletions ai-testbed/cerebras/example-programs/index.html

Large diffs are not rendered by default.

Binary file added ai-testbed/cerebras/files/Trust_ctl.png
Binary file added ai-testbed/cerebras/files/compile-vs-run.png
Binary file added ai-testbed/cerebras/files/cs-getting-started.png
Binary file added ai-testbed/cerebras/files/grafana_ctl.png
6,792 changes: 6,792 additions & 0 deletions ai-testbed/cerebras/getting-started/index.html

Large diffs are not rendered by default.

6,766 changes: 6,766 additions & 0 deletions ai-testbed/cerebras/job-queuing-and-submission/index.html

Large diffs are not rendered by default.

6,885 changes: 6,885 additions & 0 deletions ai-testbed/cerebras/miscellaneous/index.html

Large diffs are not rendered by default.

7,008 changes: 7,008 additions & 0 deletions ai-testbed/cerebras/running-a-model-or-program/index.html

Large diffs are not rendered by default.

6,736 changes: 6,736 additions & 0 deletions ai-testbed/cerebras/system-overview/index.html

Large diffs are not rendered by default.

6,716 changes: 6,716 additions & 0 deletions ai-testbed/cerebras/tunneling-and-forwarding-ports/index.html

Large diffs are not rendered by default.

6,855 changes: 6,855 additions & 0 deletions ai-testbed/data-management/data-management-overview/index.html

Large diffs are not rendered by default.

84 changes: 84 additions & 0 deletions ai-testbed/files/dictionary.txt
@@ -0,0 +1,84 @@
aitestbed
ALCFUserID
analyser
ANL
arnoldw
AUTOTUNE
Cerebras
conv
cosmictagger
cpus
cuda
cudart
DATADIR
dlerror
dlopen
elif
finetune
flos
gbps
GEMM
Graphcore
graphcore_login
gres
inet
inplace
jsons
kaggle
keras
keygen
keyscan
layernorm
lenet
libcudart
libnvinfer
LOGDIR
logreg
mgmt
mnist
MNIST
modelzoo
nodelist
ntasks
OUTDIR
passcode
petaFLOPS
POPART
POPLIBS
poptorch
popvision
pretrain
pretraining
PYTHONPATH
relu
RELU
resnet
run_unet_256_256_single_4
SambaFlow
sambanova
sbatch
scancel
Slurm
snconfig
snpath
snthreads
sntilestat
snvenv
softmax
squeue
SRAM
srun
tensorrt
tf2tensorrt
TFLOPs
unet
UNet
unet
unet_compile_run_all
Venkat
venv
venvs
vipu
virtualenv
wilsonb
XRDU
64 changes: 64 additions & 0 deletions ai-testbed/files/example-multi-node-programs.sh
@@ -0,0 +1,64 @@
#! /bin/bash -x
set -e
#
# Usage: ./unet_all.sh 256 256
#
SECONDS=0

# Image size.
IM=${1}
# Batch Size
BS=${2}
NUM_WORKERS=1
export OMP_NUM_THREADS=16

source /opt/sambaflow/venv/bin/activate
UNET=$(pwd)/unet

echo "Model: UNET"
echo "Date: " $(date +%m/%d/%y)
echo "Time: " $(date +%H:%M)

echo "COMPILE"

# Compile for parallel RDUs
if [ ! -e out/unet_train_${BS}_${IM}_NN/unet_train_${BS}_${IM}_NN.pef ] ; then
python ${UNET}/unet.py compile -b ${BS} --in-channels=3 --in-width=${IM} --in-height=${IM} --enable-conv-tiling --mac-v2 --compiler-configs-file ${UNET}/jsons/compiler_configs/unet_compiler_configs_no_inst.json --pef-name="unet_train_${BS}_${IM}_NN" --data-parallel -ws 2 > compile_${BS}_${IM}_NN.log 2>&1
fi

# Run Multi-Node, Data Parallel
NN=2
echo "RUN"
echo "NN=${NN}"
sbatch --gres=rdu:1 --tasks-per-node 8 --nodes 2 --nodelist sm-02,sm-01 --cpus-per-task=16 ./unet_batch.sh ${NN} ${NUM_WORKERS}
echo "Duration: " $SECONDS

#! /bin/bash -x
set -e
#
# Usage: ./unet_batch.sh 2 1
#
SECONDS=0

# Batch Size
BS=256

# Image size
IM=256
NN=${1}
NUM_WORKERS=${2}
export OMP_NUM_THREADS=16
DATADIR=/software/sambanova/dataset/kaggle_3m
UNET=$(pwd)/unet
export SAMBA_CCL_USE_PCIE_TRANSPORT=0

# TODO: Update this.
source /opt/sambaflow/venv/bin/activate

echo "Model: UNET_TRAIN"
echo "Date: " $(date +%m/%d/%y)
echo "Time: " $(date +%H:%M)

srun --mpi=pmi2 python ${UNET}/unet_hook.py run --do-train --in-channels=3 --in-width=${IM} --in-height=${IM} --init-features 32 --batch-size=${BS} --epochs 2 --data-dir ${DATADIR} --log-dir log_dir_unet_${NN}_train_kaggle --pef=$(pwd)/out/unet_train_${BS}_${IM}_NN/unet_train_${BS}_${IM}_NN.pef --data-parallel --reduce-on-rdu --num-workers=${NUM_WORKERS}

echo "Duration: " $SECONDS
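
Note on the two scripts above: the compile script names its PEF from the image size and batch size passed on the command line, while the batch script hard-codes `BS=256` and `IM=256`. The run step therefore finds the compiled PEF only when the compile step was also invoked with `256 256`. The following is a minimal sketch, not part of the committed file, of how the shared path could be derived and checked in one place. The function names `pef_path`, `is_positive_int`, and `check_pef` are hypothetical and introduced only for illustration; the path layout is copied from the scripts above.

```shell
#!/bin/bash
# Hypothetical helper (not in the committed file): build the PEF path using
# the same naming scheme as the compile and run scripts in this diff.
pef_path() {
    local bs="$1" im="$2"
    local name="unet_train_${bs}_${im}_NN"
    printf 'out/%s/%s.pef\n' "${name}" "${name}"
}

# Hypothetical check: accept only positive integers (no sign, no leading zero).
is_positive_int() {
    [[ "$1" =~ ^[1-9][0-9]*$ ]]
}

# Validate the arguments, then report whether the PEF already exists.
# Argument order (image size, then batch size) matches the compile script.
check_pef() {
    local im="$1" bs="$2"
    if ! is_positive_int "${im}" || ! is_positive_int "${bs}"; then
        echo "usage: check_pef IMAGE_SIZE BATCH_SIZE" >&2
        return 2
    fi
    local pef
    pef="$(pef_path "${bs}" "${im}")"
    if [ -e "${pef}" ]; then
        echo "found ${pef}"
    else
        echo "missing ${pef}"
    fi
}
```

Calling `check_pef 256 256` from the directory that holds `out/` reports whether the run step would find its PEF. Passing the same two values to both scripts, instead of hard-coding them in the batch script, would keep the compile and run steps in agreement.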
Binary file added ai-testbed/files/home-cerebras-sambanova.png