Commit

Deployed 15697ac with MkDocs version: 1.6.1
Unknown committed Nov 5, 2024
0 parents commit 87d53c0
Showing 571 changed files with 1,770,258 additions and 0 deletions.
Empty file added .nojekyll
Empty file.
6,919 changes: 6,919 additions & 0 deletions 404.html

Large diffs are not rendered by default.

7,376 changes: 7,376 additions & 0 deletions account-project-management/accounts-and-access/alcf-passcode-tokens/index.html

Large diffs are not rendered by default.

7,095 changes: 7,095 additions & 0 deletions account-project-management/accounts-and-access/user-account-overview/index.html

Large diffs are not rendered by default.

7,165 changes: 7,165 additions & 0 deletions account-project-management/allocation-management/overview/index.html

Large diffs are not rendered by default.

6,943 changes: 6,943 additions & 0 deletions account-project-management/index.html

Large diffs are not rendered by default.

Binary file not shown.
7,225 changes: 7,225 additions & 0 deletions account-project-management/project-management/project-reports/index.html

Large diffs are not rendered by default.

7,525 changes: 7,525 additions & 0 deletions account-project-management/project-management/starting-alcf-award/index.html

Large diffs are not rendered by default.

7,084 changes: 7,084 additions & 0 deletions account-project-management/project-management/team-management/index.html

Large diffs are not rendered by default.

7,093 changes: 7,093 additions & 0 deletions ai-testbed/cerebras/customizing-environment/index.html

Large diffs are not rendered by default.

7,403 changes: 7,403 additions & 0 deletions ai-testbed/cerebras/example-programs/index.html

Large diffs are not rendered by default.

Binary file added ai-testbed/cerebras/files/Trust_ctl.png
Binary file added ai-testbed/cerebras/files/compile-vs-run.png
Binary file added ai-testbed/cerebras/files/cs-getting-started.png
Binary file added ai-testbed/cerebras/files/grafana_ctl.png
7,045 changes: 7,045 additions & 0 deletions ai-testbed/cerebras/getting-started/index.html

Large diffs are not rendered by default.

7,019 changes: 7,019 additions & 0 deletions ai-testbed/cerebras/job-queuing-and-submission/index.html

Large diffs are not rendered by default.

7,138 changes: 7,138 additions & 0 deletions ai-testbed/cerebras/miscellaneous/index.html

Large diffs are not rendered by default.

7,261 changes: 7,261 additions & 0 deletions ai-testbed/cerebras/running-a-model-or-program/index.html

Large diffs are not rendered by default.

6,989 changes: 6,989 additions & 0 deletions ai-testbed/cerebras/system-overview/index.html

Large diffs are not rendered by default.

6,969 changes: 6,969 additions & 0 deletions ai-testbed/cerebras/tunneling-and-forwarding-ports/index.html

Large diffs are not rendered by default.

7,108 changes: 7,108 additions & 0 deletions ai-testbed/data-management/data-management-overview/index.html

Large diffs are not rendered by default.

84 changes: 84 additions & 0 deletions ai-testbed/files/dictionary.txt
@@ -0,0 +1,84 @@
aitestbed
ALCFUserID
analyser
ANL
arnoldw
AUTOTUNE
Cerebras
conv
cosmictagger
cpus
cuda
cudart
DATADIR
dlerror
dlopen
elif
finetune
flos
gbps
GEMM
Graphcore
graphcore_login
gres
inet
inplace
jsons
kaggle
keras
keygen
keyscan
layernorm
lenet
libcudart
libnvinfer
LOGDIR
logreg
mgmt
mnist
MNIST
modelzoo
nodelist
ntasks
OUTDIR
passcode
petaFLOPS
POPART
POPLIBS
poptorch
popvision
pretrain
pretraining
PYTHONPATH
relu
RELU
resnet
run_unet_256_256_single_4
SambaFlow
sambanova
sbatch
scancel
Slurm
snconfig
snpath
snthreads
sntilestat
snvenv
softmax
squeue
SRAM
srun
tensorrt
tf2tensorrt
TFLOPs
unet
UNet
unet
unet_compile_run_all
Venkat
venv
venvs
vipu
virtualenv
wilsonb
XRDU
64 changes: 64 additions & 0 deletions ai-testbed/files/example-multi-node-programs.sh
@@ -0,0 +1,64 @@
#! /bin/bash -x
set -e
#
# Usage: ./unet_all.sh <image_size> <batch_size>   (e.g. ./unet_all.sh 256 256)
#
SECONDS=0

# Image size.
IM=${1}
# Batch Size
BS=${2}
NUM_WORKERS=1
export OMP_NUM_THREADS=16

source /opt/sambaflow/venv/bin/activate
UNET=$(pwd)/unet

echo "Model: UNET"
echo "Date: " $(date +%m/%d/%y)
echo "Time: " $(date +%H:%M)

echo "COMPILE"

# Compile for parallel RDUs
if [ ! -e out/unet_train_${BS}_${IM}_NN/unet_train_${BS}_${IM}_NN.pef ] ; then
python ${UNET}/unet.py compile -b ${BS} --in-channels=3 --in-width=${IM} --in-height=${IM} --enable-conv-tiling --mac-v2 --compiler-configs-file ${UNET}/jsons/compiler_configs/unet_compiler_configs_no_inst.json --pef-name="unet_train_${BS}_${IM}_NN" --data-parallel -ws 2 > compile_${BS}_${IM}_NN.log 2>&1
fi

# Run Multi-Node, Data Parallel
NN=2
echo "RUN"
echo "NN=${NN}"
sbatch --gres=rdu:1 --tasks-per-node 8 --nodes 2 --nodelist sm-02,sm-01 --cpus-per-task=16 ./unet_batch.sh ${NN} ${NUM_WORKERS}
echo "Duration: " $SECONDS

#! /bin/bash -x
set -e
#
# Usage: ./unet_batch.sh 2 1
#
SECONDS=0

# Batch Size
BS=256

# Image size
IM=256
NN=${1}
NUM_WORKERS=${2}
export OMP_NUM_THREADS=16
DATADIR=/software/sambanova/dataset/kaggle_3m
UNET=$(pwd)/unet
export SAMBA_CCL_USE_PCIE_TRANSPORT=0

# TODO: Update this.
source /opt/sambaflow/venv/bin/activate

echo "Model: UNET_TRAIN"
echo "Date: " $(date +%m/%d/%y)
echo "Time: " $(date +%H:%M)

srun --mpi=pmi2 python ${UNET}/unet_hook.py run --do-train --in-channels=3 --in-width=${IM} --in-height=${IM} --init-features 32 --batch-size=${BS} --epochs 2 --data-dir ${DATADIR} --log-dir log_dir_unet_${NN}_train_kaggle --pef=$(pwd)/out/unet_train_${BS}_${IM}_NN/unet_train_${BS}_${IM}_NN.pef --data-parallel --reduce-on-rdu --num-workers=${NUM_WORKERS}

echo "Duration: " $SECONDS
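
Note on the two scripts above: the launcher takes the image size and batch size as arguments, while the batch script hardcodes both to 256, so the PEF path only matches when the launcher is also called with 256 256. A minimal sketch of one way to keep the two in step is shown below. The helper names `pef_path` and `is_positive_int` are hypothetical and are not part of this commit; only the path pattern is taken from the scripts.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the commit. They only build and check
# values; nothing here calls the SambaFlow compiler or Slurm.

# Build the PEF path from batch size ($1) and image size ($2), mirroring the
# pattern used above: out/unet_train_<BS>_<IM>_NN/unet_train_<BS>_<IM>_NN.pef
pef_path() {
    printf 'out/unet_train_%s_%s_NN/unet_train_%s_%s_NN.pef\n' "$1" "$2" "$1" "$2"
}

# Succeed only for a positive integer without leading zeros.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example values; in the launcher these come from ${1} (IM) and ${2} (BS).
IM=256
BS=256

if is_positive_int "$IM" && is_positive_int "$BS"; then
    echo "PEF: $(pef_path "$BS" "$IM")"
else
    echo "image size and batch size must be positive integers" >&2
fi
```

If both scripts sourced a shared helper like this, the batch script could take the image size and batch size as extra arguments instead of hardcoding them. This is a suggestion under the assumptions stated above, not a description of how the repository is organized.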
Binary file added ai-testbed/files/home-cerebras-sambanova.png