- Time: October 30, 2024, 11:30 - 12:30
- Topic: LLM Inference Optimizations
- Location: TCS 1416
- Slack channel: `track-2` (use it to post questions, exact error messages, etc.)
- Get an interactive node:
```bash
qsub -I -l select=1:ngpus=4 -l filesystems=home:eagle:grand -l walltime=1:00:00 -q HandsOnHPC -A alcf_training
```
- Clone the repo and activate the environment:
```bash
git clone https://github.com/argonne-lcf/ALCF_Hands_on_HPC_Workshop.git
cd ALCF_Hands_on_HPC_Workshop/InferenceOptimizations
module use /soft/modulefiles
module load conda/2024-10-30-workshop
conda activate
```
We will use the Llama3-8B model for the hands-on inference examples.
- Inference with Hugging Face
```bash
bash run_HF.sh
```
This script runs `run_HF.py` with the correct command-line flags; a minimal sketch of the pattern follows.
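For orientation, here is a minimal sketch of baseline Hugging Face generation, assuming `run_HF.py` follows the standard `transformers` pattern; the model ID and generation settings below are illustrative, not the script's actual flags.

```python
# Minimal Hugging Face inference sketch (illustrative; not the exact run_HF.py).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory
    device_map="auto",           # place layers on the available GPU(s)
)

prompt = "The key to fast LLM inference is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```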
- Inference with vLLM
```bash
bash run_vllm.sh
```
This script runs `run_vllm.py` with the correct command-line flags; a sketch of the offline vLLM API follows.
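A minimal sketch of offline vLLM inference, assuming `run_vllm.py` uses the standard `LLM` entry point; the checkpoint and sampling settings are illustrative.

```python
# Minimal vLLM inference sketch (illustrative; not the exact run_vllm.py).
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B")  # assumed checkpoint name
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# vLLM batches prompts and manages the KV cache (PagedAttention) internally.
outputs = llm.generate(["The key to fast LLM inference is"], sampling)
for out in outputs:
    print(out.outputs[0].text)
```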
- vLLM Quantization Example
```bash
bash run_vllm_quant.sh
```
This script runs `run_vllm.py` with the correct command-line flags; see the quantization sketch below.
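A hedged sketch of loading a quantized model in vLLM. The quantization method (FP8 here) and the pre-quantized checkpoint name are assumptions; the actual script may use a different scheme such as AWQ or GPTQ.

```python
# vLLM quantized-inference sketch (illustrative; the real script may use a
# different quantization method and checkpoint).
from vllm import LLM, SamplingParams

llm = LLM(
    model="neuralmagic/Meta-Llama-3-8B-Instruct-FP8",  # assumed pre-quantized checkpoint
    quantization="fp8",  # tell vLLM which quantization scheme the weights use
)
sampling = SamplingParams(max_tokens=64)
print(llm.generate(["Quantization reduces memory traffic by"], sampling)[0].outputs[0].text)
```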
- vLLM Speculative Decoding (SD) Example
```bash
bash run_vllm_SD.sh
```
This script runs `run_vllm.py` with the correct command-line flags; a speculative-decoding sketch follows.
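A hedged sketch of speculative decoding with a small draft model, using the `speculative_model` arguments available in vLLM releases from around this workshop (the API changed in later versions); both model choices and the token count are assumptions.

```python
# vLLM speculative-decoding sketch (illustrative; flags in run_vllm.py may differ).
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B",           # target model (assumed)
    speculative_model="meta-llama/Llama-3.2-1B",  # small draft model (assumed)
    num_speculative_tokens=5,  # tokens the draft proposes before the target verifies them
)
sampling = SamplingParams(temperature=0.0, max_tokens=64)
print(llm.generate(["Speculative decoding accelerates generation by"], sampling)[0].outputs[0].text)
```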
Contributors: Krishna Teja Chitty-Venkata and Siddhisanket (Sid) Raskar.
This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.