Using this Hugging Face model, calculate the following:
Vary the x axis as sequence length, ranging from 20 to 2000. Choose your own interval so that the runs fit within your time bounds.
On the y axis, please compute:
- Time taken to generate response
- length of tokens generated
- tokens per second
Note: For the y axis please average over 5 generations.
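The measurement described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: `generate_fn` is a hypothetical wrapper around whatever model loader you choose, `fake_generate` is a stand-in so the sketch runs without downloading weights, and the step of 220 tokens is an arbitrary choice.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(generate_fn: Callable[[int], int], seq_len: int, runs: int = 5) -> Dict[str, float]:
    """Average latency, generated-token count, and tokens/sec over `runs` generations.

    `generate_fn(seq_len)` is a hypothetical wrapper: it should run one
    generation for an input of `seq_len` tokens and return the number of
    newly generated tokens.
    """
    times: List[float] = []
    counts: List[int] = []
    for _ in range(runs):
        start = time.perf_counter()
        n_new = generate_fn(seq_len)
        times.append(time.perf_counter() - start)
        counts.append(n_new)
    return {
        "seq_len": seq_len,
        "time_s": statistics.mean(times),
        "tokens_generated": statistics.mean(counts),
        # Total tokens over total time, so longer runs are weighted fairly.
        "tokens_per_s": sum(counts) / sum(times),
    }


def fake_generate(seq_len: int) -> int:
    """Stand-in for a real model call (assumption: replace with your own)."""
    time.sleep(0.001)
    return 16


# Sweep 20..2000 tokens in steps of 220 (arbitrary step; pick one that fits your time budget).
seq_lengths = list(range(20, 2001, 220))
results = [benchmark(fake_generate, n) for n in seq_lengths]
```

With a real model, `generate_fn` would tokenize a prompt of the requested length, call the model, and return the count of new tokens only, excluding the prompt tokens.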
You can choose your own input prompts and store them in the input_data directory as an Excel, CSV, or TXT file. Please also store the outputs of these prompts in the input_data directory as an Excel, CSV, or TXT file.
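One possible way to handle the prompt and output files, assuming the CSV option is chosen. The column names `prompt` and `output` and any file names you pass in are hypothetical; the assignment does not fix them.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_prompts(path: Path) -> List[str]:
    """Read prompts from a CSV file with a `prompt` column (assumed layout)."""
    with path.open(newline="", encoding="utf-8") as f:
        return [row["prompt"] for row in csv.DictReader(f)]


def save_outputs(path: Path, rows: List[Dict[str, str]]) -> None:
    """Write prompt/output pairs to a CSV file, creating the directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["prompt", "output"])
        writer.writeheader()
        writer.writerows(rows)
```

For example, `save_outputs(Path("input_data") / "outputs.csv", rows)` would write the generated text next to the prompts, where `outputs.csv` is an illustrative file name.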
Feel free to load the model in any format, such as Hugging Face AutoModel, Accelerate, TGI, TensorRT, vLLM, or any loader of your choice.
- Clone this repo
- Maintain the directory structure given for the project
- Use Python 3.7+
- If you need additional imports, specify them in requirements.txt
- The model can either be loaded with Hugging Face AutoModel, which auto-downloads to the Hugging Face cache folder, or downloaded to disk and loaded from a path. In the latter case, create a model artifact and save it under /models
- Save your plots to /plot_data
You can see examples of input data in the HellaSwag dataset.
Timebox this challenge to 2-4 hours. After completing the assignment, please compress the whole repo and send it.