backend: Extend the models config YAML file
Extend the models config YAML file to include model-specific
information.

Add model-specific details, such as the number of parameters, memory
footprint, and tensor type (e.g., FP32 or BF16), along with a set of
default values used by Lumigator when calling the models.

Not all parameters are applicable to each model. For example, Hugging
Face models have default values for parameters like `max_length`, while
API models include parameters like `temperature`. In general, parameters
are model-specific, so there is no universal set of parameter names.

The defaults are inferred either by checking the documentation for each
model or by examining the configuration file on the Hugging Face Hub.
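
To illustrate how these entries might be consumed, here is a minimal
sketch, assuming a PyYAML-based loader and a hypothetical
`build_call_params` helper (neither is Lumigator's actual code), that
reads the file and overlays call-time overrides on top of a model's
`default_parameters`:

import yaml  # PyYAML

def load_models_config(path="models_config.yaml"):
    # Parse the YAML list and index the entries by model name.
    with open(path) as f:
        models = yaml.safe_load(f)
    return {model["name"]: model for model in models}

def build_call_params(model, overrides=None):
    # Start from the model's default_parameters (absent for some
    # models) and let caller-supplied values win.
    params = dict(model.get("default_parameters", {}))
    params.update(overrides or {})
    return params

models = load_models_config()
bart = models["facebook/bart-large-cnn"]
# Keep BART's defaults but ask for longer summaries.
params = build_call_params(bart, {"max_length": 256})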

Closes #381

Signed-off-by: Dimitris Poulopoulos <[email protected]>
dpoulopoulos committed Nov 20, 2024
1 parent d954a27 commit 8dba904
Showing 1 changed file with 61 additions and 0 deletions.
lumigator/python/mzai/backend/backend/models_config.yaml
@@ -1,35 +1,96 @@
 - name: facebook/bart-large-cnn
   uri: hf://facebook/bart-large-cnn
   description: BART is a large-sized model fine-tuned on the CNN Daily Mail dataset.
+  info:
+    parameters_count: 406M
+    tensor_type: F32
+    model_size: 1.63GB
+  default_parameters:
+    max_length: 142
+    min_length: 56
+    length_penalty: 2.0
+    early_stopping: true
+    no_repeat_ngram_size: 3
+    num_beams: 4

 - name: mikeadimech/longformer-qmsum-meeting-summarization
   uri: hf://mikeadimech/longformer-qmsum-meeting-summarization
   description: Longformer is a transformer model that is capable of processing long sequences.
+  info:
+    parameters_count: 162M
+    tensor_type: F32
+    model_size: 648MB

 - name: mrm8488/t5-base-finetuned-summarize-news
   uri: hf://mrm8488/t5-base-finetuned-summarize-news
   description: Google's T5 base fine-tuned on News Summary dataset for summarization downstream task.
+  info:
+    parameters_count: 223M
+    tensor_type: F32
+    model_size: 892MB
+  default_parameters:
+    max_length: 200
+    min_length: 30
+    length_penalty: 2.0
+    early_stopping: true
+    no_repeat_ngram_size: 3
+    num_beams: 4

 - name: Falconsai/text_summarization
   uri: hf://Falconsai/text_summarization
   description: A fine-tuned variant of the T5 transformer model, designed for the task of text summarization.
+  info:
+    parameters_count: 60.5M
+    tensor_type: F32
+    model_size: 242MB
+  default_parameters:
+    max_length: 200
+    min_length: 30
+    length_penalty: 2.0
+    early_stopping: true
+    no_repeat_ngram_size: 3
+    num_beams: 4

 - name: mistralai/Mistral-7B-Instruct-v0.3
   uri: hf://mistralai/Mistral-7B-Instruct-v0.3
   description: Mistral-7B-Instruct-v0.3 is an instruct fine-tuned version of the Mistral-7B-v0.3.
+  info:
+    parameters_count: 7.25B
+    tensor_type: BF16
+    model_size: 14.5GB

 - name: gpt-4o-mini
   uri: oai://gpt-4o-mini
   description: OpenAI's GPT-4o-mini model.
+  default_parameters:
+    temperature: 1.0
+    top_p: 1.0
+    max_completion_tokens: 200

 - name: gpt-4-turbo
   uri: oai://gpt-4-turbo
   description: OpenAI's GPT-4 Turbo model.
+  default_parameters:
+    temperature: 1.0
+    top_p: 1.0
+    max_completion_tokens: 200

 - name: open-mistral-7b
   uri: mistral://open-mistral-7b
   description: Mistral's 7B model.
+  default_parameters:
+    temperature: 0.7
+    top_p: 1.0
+    max_completion_tokens: 200

 - name: mistralai/Mistral-7B-Instruct-v0.2
   uri: llamafile://mistralai/Mistral-7B-Instruct-v0.2
   description: A llamafile package of Mistral's 7B Instruct model.
+  info:
+    parameters_count: 7.24B
+    tensor_type: BF16
+    model_size: 14.5GB
+  default_parameters:
+    temperature: 0.7
+    top_p: 1.0
+    max_completion_tokens: 200
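
Note that the `uri` scheme (`hf://`, `oai://`, `mistral://`,
`llamafile://`) encodes where each model comes from. Below is a small
sketch of how such URIs could be routed; the scheme-to-backend mapping
is illustrative, not Lumigator's actual dispatch logic:

from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to a backend label.
SCHEME_TO_BACKEND = {
    "hf": "huggingface",
    "oai": "openai",
    "mistral": "mistral_api",
    "llamafile": "llamafile",
}

def resolve_backend(uri: str) -> tuple[str, str]:
    # Split e.g. 'hf://facebook/bart-large-cnn' into a backend label
    # and the model path the backend should load or call.
    parsed = urlparse(uri)
    return SCHEME_TO_BACKEND[parsed.scheme], parsed.netloc + parsed.path

print(resolve_backend("hf://facebook/bart-large-cnn"))  # ('huggingface', 'facebook/bart-large-cnn')
print(resolve_backend("oai://gpt-4o-mini"))             # ('openai', 'gpt-4o-mini')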
