
339 support seq2seq models #477

Merged
merged 10 commits into from Dec 11, 2024
Conversation

dpoulopoulos
Contributor

@dpoulopoulos dpoulopoulos commented Dec 6, 2024

What's changing

Introduces the HuggingFaceModelClient to support models pulled from Hugging Face Hub.

The client expects a model and a task, and instantiates the corresponding pipeline. For example, if the task is "summarization", the client instantiates the following pipeline:

from transformers import pipeline

pipe = pipeline("summarization", model="...", **kwargs)

During pipeline instantiation we consider several factors, such as the dtype of the model weights and the device to use during inference (i.e., cpu, cuda, or mps).
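As a rough illustration, the device-selection logic could look like the following pure-Python sketch. The function name and signature are hypothetical; the real client would query `torch.cuda.is_available()` and `torch.backends.mps.is_available()` instead of receiving boolean flags:

```python
def resolve_device(accelerator: str, cuda_available: bool, mps_available: bool) -> str:
    """Map the requested accelerator ("auto", "cpu", "cuda", or "mps")
    to a concrete device string.

    Hypothetical sketch: the actual client checks hardware availability
    via torch rather than taking it as arguments.
    """
    if accelerator == "auto":
        if cuda_available:
            return "cuda"
        if mps_available:
            return "mps"
        return "cpu"
    if accelerator not in ("cpu", "cuda", "mps"):
        raise ValueError(f"unsupported accelerator: {accelerator}")
    return accelerator
```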

Closes #339

How to test it

Steps to test the changes:

  1. Run make local-up to deploy Lumigator locally.

  2. Upload a dataset using the OpenAPI docs page (localhost:8000/docs).

  3. Create a new inference job with the following body:

    {
      "name": "summarization",
      "model": "hf://facebook/bart-large-cnn",
      "dataset": "...",
      "max_samples": 1,
      "task": "summarization",
      "accelerator": "auto",
      "revision": "main",
      "use_fast": true,
      "trust_remote_code": false,
      "torch_dtype": "auto"
    }

    For translation tasks, provide the following body:

    {
      "name": "translation",
      "model": "hf://google-t5/t5-base",
      "dataset": "...",
      "max_samples": 1,
      "task": "translation",
      "accelerator": "auto",
      "revision": "main",
      "use_fast": true,
      "trust_remote_code": false,
      "torch_dtype": "auto"
    }
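The two request bodies above differ only in a few fields, so they can be built programmatically before being POSTed to the inference endpoint shown in the OpenAPI docs. The helper below is a hypothetical sketch (not part of the PR); the field names come from the example bodies:

```python
import json

def inference_job_body(task: str, model: str, dataset_id: str, max_samples: int = 1) -> str:
    """Build the JSON body for an inference job request.

    Hypothetical helper mirroring the example bodies above; the defaults
    match the values shown in the PR description.
    """
    body = {
        "name": task,
        "model": model,
        "dataset": dataset_id,
        "max_samples": max_samples,
        "task": task,
        "accelerator": "auto",
        "revision": "main",
        "use_fast": True,
        "trust_remote_code": False,
        "torch_dtype": "auto",
    }
    return json.dumps(body)
```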

Additional notes for reviewers

In theory, this implementation can support arbitrary tasks, but we are limited by the way we load the dataset. For example, "question-answering" tasks need the dataset to be in the following format:

qa_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}

We treat this as a known limitation.
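For reference, bridging that gap would mean reshaping each flat dataset row into the expected dict, along these lines (a hypothetical helper; the current loader does not do this, which is the limitation noted above):

```python
def to_qa_input(row: dict) -> dict:
    """Reshape a flat dataset row into the dict format that HF
    question-answering pipelines expect.

    Hypothetical; not implemented in this PR.
    """
    return {"question": row["question"], "context": row["context"]}
```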

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality
  • Updated the documentation (both comments in code and product documentation under /docs)
  • Checked if a (backend) DB migration step was required and included it if required

@dpoulopoulos dpoulopoulos self-assigned this Dec 6, 2024
@github-actions github-actions bot added backend schemas Changes to schemas (which may be public facing) labels Dec 6, 2024
@dpoulopoulos dpoulopoulos added enhancement New feature or request and removed backend schemas Changes to schemas (which may be public facing) labels Dec 6, 2024
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch from 07e7a7f to 598f2bd on December 9, 2024 11:13
@github-actions github-actions bot added backend schemas Changes to schemas (which may be public facing) labels Dec 9, 2024
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch 3 times, most recently from 1226c40 to 8615cd6 on December 9, 2024 11:20
@github-actions github-actions bot added the sdk label Dec 9, 2024
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch from 1fff7cb to e4cfa5c on December 9, 2024 12:39
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 9, 2024
@dpoulopoulos dpoulopoulos marked this pull request as ready for review December 10, 2024 09:29
Member

@aittalam aittalam left a comment


Thank you Dimitris! There are a couple of unresolved convos here but I do not think they are blocking, so I am going to pre-approve this and let you close them.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Upgrade the openai client due to incompatibility with `httpx==0.28.0`.

Refs #466

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Rename the `eval_config_args` variable to `job_config_args` to make it
more generic.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Support models pulled from Hugging Face Hub, through the
`HuggingFaceModelClient`.

The client expects a model name (i.e., the model repo ID on HF Hub) and
a task (e.g., "summarization"), creates the corresponding pipeline, and
uses it for inference.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Extend the set of parameters that one can pass to create a new inference
job:

* revision: choose the model version (i.e., branch, tag, or commit ID)
* use_fast: whether or not to use a fast tokenizer, if possible
* torch_dtype: model precision (e.g., float16, float32, "auto")
* accelerator: device to use during inference (i.e., "cpu", "cuda", or
  "mps")

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
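Sketched as a schema, the new parameters and their defaults from the example request bodies might look like the following simplified dataclass (illustrative only; the actual schema in `lumigator_schemas` may be structured differently):

```python
from dataclasses import dataclass

@dataclass
class InferenceJobParams:
    """New inference-job parameters introduced by this commit,
    with the defaults used in the example request bodies above."""
    revision: str = "main"       # branch, tag, or commit ID on HF Hub
    use_fast: bool = True        # prefer a fast tokenizer when possible
    torch_dtype: str = "auto"    # model precision, e.g. "float16", "float32", "auto"
    accelerator: str = "auto"    # "cpu", "cuda", "mps", or "auto"
```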
Validate the values of the parameters used in HF pipelines:

* Check if the model name is a valid HF repo ID
* Check if the task is a supported task
* Check if the data type is a valid `torch.dtype` value

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
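A minimal sketch of those three checks is shown below. The regex and the task/dtype sets are approximations for illustration; the real code may rely on `huggingface_hub` and `torch` helpers, and this sketch assumes the `hf://` prefix has already been stripped from the model name:

```python
import re

# Rough approximation of a "namespace/name" Hugging Face repo ID.
_REPO_ID_RE = re.compile(r"^[\w.\-]+/[\w.\-]+$")
# Illustrative sets; the real lists would come from transformers/torch.
_SUPPORTED_TASKS = {"summarization", "translation"}
_VALID_DTYPES = {"auto", "float16", "bfloat16", "float32"}

def validate_pipeline_params(model: str, task: str, torch_dtype: str) -> None:
    """Raise ValueError if any HF pipeline parameter is invalid."""
    if not _REPO_ID_RE.match(model):
        raise ValueError(f"invalid HF repo ID: {model!r}")
    if task not in _SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    if torch_dtype not in _VALID_DTYPES:
        raise ValueError(f"invalid torch_dtype: {torch_dtype!r}")
```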
Fix the failing unit tests and extend them to cover inference with
Hugging Face models.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Support creating distinct inference jobs using the SDK.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Use `lumigator_schemas` and `JobEvalCreate` in the quickstart guide
instead of just `schemas` and `JobCreate`.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Add a new user guide to demonstrate how to create and run an inference
job, using the Lumigator SDK and a model from Hugging Face Hub.

Closes #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch from 9ba27c2 to ffe1451 on December 11, 2024 11:21
@dpoulopoulos dpoulopoulos merged commit 30c5ba6 into main Dec 11, 2024
9 checks passed
@dpoulopoulos dpoulopoulos deleted the 339-support-seq2seq-models branch December 11, 2024 12:15
Labels

  • backend
  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • schemas: Changes to schemas (which may be public facing)
  • sdk
Development

Successfully merging this pull request may close these issues.

Add support for seq2seq models with transformers or vLLM as-a-library
3 participants