
339 support seq2seq models #477

Merged
merged 10 commits into from Dec 11, 2024
Conversation

dpoulopoulos
Contributor

@dpoulopoulos dpoulopoulos commented Dec 6, 2024

What's changing

Introduces the HuggingFaceModelClient to support models pulled from Hugging Face Hub.

The client expects a model and a task, and instantiates the corresponding pipeline. For example, if the task is "summarization", the client instantiates the following pipeline:

from transformers import pipeline

pipe = pipeline("summarization", model="...", **kwargs)

During pipeline instantiation we consider several factors, such as the dtype of the model weights and the device to use during inference (i.e., cpu, cuda, or mps).
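As a rough illustration, the device-selection logic could look like the following pure-Python sketch. The function name and signature are hypothetical; the real client would query `torch.cuda.is_available()` and `torch.backends.mps.is_available()` instead of receiving boolean flags:

```python
def resolve_device(accelerator: str, cuda_available: bool, mps_available: bool) -> str:
    """Map the requested accelerator ("auto", "cpu", "cuda", or "mps")
    to a concrete device string.

    Hypothetical sketch: the actual client checks hardware availability
    via torch rather than taking it as arguments.
    """
    if accelerator == "auto":
        if cuda_available:
            return "cuda"
        if mps_available:
            return "mps"
        return "cpu"
    if accelerator not in ("cpu", "cuda", "mps"):
        raise ValueError(f"unsupported accelerator: {accelerator}")
    return accelerator
```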

Closes #339

How to test it

Steps to test the changes:

  1. Run make local-up to deploy Lumigator locally.

  2. Upload a dataset using the OpenAPI docs page (localhost:8000/docs).

  3. Create a new inference job with the following body:

    {
      "name": "summarization",
      "model": "hf://facebook/bart-large-cnn",
      "dataset": "...",
      "max_samples": 1,
      "task": "summarization",
      "accelerator": "auto",
      "revision": "main",
      "use_fast": true,
      "trust_remote_code": false,
      "torch_dtype": "auto"
    }

    For translation tasks, provide the following body:

    {
      "name": "translation",
      "model": "hf://google-t5/t5-base",
      "dataset": "...",
      "max_samples": 1,
      "task": "translation",
      "accelerator": "auto",
      "revision": "main",
      "use_fast": true,
      "trust_remote_code": false,
      "torch_dtype": "auto"
    }
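The two request bodies above differ only in a few fields, so they can be built programmatically before being POSTed to the inference endpoint shown in the OpenAPI docs. The helper below is a hypothetical sketch (not part of the PR); the field names come from the example bodies:

```python
import json

def inference_job_body(task: str, model: str, dataset_id: str, max_samples: int = 1) -> str:
    """Build the JSON body for an inference job request.

    Hypothetical helper mirroring the example bodies above; the defaults
    match the values shown in the PR description.
    """
    body = {
        "name": task,
        "model": model,
        "dataset": dataset_id,
        "max_samples": max_samples,
        "task": task,
        "accelerator": "auto",
        "revision": "main",
        "use_fast": True,
        "trust_remote_code": False,
        "torch_dtype": "auto",
    }
    return json.dumps(body)
```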

Additional notes for reviewers

In theory, this implementation can support arbitrary tasks, but we are limited by the way we load the dataset. For example, "question-answering" tasks need the dataset to be in the following format:

qa_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}

We treat this as a known limitation.
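For reference, bridging that gap would mean reshaping each flat dataset row into the expected dict, along these lines (a hypothetical helper; the current loader does not do this, which is the limitation noted above):

```python
def to_qa_input(row: dict) -> dict:
    """Reshape a flat dataset row into the dict format that HF
    question-answering pipelines expect.

    Hypothetical; not implemented in this PR.
    """
    return {"question": row["question"], "context": row["context"]}
```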

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality
  • Updated the documentation (both comments in code and product documentation under /docs)
  • Checked if a (backend) DB migration step was required and included it if required

@dpoulopoulos dpoulopoulos self-assigned this Dec 6, 2024
@github-actions github-actions bot added backend schemas Changes to schemas (which may be public facing) labels Dec 6, 2024
@dpoulopoulos dpoulopoulos added enhancement New feature or request and removed backend schemas Changes to schemas (which may be public facing) labels Dec 6, 2024
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch from 07e7a7f to 598f2bd on December 9, 2024 11:13
@github-actions github-actions bot added backend schemas Changes to schemas (which may be public facing) labels Dec 9, 2024
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch 3 times, most recently from 1226c40 to 8615cd6 on December 9, 2024 11:20
@github-actions github-actions bot added the sdk label Dec 9, 2024
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch from 1fff7cb to e4cfa5c on December 9, 2024 12:39
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 9, 2024
@dpoulopoulos dpoulopoulos marked this pull request as ready for review December 10, 2024 09:29
Member

@aittalam aittalam left a comment


Thank you Dimitris! There are a couple of unresolved convos here but I do not think they are blocking, so I am going to pre-approve this and let you close them.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Upgrade the openai client due to incompatibility with `httpx==0.28.0`.

Refs #466

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Rename the `eval_config_args` variable to `job_config_args` to make it
more generic.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Support models pulled from Hugging Face Hub, through the
`HuggingFaceModelClient`.

The client expects a model name (i.e., the model repo ID on HF Hub) and
a task (e.g., "summarization"), creates the corresponding pipeline, and
uses it for inference.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Extend the set of parameters that one can pass to create a new inference
job:

* revision: choose the model version (i.e., branch, tag, or commit ID)
* use_fast: whether or not to use a fast tokenizer, if possible
* torch_dtype: model precision (e.g., float16, float32, "auto")
* accelerator: device to use during inference (i.e., "cpu", "cuda", or
  "mps")

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
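Sketched as a schema, the new parameters and their defaults from the example request bodies might look like the following simplified dataclass (illustrative only; the actual schema in `lumigator_schemas` may be structured differently):

```python
from dataclasses import dataclass

@dataclass
class InferenceJobParams:
    """New inference-job parameters introduced by this commit,
    with the defaults used in the example request bodies above."""
    revision: str = "main"       # branch, tag, or commit ID on HF Hub
    use_fast: bool = True        # prefer a fast tokenizer when possible
    torch_dtype: str = "auto"    # model precision, e.g. "float16", "float32", "auto"
    accelerator: str = "auto"    # "cpu", "cuda", "mps", or "auto"
```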
Validate the values of the parameters used in HF pipelines:

* Check if the model name is a valid HF repo ID
* Check if the task is a supported task
* Check if the data type is a valid `torch.dtype` value

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
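A minimal sketch of those three checks is shown below. The regex and the task/dtype sets are approximations for illustration; the real code may rely on `huggingface_hub` and `torch` helpers, and this sketch assumes the `hf://` prefix has already been stripped from the model name:

```python
import re

# Rough approximation of a "namespace/name" Hugging Face repo ID.
_REPO_ID_RE = re.compile(r"^[\w.\-]+/[\w.\-]+$")
# Illustrative sets; the real lists would come from transformers/torch.
_SUPPORTED_TASKS = {"summarization", "translation"}
_VALID_DTYPES = {"auto", "float16", "bfloat16", "float32"}

def validate_pipeline_params(model: str, task: str, torch_dtype: str) -> None:
    """Raise ValueError if any HF pipeline parameter is invalid."""
    if not _REPO_ID_RE.match(model):
        raise ValueError(f"invalid HF repo ID: {model!r}")
    if task not in _SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    if torch_dtype not in _VALID_DTYPES:
        raise ValueError(f"invalid torch_dtype: {torch_dtype!r}")
```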
Fix the failing unit tests and extend them to cover inference with
Hugging Face models.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Support creating distinct inference jobs using the SDK.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Use `lumigator_schemas` and `JobEvalCreate` in the quickstart guide
instead of just `schemas` and `JobCreate`.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Add a new user guide to demonstrate how to create and run an inference
job, using the Lumigator SDK and a model from Hugging Face Hub.

Closes #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
@dpoulopoulos dpoulopoulos force-pushed the 339-support-seq2seq-models branch from 9ba27c2 to ffe1451 on December 11, 2024 11:21
@dpoulopoulos dpoulopoulos merged commit 30c5ba6 into main Dec 11, 2024
9 checks passed
@dpoulopoulos dpoulopoulos deleted the 339-support-seq2seq-models branch December 11, 2024 12:15
Labels

  • backend
  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • schemas: Changes to schemas (which may be public facing)
  • sdk
Development

Successfully merging this pull request may close these issues.

Add support for seq2seq models with transformers or vLLM as-a-library
3 participants