339 support seq2seq models (#477)
* chore: Ignore VS Code debug config files

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* inference: Upgrade openai client

Upgrade the openai client due to an incompatibility with `httpx==0.28.0`.

Refs #466

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* backend: Rename job config args variable

Rename the `eval_config_args` variable to `job_config_args` to make it
more generic.

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* inference: Support Hugging Face models

Support models pulled from Hugging Face Hub, through the
`HuggingFaceModelClient`.

The client expects a model name (i.e., the model repo ID on HF Hub) and
a task (e.g., "summarization"), creates the corresponding pipeline, and
uses it for inference.
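
A minimal sketch of what such a client could look like (the class name and behavior come
from this commit; the exact constructor and method signatures here are assumptions for
illustration):

```python
from transformers import pipeline


class HuggingFaceModelClient:
    # Sketch: wrap a Hugging Face pipeline behind a simple predict() interface.

    def __init__(self, model_name: str, task: str):
        # model_name is the repo ID on the HF Hub, e.g. "facebook/bart-large-cnn"
        self._pipeline = pipeline(task=task, model=model_name)

    def predict(self, prompt: str) -> str:
        # A summarization pipeline returns a list like [{"summary_text": "..."}]
        return self._pipeline(prompt)[0]["summary_text"]
```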

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* inference: Extend inference parameter set

Extend the set of parameters that one can pass to create a new inference
job (an illustrative example follows the list):

* revision: choose the model version (i.e., branch, tag, or commit ID)
* use_fast: whether or not to use a fast tokenizer, if possible
* torch_dtype: model precision (e.g., float16, float32, "auto")
* accelerator: device to use during inference (i.e., "cpu", "cuda", or
  "mps")

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* inference: Validate Hugging Face inference params

Validate the values of the parameters used in HF pipelines (a rough sketch follows the list):

* Check if the model name is a valid HF repo ID
* Check if the task is a supported task
* Check if the data type is a valid `torch.dtype` value
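
A rough sketch of these checks (the helper name and exact rules here are illustrative,
not the actual validators):

```python
import re

import torch

SUPPORTED_TASKS = {"summarization", "text-generation", "translation"}


def validate_hf_params(model_name: str, task: str, torch_dtype: str) -> None:
    # HF repo IDs are "name" or "org/name" with a restricted character set
    if not re.fullmatch(r"[\w.\-]+(/[\w.\-]+)?", model_name):
        raise ValueError(f"Invalid HF repo ID: {model_name}")
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"Unsupported task: {task}")
    # accept "auto", or any torch attribute that is a dtype (e.g. "float16")
    if torch_dtype != "auto" and not isinstance(getattr(torch, torch_dtype, None), torch.dtype):
        raise ValueError(f"Invalid torch dtype: {torch_dtype}")
```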

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* inference: Amend and extend unit tests

Fix the failing unit tests and extend them to cover inference with
Hugging Face models.

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* sdk: Create inference jobs

Support creating distinct inference jobs using the SDK.
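
In SDK terms this boils down to calls like the following (`lm_client` is a
`LumigatorClient` instance; the full flow is shown in the inference user guide added by
this commit):

```python
from lumigator_schemas import jobs

job = lm_client.jobs.create_job(type=jobs.JobType.INFERENCE, request=job_args)
lm_client.jobs.wait_for_job(job.id, poll_wait=10)
```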

Refs #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* doc: Fix the quickstart guide

Use `lumigator_schemas` and `JobEvalCreate` in the quickstart guide
instead of just `schemas` and `JobCreate`.

Signed-off-by: Dimitris Poulopoulos <[email protected]>

* doc: Add inference user guide

Add a new user guide to demonstrate how to create and run an inference
job, using the Lumigator SDK and a model from Hugging Face Hub.

Closes #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>

---------

Signed-off-by: Dimitris Poulopoulos <[email protected]>
dpoulopoulos authored Dec 11, 2024
1 parent 99ce0bf commit 30c5ba6
Showing 20 changed files with 416 additions and 39 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -168,6 +168,9 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

+# VS Code
+.vscode/launch.json

# Ruff
.ruff_cache

12 changes: 5 additions & 7 deletions docs/source/get-started/quickstart.md
@@ -46,7 +46,7 @@ user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/datasets/ \
:sync: tab2
```python
from lumigator_sdk.lumigator import LumigatorClient
-from schemas.datasets import DatasetFormat
+from lumigator_schemas.datasets import DatasetFormat

dataset_path = 'path/to/dataset.csv'
lm_client = LumigatorClient('localhost:8000')
@@ -84,7 +84,7 @@ dataset.csv
:sync: tab2
```python
datasets = lm_client.datasets.get_datasets()
-print(datasets.items[0].filename)
+print(datasets.items[-1].filename)
```
:::

@@ -151,9 +151,9 @@ user@host:~/lumigator$ curl -s http://localhost:8000/api/v1/jobs/evaluate/ \
:::{tab-item} Python SDK
:sync: tab2
```python
-from schemas.jobs import JobType, JobCreate
+from lumigator_schemas.jobs import JobType, JobEvalCreate

-dataset_id = datasets.items[0].id
+dataset_id = datasets.items[-1].id

models = ['hf://facebook/bart-large-cnn',]

@@ -164,7 +164,7 @@ team_name = "lumigator_enthusiasts"

responses = []
for model in models:
-    job_args = JobCreate(
+    job_args = JobEvalCreate(
name=team_name,
description="Test",
model=model,
@@ -219,8 +219,6 @@ job_id = responses[0].id

job = lm_client.jobs.wait_for_job(job_id) # Create the coroutine object
result = await job # Await the coroutine to get the result

-print(result)
```
:::

11 changes: 5 additions & 6 deletions docs/source/index.rst
@@ -31,7 +31,7 @@ Hugging Face and local stores or accessed through APIs. It consists of:
- A database to track platform-level lifecycle, job, and dataset metadata.

.. toctree::
-   :maxdepth: 2
+   :maxdepth: 1
:caption: Get Started

get-started/installation
@@ -46,12 +46,11 @@ Hugging Face and local stores or accessed through APIs. It consists of:
operations-guide/alembic
operations-guide/dev

-.. TODO: Add user-guides and examples here.
-.. .. toctree::
-..    :maxdepth: 2
-..    :caption: User Guides
+.. toctree::
+   :maxdepth: 2
+   :caption: User Guides

-..    user-guides/evaluation
+   user-guides/inference

.. toctree::
:maxdepth: 2
1 change: 0 additions & 1 deletion docs/source/user-guides/evaluation.md

This file was deleted.

110 changes: 110 additions & 0 deletions docs/source/user-guides/inference.md
@@ -0,0 +1,110 @@
# Running an Inference Job

This guide will walk you through the process of running an inference job using the Lumigator SDK and
a model downloaded from the Hugging Face Hub. The model will generate summaries for a given set of
text data.

```{note}
You can also use the OpenAI GPT family of models or the Mistral API to run an inference job. To do
so, you need to set the appropriate environment variables: `OPENAI_API_KEY` or `MISTRAL_API_KEY`.
Refer to the `.env.example` file in the repository for more details.
```
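
For example, assuming you want to use an OpenAI model:

```console
user@host:~/lumigator$ export OPENAI_API_KEY=<your-api-key>
```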

## What You'll Need

- A running instance of [Lumigator](../get-started/installation.md).

## Procedure

1. Install the Lumigator SDK:

```console
user@host:~/lumigator$ uv pip install -e lumigator/python/mzai/sdk
```

1. Create a new Python file:

```console
user@host:~/lumigator$ touch inference.py
```

1. Add the following code to `inference.py`:

```python
import json
import requests

from lumigator_sdk.lumigator import LumigatorClient
from lumigator_schemas import jobs, datasets


BUCKET = "lumigator-storage"
HOST = "localhost"
LUMIGATOR_PORT = 8000
RAY_PORT = 4566


# Instantiate the Lumigator client
lm_client = LumigatorClient(f"{HOST}:{LUMIGATOR_PORT}")

# Upload a dataset
dataset_path = "lumigator/python/mzai/sample_data/dialogsum_exc.csv"
dataset = lm_client.datasets.create_dataset(
    dataset=open(dataset_path, 'rb'),
    format=datasets.DatasetFormat.JOB
)

# Create and submit an inference job
name = "bart-summarization-run"
model = "hf://facebook/bart-large-cnn"
task = "summarization"

job_args = jobs.JobInferenceCreate(
    name=name,
    model=model,
    dataset=dataset.id,
    task=task,
)

job = lm_client.jobs.create_job(
    type=jobs.JobType.INFERENCE, request=job_args)

# Wait for the job to complete
lm_client.jobs.wait_for_job(job.id, poll_wait=10)

# Retrieve the job results
url = f"http://{HOST}:{RAY_PORT}/{BUCKET}/jobs/results/{name}/{job.id}/inference_results.json"
response = requests.get(url=url)

if response.status_code != 200:
    raise Exception(f"Failed to retrieve job results: {response.text}")

results = response.json()

# Write the JSON results to a file
with open("inference_results.json", "w") as f:
    json.dump(results, f, indent=4)
```

1. Run the script:

```console
user@host:~/lumigator$ uv run python inference.py
```

## Verify

Review the contents of the `inference_results.json` file to ensure that the inference job ran
successfully:

```console
user@host:~/lumigator$ cat inference_results.json | jq
{
"prediction": [
"A man has trouble breathing. He is sent to see a pulmonary specialist. The doctor tests him for asthma. He does not have any allergies that he knows of. He also has a heavy feeling in his chest when he tries to breathe. This happens a lot when he works out, he says.",
...
```
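
You can also count the generated predictions directly (the `prediction` field matches the
output above):

```console
user@host:~/lumigator$ jq '.prediction | length' inference_results.json
```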

## Next Steps

Congratulations! You have successfully run an inference job using the Lumigator SDK. You can now
use the results to evaluate your model's performance.
20 changes: 19 additions & 1 deletion lumigator/python/mzai/backend/backend/config_templates.py
@@ -66,6 +66,24 @@

# Inference templates

+default_infer_template = """{{
+    "name": "{job_name}/{job_id}",
+    "dataset": {{ "path": "{dataset_path}" }},
+    "hf_pipeline": {{
+        "model_path": "{model_path}",
+        "task": "{task}",
+        "accelerator": "{accelerator}",
+        "revision": "{revision}",
+        "use_fast": "{use_fast}",
+        "trust_remote_code": "{trust_remote_code}",
+        "torch_dtype": "{torch_dtype}"
+    }},
+    "job": {{
+        "max_samples": {max_samples},
+        "storage_path": "{storage_path}"
+    }}
+}}"""

seq2seq_infer_template = """{{
"name": "{job_name}/{job_id}",
"model": {{ "path": "{model_path}" }},
@@ -111,7 +129,7 @@

templates = {
JobType.INFERENCE: {
"default": causal_infer_template,
"default": default_infer_template,
"oai://gpt-4o-mini": oai_infer_template,
"oai://gpt-4-turbo": oai_infer_template,
"oai://gpt-3.5-turbo-0125": oai_infer_template,
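Because `default_infer_template` is rendered with `str.format` and must produce valid
JSON, a quick sanity check might look like this (all values below are illustrative):

```python
import json

config = default_infer_template.format(
    job_name="bart-summarization-run",
    job_id="1234",
    dataset_path="s3://lumigator-storage/datasets/1234/dialogsum_exc.csv",
    model_path="facebook/bart-large-cnn",
    task="summarization",
    accelerator="cuda",
    revision="main",
    use_fast="True",
    trust_remote_code="False",
    torch_dtype="auto",
    max_samples=10,
    storage_path="s3://lumigator-storage/jobs/results",
)
assert json.loads(config)  # the rendered template parses cleanly into a dict
```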
10 changes: 8 additions & 2 deletions lumigator/python/mzai/backend/backend/services/jobs.py
@@ -143,6 +143,12 @@ def _get_job_params(self, job_type: str, record, request: BaseModel) -> dict:
"job_name": request.name,
"model_path": request.model,
"dataset_path": dataset_s3_path,
"task": request.task,
"accelerator": request.accelerator,
"revision": request.revision,
"use_fast": request.use_fast,
"trust_remote_code": request.trust_remote_code,
"torch_dtype": request.torch_dtype,
"max_samples": request.max_samples,
"storage_path": self.storage_path,
"model_url": model_url,
@@ -182,7 +188,7 @@ def create_job(self, request: JobEvalCreate | JobInferenceCreate) -> JobResponse
# command parameters provided via command line to the ray job.
# To do this, we use a dict where keys are parameter names as they'd
# appear on the command line and the values are the respective params.
-eval_config_args = {
+job_config_args = {
"--config": config_template.format(**config_params),
}

@@ -195,7 +201,7 @@ def create_job(self, request: JobEvalCreate | JobInferenceCreate) -> JobResponse
job_id=record.id,
job_type=job_type,
command=job_settings["command"],
-args=eval_config_args,
+args=job_config_args,
)

# build runtime ENV for workers
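To make the dict-to-command-line convention concrete, here is a hypothetical sketch of
how such a dict could be flattened into a job entrypoint (not the actual submission
code):

```python
import shlex

job_config_args = {"--config": '{"name": "demo/1234"}'}

# Flatten {"--flag": value} pairs into a single command string
entrypoint = "python inference.py " + " ".join(
    f"{flag} {shlex.quote(value)}" for flag, value in job_config_args.items()
)
print(entrypoint)  # python inference.py --config '{"name": "demo/1234"}'
```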
11 changes: 11 additions & 0 deletions lumigator/python/mzai/jobs/inference/inference.py
@@ -11,9 +11,11 @@
from loguru import logger
from model_clients import (
BaseModelClient,
+HuggingFaceModelClient,
MistralModelClient,
OpenAIModelClient,
)
+from paths import PathPrefix
from tqdm import tqdm


@@ -101,6 +103,15 @@ def run_inference(config: InferenceJobConfig) -> Path:
# run the openai client
logger.info(f"Using OAI client. Endpoint: {base_url}")
model_client = OpenAIModelClient(base_url, config)
+    elif config.hf_pipeline:
+        if config.hf_pipeline.model_path.startswith(PathPrefix.HUGGINGFACE):
+            logger.info("Using HuggingFace client.")
+            model_client = HuggingFaceModelClient(config)
+            output_model_name = config.hf_pipeline.model_path
+        else:
+            raise ValueError("Unsupported model type.")
+    else:
+        raise NotImplementedError("Inference pipeline not supported.")

# run inference
output[config.job.output_field] = predict(dataset_iterable, model_client)
