* chore: Ignore VS Code debug config files.
* inference: Upgrade openai client. Upgrade the openai client due to incompatibility with `httpx==0.28.0`. Refs #466
* backend: Rename job config args variable. Rename the `eval_config_args` variable to `job_config_args` to make it more generic.
* inference: Support Hugging Face models. Support models pulled from Hugging Face Hub, through the `HuggingFaceModelClient`. The client expects a model name (i.e., the model repo ID on HF Hub) and a task (e.g., "summarization"), creates the corresponding pipeline, and uses it for inference. Refs #339
* inference: Extend inference parameter set. Extend the set of parameters that one can pass to create a new inference job:
  * `revision`: choose the model version (i.e., branch, tag, or commit ID)
  * `use_fast`: whether or not to use a fast tokenizer, if possible
  * `torch_dtype`: model precision (e.g., float16, float32, "auto")
  * `accelerator`: device to use during inference (i.e., "cpu", "cuda", or "mps")

  Refs #339
* inference: Validate Hugging Face inference params. Validate the values of the parameters used in HF pipelines: check that the model name is a valid HF repo ID, that the task is a supported task, and that the data type is a valid `torch.dtype` value. Refs #339
* inference: Amend and extend unit tests. Fix the failing unit tests and extend them to cover inference with Hugging Face models. Refs #339
* sdk: Create inference jobs. Support creating distinct inference jobs using the SDK. Refs #339
* doc: Fix the quickstart guide. Use `lumigator_schemas` and `JobEvalCreate` in the quickstart guide instead of just `schemas` and `JobCreate`.
* doc: Add inference user guide. Add a new user guide to demonstrate how to create and run an inference job, using the Lumigator SDK and a model from Hugging Face Hub. Closes #339

Signed-off-by: Dimitris Poulopoulos <[email protected]>
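The parameter validation described in the commits above can be sketched as a plain helper. This is an illustrative sketch only, not Lumigator's actual implementation: the function name `validate_inference_params` and the value sets below are hypothetical, and the real code also verifies the HF Hub repo ID and resolves `torch.dtype` directly.

```python
# Hypothetical sketch of the parameter checks described above; the names and
# value sets here are assumptions, not Lumigator's actual implementation.
SUPPORTED_TASKS = {"summarization"}  # assumed supported-task set
SUPPORTED_DTYPES = {"float16", "float32", "bfloat16", "auto"}
SUPPORTED_ACCELERATORS = {"cpu", "cuda", "mps"}


def validate_inference_params(task: str, torch_dtype: str, accelerator: str) -> None:
    """Raise ValueError if any Hugging Face pipeline parameter is invalid."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"Unsupported task: {task!r}")
    if torch_dtype not in SUPPORTED_DTYPES:
        raise ValueError(f"Invalid torch_dtype: {torch_dtype!r}")
    if accelerator not in SUPPORTED_ACCELERATORS:
        raise ValueError(f"Invalid accelerator: {accelerator!r}")
```

Validating eagerly like this surfaces a bad `torch_dtype` or task name before a job is submitted, rather than after a pipeline fails inside the Ray worker.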
Parent: 99ce0bf
Commit: 30c5ba6
Showing 20 changed files with 416 additions and 39 deletions.
# Running an Inference Job

This guide will walk you through the process of running an inference job using the Lumigator SDK and
a model downloaded from the Hugging Face Hub. The model will generate summaries for a given set of
text data.

```{note}
You can also use the OpenAI GPT family of models or the Mistral API to run an inference job. To do
so, you need to set the appropriate environment variables: `OPENAI_API_KEY` or `MISTRAL_API_KEY`.
Refer to the `.env.example` file in the repository for more details.
```
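For example, the relevant key can be exported in your shell before starting Lumigator. The values below are placeholders, not real keys:

```shell
# Placeholder values; substitute your real API keys.
export OPENAI_API_KEY="sk-placeholder"
export MISTRAL_API_KEY="placeholder"
```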

## What You'll Need

- A running instance of [Lumigator](../get-started/installation.md).

## Procedure

1. Install the Lumigator SDK:

   ```console
   user@host:~/lumigator$ uv pip install -e lumigator/python/mzai/sdk
   ```

1. Create a new Python file:

   ```console
   user@host:~/lumigator$ touch inference.py
   ```

1. Add the following code to `inference.py`:

   ```python
   import json
   import requests

   from lumigator_sdk.lumigator import LumigatorClient
   from lumigator_schemas import jobs, datasets


   BUCKET = "lumigator-storage"
   HOST = "localhost"
   LUMIGATOR_PORT = 8000
   RAY_PORT = 4566


   # Instantiate the Lumigator client
   lm_client = LumigatorClient(f"{HOST}:{LUMIGATOR_PORT}")

   # Upload a dataset
   dataset_path = "lumigator/python/mzai/sample_data/dialogsum_exc.csv"
   dataset = lm_client.datasets.create_dataset(
       dataset=open(dataset_path, 'rb'),
       format=datasets.DatasetFormat.JOB
   )

   # Create and submit an inference job
   name = "bart-summarization-run"
   model = "hf://facebook/bart-large-cnn"
   task = "summarization"

   job_args = jobs.JobInferenceCreate(
       name=name,
       model=model,
       dataset=dataset.id,
       task=task,
   )

   job = lm_client.jobs.create_job(
       type=jobs.JobType.INFERENCE, request=job_args)

   # Wait for the job to complete
   lm_client.jobs.wait_for_job(job.id, poll_wait=10)

   # Retrieve the job results
   url = f"http://{HOST}:{RAY_PORT}/{BUCKET}/jobs/results/{name}/{job.id}/inference_results.json"
   response = requests.get(url=url)

   if response.status_code != 200:
       raise Exception(f"Failed to retrieve job results: {response.text}")
   results = response.json()

   # Write the JSON results to a file
   with open("inference_results.json", "w") as f:
       json.dump(results, f, indent=4)
   ```

1. Run the script:

   ```console
   user@host:~/lumigator$ uv run python inference.py
   ```
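Once the script finishes, you can also inspect the results programmatically. The helper below is a minimal sketch of our own (the function name is not part of the Lumigator SDK) that reads the predictions list back from the results file:

```python
import json


def load_predictions(path: str = "inference_results.json") -> list[str]:
    """Load the list of generated summaries from an inference results file."""
    with open(path) as f:
        results = json.load(f)
    # The results file stores the generated summaries under the "prediction" key.
    return results.get("prediction", [])
```

For example, `load_predictions()[0]` returns the first generated summary.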

## Verify

Review the contents of the `inference_results.json` file to ensure that the inference job ran
successfully:

```console
user@host:~/lumigator$ cat inference_results.json | jq
{
  "prediction": [
    "A man has trouble breathing. He is sent to see a pulmonary specialist. The doctor tests him for asthma. He does not have any allergies that he knows of. He also has a heavy feeling in his chest when he tries to breathe. This happens a lot when he works out, he says.",
    ...
```
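Beyond eyeballing the output, a quick programmatic sanity check can catch empty or truncated predictions. This is a minimal sketch of our own, not part of the Lumigator SDK:

```python
def average_length(predictions: list[str]) -> float:
    """Mean character length of the generated summaries (0.0 for an empty list)."""
    if not predictions:
        return 0.0
    return sum(len(p) for p in predictions) / len(predictions)


# A mean length near zero usually means the job produced empty predictions.
```

For example, `average_length(["ab", "abcd"])` returns `3.0`.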

## Next Steps

Congratulations! You have successfully run an inference job using the Lumigator SDK. You can now
use the results to evaluate your model's performance.