I have searched both the documentation and discord for an answer.
Question
We recently created an Inference Endpoint on Hugging Face, which we use to run a fine-tuned embedding model on their GPU service. While the endpoint itself functions as intended, the request format it expects is difficult to integrate with our existing codebase. For a more seamless integration, we attempted to use the `TextEmbeddingsInference` class from `llama_index.embeddings.text_embeddings_inference`, as it fits our current implementation more naturally. Specifically, we created the object with the following setup:

```python
embed_model = TextEmbeddingsInference(
    model_name=model,
    base_url=API_URL,
    timeout=60,
    auth_token=key,
)
```
However, when we call `embed_model.get_text_embedding("Test")`, we encounter a `KeyError: 0` exception.
Has anyone run into the same issue, or can anyone provide guidance or suggest corrections to our approach?
Any assistance in resolving this would be greatly appreciated, as using TextEmbeddingsInference directly would substantially streamline our integration process.
Thank you in advance.
I found a similar closed issue regarding a problem with the Hugging Face embedding model and LlamaIndex versions [1]. Although it doesn't directly address the KeyError: 0 exception you're encountering, it might be related to compatibility issues between different versions of the library.
For your specific issue, ensure that the request payload format matches the structure the `TextEmbeddingsInference` class expects.
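For reference, a stock text-embeddings-inference (TEI) deployment exposes an `/embed` route that accepts a JSON body of this shape (the `inputs` value may also be a list of strings). Note this is what TEI itself expects; a Hugging Face Inference Endpoint with a custom handler may use a different request or response shape, which would explain the mismatch:

```json
{
  "inputs": "Test",
  "truncate": false
}
```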
Make sure that the response from the Hugging Face inference endpoint is structured correctly and that all necessary keys are present to avoid KeyError exceptions. Additionally, verify that you are using the correct dependencies and library versions required for the TextEmbeddingsInference class to function properly with Hugging Face endpoints:
- `python`: >=3.8.1,<4.0
- `llama-index-core`: ^0.11.0
- `huggingface-hub`: >=0.19.0 (with the `inference` extra)
If the issue persists, you might want to check the specific configuration settings or parameters required for the TextEmbeddingsInference class, such as model_name, token, timeout, and others, to ensure they are correctly set up [5][6].
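One plausible cause of the `KeyError: 0` (an assumption, not confirmed from a traceback): client code that indexes the JSON response positionally works when the server returns a bare list of vectors, but raises `KeyError: 0` when a custom endpoint handler wraps the vectors in a dict, since `0` is then looked up as a dict key. A minimal sketch:

```python
# TEI's native /embed route returns a bare list of embedding vectors,
# so positional indexing works:
tei_style = [[0.12, -0.34, 0.56]]
first = tei_style[0]  # -> [0.12, -0.34, 0.56]

# A custom Inference Endpoint handler often wraps the vectors in a dict
# (the "embeddings" key here is an assumed example). Indexing that dict
# with 0 looks up a *key* named 0, which does not exist:
dict_style = {"embeddings": [[0.12, -0.34, 0.56]]}
try:
    dict_style[0]
except KeyError as exc:
    print(f"KeyError: {exc}")  # prints "KeyError: 0"
```

Inspecting the raw endpoint response (e.g. with `curl`) and comparing it against the shape above would confirm or rule this out.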
@logan-markewich I tried this class too, and then tried to call:

```python
import asyncio

async def get_embedding():
    text = "Test"
    # embed_model is the TextEmbeddingsInference instance from above
    embedding = await embed_model.aget_text_embedding(text)
    print(embedding)

# Run the async function (in a notebook, a bare `await get_embedding()` works)
asyncio.run(get_embedding())
```

It returned an array of NaN values, so it did not work either. My understanding, then, is that `TextEmbeddingsInference` does not support embedding models deployed on Hugging Face Inference Endpoints.
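As a workaround while this is unresolved, one option is to call the endpoint directly and coerce whatever shape it returns into a plain list of vectors before handing it to the rest of the pipeline. The helper below is a hypothetical sketch: the `"embeddings"` and `"data"` keys are assumptions about common handler outputs, not guaranteed, so adjust them to what your endpoint actually returns.

```python
def normalize_embedding_response(payload):
    """Coerce a TEI-style (bare list) or dict-wrapped embedding response
    into a plain list of vectors. Raises ValueError on unknown shapes."""
    if isinstance(payload, dict):
        # Keys sometimes used by custom endpoint handlers (assumed, not guaranteed).
        for key in ("embeddings", "data"):
            if key in payload:
                return payload[key]
        raise ValueError(f"Unrecognized embedding response keys: {list(payload)}")
    if isinstance(payload, list):
        return payload
    raise ValueError(f"Unrecognized embedding response type: {type(payload)!r}")

# Example with a dict-wrapped response:
vectors = normalize_embedding_response({"embeddings": [[0.1, 0.2]]})
print(vectors)  # [[0.1, 0.2]]
```

This keeps the endpoint's quirks isolated in one place, so the rest of the code can assume a consistent list-of-vectors shape.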