Azure Databricks tokenizer issue #313

stevenveenma · 2024-11-20T16:50:00Z

Thank you for this promising repository, which I would like to use. I am required to work in Azure Databricks and have installed the repository there. I then configured examples/lightrag_azure_openai_demo.py in a notebook. I was able to solve some issues, but I am now encountering the following error message:

Resposta do llm_model_func: I'm just a computer program, so I don't have feelings, but I'm here and ready to help you! How can I assist you today?
Resultado do embedding_func: (1, 1536)
Dimensão da embedding: 1536
General error in processing: Error inserting book contents into rag: 'Could not automatically map gpt-4o-mini to a tokenizer. Please use tiktoken.get_encoding to explicitly get the tokenizer you expect.'

The error message is strange because I am using gpt-4o and not gpt-4o-mini. Furthermore, the cause seems to lie in the tokenizer. I tried to resolve the error with the assistance of GPT, but it was unsuccessful. I would appreciate your help with this.
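
For readers hitting the same message: the quoted text is what tiktoken's encoding_for_model raises (as a KeyError) when the installed tiktoken version does not recognize a model name. One possible explanation, stated here as an assumption and not confirmed in this thread, is that the name gpt-4o-mini comes from a tokenizer setting in LightRAG (the tiktoken_model_name field, if the installed version has it) and not from the Azure deployment, and that the tiktoken preinstalled on the Databricks cluster predates the gpt-4o model family. Below is a minimal sketch of the fallback the error message suggests. The helper name resolve_encoding and the order of the fallback encodings are illustrative, not part of either library; the two tiktoken functions are passed in as arguments so that the helper itself has no import-time dependency.

```python
from typing import Any, Callable, Sequence


def resolve_encoding(
    model_name: str,
    encoding_for_model: Callable[[str], Any],
    get_encoding: Callable[[str], Any],
    fallback_names: Sequence[str] = ("o200k_base", "cl100k_base"),
) -> Any:
    """Return a tokenizer for model_name, falling back to explicit encodings.

    Hypothetical helper: encoding_for_model and get_encoding are expected to
    behave like tiktoken.encoding_for_model and tiktoken.get_encoding.
    """
    try:
        # Automatic lookup; tiktoken raises KeyError for unknown model names.
        return encoding_for_model(model_name)
    except KeyError:
        pass

    # Explicit lookup, newest encoding first. tiktoken raises ValueError when
    # the installed version does not know an encoding name.
    for name in fallback_names:
        try:
            return get_encoding(name)
        except ValueError:
            continue

    raise KeyError(
        f"No tokenizer found for {model_name!r}; tried {list(fallback_names)}"
    )


# Usage with the real library (requires tiktoken to be installed):
#   import tiktoken
#   enc = resolve_encoding(
#       "gpt-4o-mini", tiktoken.encoding_for_model, tiktoken.get_encoding
#   )
#   token_count = len(enc.encode("hello world"))
```

If the cause is indeed an outdated tiktoken, upgrading it in the notebook (for example with %pip install --upgrade tiktoken, followed by a Python restart) may remove the need for any fallback. Note that falling back to cl100k_base gives token counts that only approximate those of gpt-4o models, which is usually acceptable for chunk sizing but not for exact limits.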
