Replies: 1 comment
🤖 Hi @dinonovak! 👋 I'm Dosu, an AI bot here to help you with your LlamaIndex concerns while we wait for a human maintainer. I'm here to help solve bugs, answer questions, and assist you in contributing to this project. Let's tackle these tech hiccups together! 🚀

Regarding the interaction between LlamaIndex and Mixtral 8x7B Instruct served by vLLM, I couldn't find any specific information in the LlamaIndex repository. It's possible that you might need to customize the LLM definition for Mistral/Mixtral Instruct, but without more information it's hard to say for sure.

I hope this helps! If you have any more questions or if there's anything else you'd like me to look into, please let me know.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
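For concreteness, here is a minimal sketch of what a customized `OpenAILike` definition could look like against vLLM's OpenAI-compatible endpoint. The model id, endpoint URL, and the `</s>` stop sequence are assumptions for illustration, not a confirmed fix, and import paths differ across llama-index versions:

```python
# Sketch, not a confirmed fix: point OpenAILike at vLLM's
# OpenAI-compatible server and pass an explicit stop sequence.
# Import path below is for llama-index >= 0.10; older releases
# use `from llama_index.llms import OpenAILike`.
from llama_index.llms.openai_like import OpenAILike
from llama_index.core.llms import ChatMessage

llm = OpenAILike(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed model id
    api_base="http://localhost:8000/v1",           # assumed vLLM endpoint
    api_key="fake",                                # vLLM ignores the key
    is_chat_model=True,
    # Assumed stop sequence: "</s>" is Mixtral's EOS token, forwarded
    # through the OpenAI-style `stop` parameter so generation terminates.
    additional_kwargs={"stop": ["</s>"]},
)

response = llm.chat([ChatMessage(role="user", content="Hello!")])
print(response.message.content)
```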
I am using Mixtral 8x7B served via vLLM.
When using it in llama-index with the OpenAILike model definition, it looks like it is not finishing messages with an end token.
Do I need to customize the LLM definition specifically for Mistral/Mixtral Instruct?
I am using the model with is_chat_model = True and all works, but it looks like all questions are considered part of the same chat.
How can I force the end of a conversation, and do I need to send an end token somehow?
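For illustration, one way the "same chat" behavior can arise is if a LlamaIndex chat engine (which keeps conversation memory) is in the loop rather than stateless `llm.chat()` calls. This is an assumption about the setup, but if it holds, `reset()` clears the accumulated history between conversations:

```python
# Sketch: chat engines in LlamaIndex keep a running message history,
# so successive questions are treated as one conversation unless the
# memory is cleared. `reset()` drops that history.
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai_like import OpenAILike

# Same assumed vLLM endpoint and model id as in the sketch above.
llm = OpenAILike(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    api_base="http://localhost:8000/v1",
    api_key="fake",
    is_chat_model=True,
)

chat_engine = SimpleChatEngine.from_defaults(llm=llm)
print(chat_engine.chat("First question"))

# Force the end of the conversation: wipe the chat memory so the next
# question starts a fresh chat instead of continuing this one.
chat_engine.reset()
print(chat_engine.chat("Unrelated question, in a fresh conversation"))
```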