Replies: 3 comments 2 replies
-
🤖 Hello, Thank you for reaching out and providing a detailed explanation of your issue. It seems like you're trying to use the `PromptHelper` to keep a long prompt within the model's context window. Based on the code you've provided, it seems like you're correctly initializing the `PromptHelper`. One thing you could try is to manually divide your text into chunks using the `repack` or `truncate` methods of the `PromptHelper`:

```python
from llama_index import PromptHelper
from llama_index.llms import LLMMetadata
from llama_index.prompts import PromptTemplate  # import paths may vary across llama_index versions

# Initialize the PromptHelper with the metadata of the GPT-3.5 model
llm_metadata = LLMMetadata(context_window=16384, num_output=1500, model_name="gpt-3.5-turbo")
prompt_helper = PromptHelper.from_llm_metadata(llm_metadata)

# repack/truncate take the prompt template in addition to the text chunks,
# so they can account for the template's own token overhead
prompt = PromptTemplate("Summarize the following comments:\n{context_str}\n")

# Assume context_comments is a list of Instagram comments (strings)
# Use the repack method to consolidate the comments into chunks that fit
repacked_comments = prompt_helper.repack(prompt, context_comments)

# Or use the truncate method to cut the comments off at the token limit
truncated_comments = prompt_helper.truncate(prompt, context_comments)
```

In this example, `repack` packs the comments into as few chunks as will fit in the context window, while `truncate` simply drops whatever doesn't fit. Also, I found some similar issues in the LlamaIndex repository where users encountered errors because their input exceeded the maximum context length of the model. The suggested solution in these issues was to set the `chunk_size_limit` explicitly on the `ServiceContext`:

```python
service_context = ServiceContext.from_defaults(
    llm=llm,
    prompt_helper=prompt_helper,
    node_parser=node_parser,
    embed_model=None,
    chunk_size=512,
    context_window=16384,
    num_output=1500,
    chunk_size_limit=1000,  # Set the chunk_size_limit explicitly
)
```

This solution was suggested to handle the error where the code was requesting more tokens than the maximum context length allowed by the model. You can find these issues here: GPTSimpleVectorIndex azure embedding error, Building an index, but prompt is too long, and This model's maximum context length was exceeded. I hope this helps! Let me know if you have any other questions or if there's anything else I can do to assist you.
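As a minimal sketch of how the repacked chunks from the first snippet could then be consumed, the loop below summarizes each chunk separately and combines the partial summaries in a final pass (a simple map-reduce). It assumes `llm` is the AzureOpenAI instance from the original code and that its `complete` method returns a response with a `.text` attribute:

```python
# Map step: summarize each repacked chunk on its own, so no single
# request exceeds the context window
partial_summaries = [
    llm.complete(f"Summarize the following comments:\n{chunk}").text
    for chunk in repacked_comments
]

# Reduce step: ask the final question over the combined partial summaries
combined = "\n".join(partial_summaries)
final = llm.complete(
    f"Given these partial summaries of the comments, what's the general sentiment?\n{combined}"
)
print(final.text)
```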
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
@jorrgme the `SimpleChatEngine` does not use the prompt helper. It's just a simple interface for chatting back and forth with an LLM. If a single message exceeds the input size, it will throw an error like the one you are seeing. What you probably want to do is index your comments and generate a summary, as in the sketch below.
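A minimal sketch of that approach, assuming the llama_index 0.8-era API (`ListIndex`, `Document`) and a hypothetical `comments` list of comment strings:

```python
from llama_index import Document, ListIndex

# Wrap each comment in a Document so the node parser can chunk them
documents = [Document(text=c) for c in comments]

# Build a list index over the comments (a ServiceContext like the one
# above can be passed via service_context=... if desired)
index = ListIndex.from_documents(documents)

# tree_summarize recursively summarizes the chunks, so no single LLM call
# exceeds the model's context window
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("What's the general sentiment in the comments?")
print(response)
```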
-
In my case, I do not want to create a summary. It would be nice if the `SimpleChatEngine` used the prompt helper, in my opinion. Any plans for it?
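If a chat-style interface is still the goal, one possible workaround, sketched under the assumption that an `index` was built over the comments as above, is an index-backed chat engine, which retrieves only the relevant chunks on each turn instead of putting every comment into the prompt the way `SimpleChatEngine` does:

```python
# condense_question rewrites each user message into a standalone query,
# then answers it against the index rather than the full comment dump
chat_engine = index.as_chat_engine(chat_mode="condense_question")
response = chat_engine.chat("What's the general sentiment in the comments?")
print(response)
```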
-
Question
I've been struggling with llama_index's `PromptHelper` and can't find a solution anywhere on the Internet.
But first, let me describe my use case:
I'm trying to use Azure OpenAI's GPT-3.5 model to summarize the comments users have posted on an Instagram post, passing all the comments into the prompt as a system message and then asking a question like: "What's the general sentiment in the comments?".
The problem is that there are so many comments that on many posts I exceed gpt-35-turbo-16k's maximum context length of 16,384 tokens. To work around this I've been using llama_index's `PromptHelper`, which, if I'm not mistaken, helps divide the prompt into chunks in situations like this. But I keep getting the same error no matter how I change the `PromptHelper`'s parameters:
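For reference, a quick way to confirm that the comments really do exceed the limit; this sketch assumes the tiktoken library and that gpt-35-turbo uses the `cl100k_base` encoding, with `comments` standing in for the list of comment strings:

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

# Count tokens across all comments and compare against the model limit
num_tokens = sum(len(encoding.encode(c)) for c in comments)
print(f"Total tokens in comments: {num_tokens} (model limit is 16384)")
```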
I'm pretty sure I'm messing something up in my code but can't find where, and llama_index's documentation is not helping me much.
Thanks in advance for any help.
Here is my code, just in case someone has any idea of what I'm doing wrong:
It throws the same error as mentioned above.