Replies: 3 comments 2 replies
-
🤖 Hello, Thank you for reaching out and providing a detailed explanation of your issue. It seems like you're trying to use the `PromptHelper` to keep a long prompt within the model's context window. Based on the code you've provided, it seems like you're correctly initializing the `PromptHelper`. One thing you could try is to manually divide your text into chunks using the `repack` or `truncate` methods of the `PromptHelper`:

```python
from llama_index import PromptHelper
from llama_index.llms import LLMMetadata
from llama_index.prompts import PromptTemplate  # import paths may vary across llama_index versions

# Initialize the PromptHelper with the metadata of the GPT-3.5 model
llm_metadata = LLMMetadata(context_window=16384, num_output=1500, model_name="gpt-3.5-turbo")
prompt_helper = PromptHelper.from_llm_metadata(llm_metadata)

# repack/truncate take the prompt template in addition to the text chunks,
# so they can account for the template's own token overhead
prompt = PromptTemplate("Summarize the following comments:\n{context_str}\n")

# Assume context_comments is a list of Instagram comments (strings)
# Use the repack method to consolidate the comments into chunks that fit
repacked_comments = prompt_helper.repack(prompt, context_comments)

# Or use the truncate method to cut the comments off at the token limit
truncated_comments = prompt_helper.truncate(prompt, context_comments)
```

In this example, `repack` packs the comments into as few chunks as will fit in the context window, while `truncate` simply drops whatever doesn't fit. Also, I found some similar issues in the LlamaIndex repository where users encountered errors because their input exceeded the maximum context length of the model. The suggested solution in these issues was to set the `chunk_size_limit` explicitly on the `ServiceContext`:

```python
service_context = ServiceContext.from_defaults(
    llm=llm,
    prompt_helper=prompt_helper,
    node_parser=node_parser,
    embed_model=None,
    chunk_size=512,
    context_window=16384,
    num_output=1500,
    chunk_size_limit=1000,  # Set the chunk_size_limit explicitly
)
```

This solution was suggested to handle the error where the code was requesting more tokens than the maximum context length allowed by the model. You can find these issues here: GPTSimpleVectorIndex azure embedding error, Building an index, but prompt is too long, and This model's maximum context length was exceeded. I hope this helps! Let me know if you have any other questions or if there's anything else I can do to assist you.
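As a minimal sketch of how the repacked chunks from the first snippet could then be consumed, the loop below summarizes each chunk separately and combines the partial summaries in a final pass (a simple map-reduce). It assumes `llm` is the AzureOpenAI instance from the original code and that its `complete` method returns a response with a `.text` attribute:

```python
# Map step: summarize each repacked chunk on its own, so no single
# request exceeds the context window
partial_summaries = [
    llm.complete(f"Summarize the following comments:\n{chunk}").text
    for chunk in repacked_comments
]

# Reduce step: ask the final question over the combined partial summaries
combined = "\n".join(partial_summaries)
final = llm.complete(
    f"Given these partial summaries of the comments, what's the general sentiment?\n{combined}"
)
print(final.text)
```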
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
@jorrgme the `SimpleChatEngine` does not use the prompt helper. It's just a simple interface for chatting back and forth with an LLM. If a single message exceeds the input size, it will throw an error like the one you are seeing. What you probably want to do is index your comments and generate a summary, as in the sketch below.
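A minimal sketch of that approach, assuming the llama_index 0.8-era API (`ListIndex`, `Document`) and a hypothetical `comments` list of comment strings:

```python
from llama_index import Document, ListIndex

# Wrap each comment in a Document so the node parser can chunk them
documents = [Document(text=c) for c in comments]

# Build a list index over the comments (a ServiceContext like the one
# above can be passed via service_context=... if desired)
index = ListIndex.from_documents(documents)

# tree_summarize recursively summarizes the chunks, so no single LLM call
# exceeds the model's context window
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("What's the general sentiment in the comments?")
print(response)
```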
-
In my case, I do not want to create a summary. It would be nice if the `SimpleChatEngine` used the prompt helper, in my opinion. Any plans for it?
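If a chat-style interface is still the goal, one possible workaround, sketched under the assumption that an `index` was built over the comments as above, is an index-backed chat engine, which retrieves only the relevant chunks on each turn instead of putting every comment into the prompt the way `SimpleChatEngine` does:

```python
# condense_question rewrites each user message into a standalone query,
# then answers it against the index rather than the full comment dump
chat_engine = index.as_chat_engine(chat_mode="condense_question")
response = chat_engine.chat("What's the general sentiment in the comments?")
print(response)
```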
-
Question
I've been struggling with llama_index's `PromptHelper` and can't find a solution anywhere on the Internet.
But first, let me describe my use case:
I'm trying to use Azure OpenAI's GPT-3.5 model to summarize the comments users have posted on an Instagram post, passing all the comments into the prompt as a system message and then asking a question like: "What's the general sentiment in the comments?".
The problem is that there are so many comments that on many posts I exceed gpt-35-turbo-16k's maximum context length of 16,384 tokens. To work around this I've been using llama_index's `PromptHelper`, which, if I'm not mistaken, helps divide the prompt into chunks in situations like this. But I keep getting the same error no matter how I change the `PromptHelper`'s parameters:
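For reference, a quick way to confirm that the comments really do exceed the limit; this sketch assumes the tiktoken library and that gpt-35-turbo uses the `cl100k_base` encoding, with `comments` standing in for the list of comment strings:

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

# Count tokens across all comments and compare against the model limit
num_tokens = sum(len(encoding.encode(c)) for c in comments)
print(f"Total tokens in comments: {num_tokens} (model limit is 16384)")
```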
I'm pretty sure I'm messing something up in my code but can't find where, and llama_index's documentation is not helping me much.
Thanks in advance for any help.
Here is my code, just in case someone has any idea of what I'm doing wrong:
It throws the same error as mentioned above.