[Question]: How to structure the data correctly? (Company chat history) #8367
Replies: 4 comments
-
You may need to write your own data loader and add that info manually. It's pretty easy, actually: load the JSON in Python, iterate over it, and manually create Document objects. Full guide on documents here: https://gpt-index.readthedocs.io/en/latest/core_modules/data_modules/documents_and_nodes/usage_documents.html
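A minimal sketch of such a loader, assuming the export is a JSON list of records shaped like `{"question": ..., "answer": ...}` (adapt the key names to your actual export). It builds plain dicts here for illustration; with llama_index installed, each entry would instead be wrapped in a `Document(text=..., metadata=...)`:

```python
import json


def load_chat_documents(path):
    """Load an exported support-chat JSON file and tag each message
    with its speaker, so the model can tell questions from answers.

    Assumes each record looks like {"question": ..., "answer": ...};
    adapt the keys to your actual export. With llama_index installed,
    each dict produced here would become a Document(text=..., metadata=...).
    """
    with open(path) as f:
        records = json.load(f)

    docs = []
    for i, rec in enumerate(records):
        # Prefix each side of the exchange so the roles stay visible
        # after the text is chunked and indexed.
        text = f"QUESTION: {rec['question']}\nANSWER: {rec['answer']}"
        docs.append({"text": text, "metadata": {"conversation_id": i}})
    return docs
```

The metadata dict is a good place for anything else you want to filter on later (ticket ID, date, product area).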
-
Hi, @makeholer! I'm Dosu, and I'm helping the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale. From what I understand, the issue is about structuring chat history data for a company and differentiating between customer questions and chat attendant answers. User "logan-markewich" suggested writing a custom data loader and manually adding the necessary information by creating Document objects. Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days. Thank you for your understanding and contribution to LlamaIndex!
-
🤖 Hello, You're on the right track with your approach to structuring the data for LlamaIndex. To distinguish between the customer's questions and the chat attendant's answers, you can indeed use an index or a role for each phrase. In LlamaIndex, this is what the `ChatMessage` class is for: each `ChatMessage` carries a `role` (such as `USER` or `ASSISTANT`) along with its `content`. Here is an example of how you can create them:

```python
from llama_index.llms.base import ChatMessage, MessageRole

# Create a ChatMessage for a question
question = ChatMessage(role=MessageRole.USER, content="What is your return policy?")

# Create a ChatMessage for an answer
answer = ChatMessage(role=MessageRole.ASSISTANT, content="Our return policy allows returns within 30 days of purchase.")
```

You can then add these to a chat engine's history:

```python
chat_engine = CondenseQuestionChatEngine()  # Replace with an actual instance of CondenseQuestionChatEngine
chat_engine.chat_history.append(question)
chat_engine.chat_history.append(answer)
```

Please note that the `CondenseQuestionChatEngine` above is only a placeholder and must be replaced with a properly constructed instance. You can find more information about the `ChatMessage` class and chat engines in the LlamaIndex documentation. I hope this helps! If you have any other questions, feel free to ask. Best,
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
Question
Hello, we're trying to use our support chat history (from our database) to feed llama_index. Right now I've exported all the data to a JSON file and put it in the "data" folder (millions of lines). But I was wondering: how will the AI know what is the customer's question vs. the chat attendant's answer? Is there a way to tag each phrase with an index, like QUESTION: and ANSWER:?
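One way to sketch the QUESTION:/ANSWER: tagging idea when the export is a flat list of messages rather than paired Q&A records. The field names `conversation_id`, `speaker`, and `text` are assumptions here; adjust them to match the actual database export:

```python
import json
from itertools import groupby


def group_by_conversation(path):
    """Turn a flat message export into one labelled text block per
    conversation, so each chunk keeps its question/answer structure.

    Assumes each message looks like
    {"conversation_id": ..., "speaker": "customer" | "attendant", "text": ...};
    adjust the field names to your schema.
    """
    with open(path) as f:
        messages = json.load(f)

    # groupby needs adjacent keys; Python's sort is stable, so message
    # order within each conversation is preserved.
    messages.sort(key=lambda m: m["conversation_id"])

    chunks = {}
    for conv_id, msgs in groupby(messages, key=lambda m: m["conversation_id"]):
        lines = [
            ("QUESTION: " if m["speaker"] == "customer" else "ANSWER: ") + m["text"]
            for m in msgs
        ]
        chunks[conv_id] = "\n".join(lines)
    return chunks
```

Indexing one block per conversation (instead of one giant file) also keeps related question/answer pairs together when the index splits text into chunks.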