[Question]: How to structure the data correctly? (Company chat history) #8367
Replies: 4 comments
-
You may need to write your own data loader and add that info manually. It's pretty easy, actually: load the JSON in Python, iterate over it, and manually create Document objects. Full guide on documents here: https://gpt-index.readthedocs.io/en/latest/core_modules/data_modules/documents_and_nodes/usage_documents.html
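A minimal sketch of such a loader, assuming the export is a JSON list of records shaped like `{"question": ..., "answer": ...}` (adapt the key names to your actual export). It builds plain dicts here for illustration; with llama_index installed, each entry would instead be wrapped in a `Document(text=..., metadata=...)`:

```python
import json


def load_chat_documents(path):
    """Load an exported support-chat JSON file and tag each message
    with its speaker, so the model can tell questions from answers.

    Assumes each record looks like {"question": ..., "answer": ...};
    adapt the keys to your actual export. With llama_index installed,
    each dict produced here would become a Document(text=..., metadata=...).
    """
    with open(path) as f:
        records = json.load(f)

    docs = []
    for i, rec in enumerate(records):
        # Prefix each side of the exchange so the roles stay visible
        # after the text is chunked and indexed.
        text = f"QUESTION: {rec['question']}\nANSWER: {rec['answer']}"
        docs.append({"text": text, "metadata": {"conversation_id": i}})
    return docs
```

The metadata dict is a good place for anything else you want to filter on later (ticket ID, date, product area).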
-
Hi, @makeholer! I'm Dosu, and I'm helping the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale. From what I understand, the issue is about structuring chat history data for a company and differentiating between customer questions and chat attendant answers. User "logan-markewich" suggested writing a custom data loader and manually adding the necessary information by creating Document objects. Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days. Thank you for your understanding and contribution to LlamaIndex!
-
🤖 Hello, You're on the right track with your approach to structuring the data for LlamaIndex. To distinguish between the customer's questions and the chat attendant's answers, you can indeed use an index or a role for each phrase. In LlamaIndex, this is what the `ChatMessage` class is for: each `ChatMessage` carries a `role` (such as `USER` or `ASSISTANT`) along with its `content`. Here is an example of how you can create them:

```python
from llama_index.llms.base import ChatMessage, MessageRole

# Create a ChatMessage for a question
question = ChatMessage(role=MessageRole.USER, content="What is your return policy?")

# Create a ChatMessage for an answer
answer = ChatMessage(role=MessageRole.ASSISTANT, content="Our return policy allows returns within 30 days of purchase.")
```

You can then add these to a chat engine's history:

```python
chat_engine = CondenseQuestionChatEngine()  # Replace with an actual instance of CondenseQuestionChatEngine
chat_engine.chat_history.append(question)
chat_engine.chat_history.append(answer)
```

Please note that the `CondenseQuestionChatEngine` above is only a placeholder and must be replaced with a properly constructed instance. You can find more information about the `ChatMessage` class and chat engines in the LlamaIndex documentation. I hope this helps! If you have any other questions, feel free to ask. Best,
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
Question
Hello, we're trying to use our support chat history (from our database) to feed llama_index. Right now I've exported all the data to a JSON file and put it in the "data" folder (millions of lines). But I was wondering: how will the AI know what is the customer's question vs. the chat attendant's answer? Is there a way to tag each phrase with an index, like QUESTION: and ANSWER:?
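One way to sketch the QUESTION:/ANSWER: tagging idea when the export is a flat list of messages rather than paired Q&A records. The field names `conversation_id`, `speaker`, and `text` are assumptions here; adjust them to match the actual database export:

```python
import json
from itertools import groupby


def group_by_conversation(path):
    """Turn a flat message export into one labelled text block per
    conversation, so each chunk keeps its question/answer structure.

    Assumes each message looks like
    {"conversation_id": ..., "speaker": "customer" | "attendant", "text": ...};
    adjust the field names to your schema.
    """
    with open(path) as f:
        messages = json.load(f)

    # groupby needs adjacent keys; Python's sort is stable, so message
    # order within each conversation is preserved.
    messages.sort(key=lambda m: m["conversation_id"])

    chunks = {}
    for conv_id, msgs in groupby(messages, key=lambda m: m["conversation_id"]):
        lines = [
            ("QUESTION: " if m["speaker"] == "customer" else "ANSWER: ") + m["text"]
            for m in msgs
        ]
        chunks[conv_id] = "\n".join(lines)
    return chunks
```

Indexing one block per conversation (instead of one giant file) also keeps related question/answer pairs together when the index splits text into chunks.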