-
Question Validation
Question: I want to create a single combined index that can answer from all the files or, when a user wants to focus on a single file, answer from that file alone. This requires the ability to filter on metadata or doc_id while querying. Is there an option to do this?
Replies: 6 comments
-
See #6813
-
Thanks @Arafatk, this is what I am looking for. Unfortunately, this method is not implemented for the SimpleVectorStore index.
-
@AravAct Sorry, I hope that feature is added soon.
-
Hi, @AravAct! I'm Dosu, and I'm here to help the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you were asking if there is an option to filter on metadata or doc_id while querying in order to create a single combined index that can answer from all files or focus on a single file. Arafatk suggested a method in another issue, but you pointed out that it is not implemented for the SimpleVectorStore index. Arafatk apologized and expressed hope that the feature will be added soon.

Before we close this issue, we wanted to check if it is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to LlamaIndex!
-
🤖 Hello,

Yes, LlamaIndex can filter on metadata while querying, which makes it possible to build a single combined index that answers from all files or from a single file, as the user requires. Here is an example of how you can use this feature:

    from llama_index import Document
    from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

    # `index` is assumed to be an existing vector index backed by a
    # vector store that supports metadata filters.

    # Insert a document with specific metadata
    doc = Document(text="target", metadata={"tag": "target"})
    index.insert(doc)

    # Create a filter that matches the inserted metadata
    filters = MetadataFilters(
        filters=[ExactMatchFilter(key="tag", value="target")]
    )

    # Use the filter in the retriever to retrieve only the nodes that match it
    retriever = index.as_retriever(
        similarity_top_k=20,
        filters=filters,
    )
    source_nodes = retriever.retrieve("What did the author do growing up?")

This code retrieves only the nodes whose metadata key "tag" has the value "target".

Please note that the default vector store built into llama-index doesn't support metadata filters. You would need to pair LlamaIndex with a vector store that supports them (Qdrant, Chroma, Pinecone, Weaviate).

Filtering on multiple values for a single metadata key appears to be limited by the vector store. You might need to find a workaround or use a different vector store that supports this feature.
I hope this helps! If you have any other questions, feel free to ask.
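For anyone stuck on a store without native filter support, the idea behind exact-match metadata filtering can be sketched in plain Python. This is only an illustration of the technique, not LlamaIndex code: `Node`, `cosine`, `retrieve`, the `file_name` key, and the toy two-dimensional embeddings are all hypothetical names and values made up for the example.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for an indexed chunk; not a LlamaIndex class."""
    text: str
    embedding: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    nodes: List[Node],
    query_embedding: List[float],
    top_k: int = 2,
    filters: Optional[Dict[str, str]] = None,
) -> List[Node]:
    """Keep nodes whose metadata matches every filter, then rank by similarity."""
    filters = filters or {}
    candidates = [
        n for n in nodes
        if all(n.metadata.get(key) == value for key, value in filters.items())
    ]
    candidates.sort(key=lambda n: cosine(n.embedding, query_embedding), reverse=True)
    return candidates[:top_k]


# One combined "index" holding chunks from two files (toy embeddings).
nodes = [
    Node("intro of report A", [1.0, 0.0], {"file_name": "a.pdf"}),
    Node("summary of report A", [0.8, 0.6], {"file_name": "a.pdf"}),
    Node("intro of report B", [0.9, 0.1], {"file_name": "b.pdf"}),
]

query = [1.0, 0.0]
all_files = retrieve(nodes, query, top_k=2)  # no filter: search everything
only_b = retrieve(nodes, query, top_k=2, filters={"file_name": "b.pdf"})
```

Filtering before ranking means `top_k` is always filled from the matching file, whereas filtering after a top-k search can return fewer results than requested.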