[Question]: #8412
Replies: 2 comments
-
🤖 Hello,

Thank you for reaching out with your question.

The 'exclude' parameter of SimpleDirectoryReader in LlamaIndex takes a list of glob patterns for file paths that should be skipped when reading. It is not used for excluding metadata.

In your case, you are trying to keep certain metadata from being visible to the LLM when it generates a response. Unfortunately, the current version of LlamaIndex does not support this feature directly. The 'exclude' parameter won't work for this purpose because it is designed to exclude files, not metadata.

However, you can control what metadata is added to the document by modifying your add_file_metadata function:

    def add_file_metadata(self) -> Dict:
        return {'pubdate': doc_metadata[0],
                'filename': doc_metadata[1],
                'Coauthor': doc_metadata[3]}

In this example, I've removed the 'Agency' line. Now the 'Agency' metadata won't be included in the document, and thus won't be visible to the LLM.

I hope this helps! If you have any other questions, feel free to ask.
-
@ddealwis09 I think the proper solution here is to set the excluded metadata keys on the Document objects. I don't think SimpleDirectoryReader sets these for you.
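A rough sketch of what that could look like, assuming the `excluded_llm_metadata_keys` attribute on LlamaIndex `Document` objects. The helper names `hide_metadata_from_llm` and `build_documents` are made up for illustration, and the import path is an assumption that depends on the installed version:

```python
from typing import Iterable, List


def hide_metadata_from_llm(documents: Iterable, keys: List[str]) -> None:
    """Mark metadata keys so they are left out of the text shown to the LLM.

    Hypothetical helper: it only sets `excluded_llm_metadata_keys` on each
    document and leaves the `metadata` dict itself untouched.
    """
    for doc in documents:
        # Keep any keys that are already excluded and avoid duplicates.
        existing = list(getattr(doc, "excluded_llm_metadata_keys", None) or [])
        doc.excluded_llm_metadata_keys = existing + [
            key for key in keys if key not in existing
        ]


def build_documents(input_dir: str):
    """Load documents from a directory and hide 'Agency' from the LLM."""
    # Assumed import path; newer releases expose this under `llama_index.core`.
    from llama_index import SimpleDirectoryReader

    documents = SimpleDirectoryReader(input_dir).load_data()
    hide_metadata_from_llm(documents, ["Agency"])
    return documents
```

Note that the key would still show up in `metadata` on the source nodes of a response; the exclusion only affects the text that gets passed to the LLM. If you also want the key kept out of the embedding text, there is a matching `excluded_embed_metadata_keys` attribute.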
-
Question
For the SimpleDirectoryReader, how do you exclude metadata? Using 'exclude' does not appear to work. Thanks!
Model response still includes "Agency":
Response(response='The agency used for the Poland Pricing Research was BgM.', source_nodes=[NodeWithScore(node=TextNode(id_='0a17f4c4-e080-4a27-8d21-9df1eb18d729', embedding=None, metadata={'pubdate': '2023-03-21', 'filename': 'Winston Poland Pricing Research', 'Agency': 'Gfk', 'Coauthor': '...