🤖 Hi @aiwalter! Great to see you here again. I hope you're doing well.

Based on the information from the LlamaIndex repository, there are a couple of ways to add new documents to the index used by a QueryEngine.

One way is to use the add_documents method of the document store:

    def add_documents(
        self, nodes: Sequence[BaseNode], allow_update: bool = True
    ) -> None:
        """Add a document to the store.
        Args:
            docs (List[BaseDocument]): documents
            allow_update (bool): allow update of docstore from document
        """
        for node in nodes:
            # NOTE: doc could already exist in the store, but we overwrite it
            if not allow_update and self.document_exists(node.node_id):
                raise ValueError(
                    f"node_id {node.node_id} already exists. "
                    "Set allow_update to True to overwrite."
                )
            node_key = node.node_id
            data = doc_to_json(node)
            self._kvstore.put(node_key, data, collection=self._node_collection)
            # update doc_collection if needed
            metadata = {"doc_hash": node.hash}
            if isinstance(node, TextNode) and node.ref_doc_id is not None:
                ref_doc_info = self.get_ref_doc_info(node.ref_doc_id) or RefDocInfo()
                if node.node_id not in ref_doc_info.node_ids:
                    ref_doc_info.node_ids.append(node.node_id)
                if not ref_doc_info.metadata:
                    ref_doc_info.metadata = node.metadata or {}
                self._kvstore.put(
                    node.ref_doc_id,
                    ref_doc_info.to_dict(),
                    collection=self._ref_doc_collection,
                )
                # update metadata with map
                metadata["ref_doc_id"] = node.ref_doc_id
                self._kvstore.put(
                    node_key, metadata, collection=self._metadata_collection
                )
            else:
                self._kvstore.put(
                    node_key, metadata, collection=self._metadata_collection
                )

You can find this method in the llama_index/storage/docstore/keyval_docstore.py file.

Another way is to use the add method of the vector store:

    def add(
        self,
        nodes: List[BaseNode],
        **add_kwargs: Any,
    ) -> List[str]:
        """Add nodes to index."""
        for node in nodes:
            self._data.embedding_dict[node.node_id] = node.get_embedding()
            self._data.text_id_to_ref_doc_id[node.node_id] = node.ref_doc_id or "None"
            metadata = node_to_metadata_dict(
                node, remove_text=True, flat_metadata=False
            )
            metadata.pop("_node_content", None)
            self._data.metadata_dict[node.node_id] = metadata
        return [node.node_id for node in nodes]

You can find this method in the llama_index/vector_stores/simple.py file.

I hope this helps! If you have any other questions, feel free to ask.
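
As a rough illustration of why an in-memory add like the one quoted above may not require rebuilding the query engine, here is a minimal toy sketch. It does not use LlamaIndex: ToyNode, ToyVectorStore, and ToyQueryEngine are hypothetical stand-ins invented for this example, and the sketch assumes the query engine keeps a reference to the same store object rather than a copy. Whether a given real QueryEngine behaves this way depends on how it was constructed, so treat this as a starting point, not a description of the library's internals.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node: id, embedding, optional source doc id."""
    node_id: str
    embedding: List[float]
    ref_doc_id: Optional[str] = None


@dataclass
class ToyVectorStore:
    """Hypothetical in-memory store keyed by node id (not the LlamaIndex class)."""
    embedding_dict: Dict[str, List[float]] = field(default_factory=dict)
    text_id_to_ref_doc_id: Dict[str, str] = field(default_factory=dict)

    def add(self, nodes: List[ToyNode]) -> List[str]:
        """Store each node's embedding and source doc id; return the node ids."""
        for node in nodes:
            self.embedding_dict[node.node_id] = node.embedding
            self.text_id_to_ref_doc_id[node.node_id] = node.ref_doc_id or "None"
        return [node.node_id for node in nodes]


class ToyQueryEngine:
    """Hypothetical query engine that reads from a shared store at query time."""

    def __init__(self, store: ToyVectorStore) -> None:
        # Assumption: the engine holds a reference to the store, not a copy.
        self._store = store

    def top_match(self, query_embedding: List[float]) -> Optional[str]:
        """Return the node id with the highest dot product, or None if empty."""
        best_id, best_score = None, float("-inf")
        for node_id, emb in self._store.embedding_dict.items():
            score = sum(q * e for q, e in zip(query_embedding, emb))
            if score > best_score:
                best_id, best_score = node_id, score
        return best_id


store = ToyVectorStore()
store.add([ToyNode("a", [1.0, 0.0], ref_doc_id="doc-1")])

engine = ToyQueryEngine(store)            # built once
before = engine.top_match([0.0, 1.0])     # only "a" exists so far

added = store.add([ToyNode("b", [0.0, 1.0], ref_doc_id="doc-2")])
after = engine.top_match([0.0, 1.0])      # same engine now sees "b"
```

In this sketch the engine is never re-created: because it looks up the shared store on every query, the node added afterwards becomes the best match for the same query.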
Hi all,

What is the best way to add more documents to the index that is used by a QueryEngine? For example, I have this:

Now I want to add more documents to the index, so I do:

How do I best hand over this updated index to the query engine without needing to initialize it again?