
Didn't extract any relationships with gpt_4o_complete, working with gpt_4o_mini_complete #301

Open
rcoundon opened this issue Nov 18, 2024 · 6 comments

@rcoundon commented Nov 18, 2024

When using gpt_4o_complete to create the knowledge graph, I'm seeing the warning:
WARNING:lightrag:Didn't extract any relationships, maybe your LLM is not working

Instantiation looks like this:

rag = LightRAG(
    working_dir=working_dir,
    llm_model_func=gpt_4o_complete,
    graph_storage="Neo4JStorage",
    log_level="INFO",
)

I don't see the same warning when using gpt_4o_mini_complete.

The app is creating a knowledge graph for chunks of some markdown files (originally converted from PDF). Any thoughts on what could be causing this?

@rcoundon (Author)

I'm not sure this is related, but when using gpt_4o_mini_complete I still see a warning, though it appears to be a different issue:

WARNING:neo4j.notifications:Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.UnknownLabelWarning} {category: UNRECOGNIZED} {title: The provided label is not in the database.} {description: One of the labels in your query is not available in the database, make sure you didn't misspell it or that the label is available when you run this statement in your application (the missing label name is: NCRNCSNCT)} {position: line: 1, column: 10, offset: 9} for query: 'MATCH (n:NCRNCSNCT) RETURN n'

I'm not actually issuing queries at this point, just creating the KGs for my docs.

@LarFii (Collaborator) commented Nov 19, 2024

You should check the output from the LLM during the extraction process. You can directly review the cache files to see if the output is as expected. This can help determine whether the issue lies in the LLM's response or elsewhere in the pipeline.
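The cache-file check can be sketched as a small script. This is a minimal sketch, not part of LightRAG: the file path and the exact JSON nesting are assumptions and may vary between versions, so the helper just walks whatever structure it finds and prints truncated string values.

```python
import json
from pathlib import Path

def dump_cache(obj, depth=0, limit=120, out=None):
    """Recursively collect (and print) truncated string values from the
    cache JSON, whatever its exact nesting looks like."""
    lines = out if out is not None else []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append("  " * depth + str(key))
            dump_cache(value, depth + 1, limit, lines)
    elif isinstance(obj, str):
        # Truncate long completions so each cached response fits on one line.
        lines.append("  " * depth + obj[:limit].replace("\n", " "))
    if out is None:
        print("\n".join(lines))
    return lines

# Hypothetical usage; point this at your actual working_dir:
#   cache = json.loads(Path("working_dir/kv_store_llm_response_cache.json").read_text())
#   dump_cache(cache)
```

Scanning the printed values for responses that contain no ("entity"<|>…) or relationship tuples is a quick way to tell whether the LLM or the pipeline is at fault.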

@rcoundon (Author)

Ok, thanks, I'll take a look and report back

@rcoundon (Author)

I've had a look at kv_store_llm_response_cache.json and vdb_relationships.json. There's a fair amount of data there, but I'm not sure what I'm looking at.

However, on the response cache I see this:

Given the technical and largely descriptive nature of the provided text, it's challenging to identify traditional entities like organizations, persons, geolocations, or events as defined by the constraints. However, I can focus on certain elements like terms related to the overall process described in the content:

  1. Entity Identification:
    None of the traditional entities (organization, person, geo, event) are clearly specified in the text provided.

  2. Relationships:
    Lacking traditional entities, there are no clear relationships to be defined among any potential entities.

However, focusing purely on elements present within the text, I can attempt to identify concepts or technical terms that may act as placeholders:

("entity"<|>"Component Mounting"<|>"event"<|>"The process of fixing components to the wall across various languages, depicted through diagrams and imagery.")##
("entity"<|>"Suction"<|>"event"<|>"Details about the aspiration or suction process via different alignments, like light shaft.")##
("entity"<|>"Aspiration via Light Shaft"<|>"event"<|>"Technical specifications regarding how aspiration is carried out using a light shaft in the mounting process.")##
("entity"<|>"Technical Diagrams"<|>"event"<|>"Imagery used to illustrate the process of component mounting and air control techniques.")##

Since the text does not contain clear, traditional entities, and relationships, the extraction remains limited to the terms and processes identified in the text. If there are further specific details or entities within additional context or a different section of text, please provide more clarity to enable a more refined extraction.

I suspect this is the source of the warning I reported. Would you agree? If so, is there a way to guide the LLM, when initiating this process, on how to establish entities and relationships?

@LarFii (Collaborator) commented Nov 21, 2024

Yes, I think that is the root cause of the issue. Modifying the entity types in prompt.py might help resolve it.
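For reference, a sketch of the kind of change this suggests. The variable and dict names below are assumptions about lightrag/prompt.py and may differ by version; the default types organization, person, geo, and event are taken from the LLM output quoted above, and the added types are hypothetical examples chosen to suit a technical manual.

```python
# In lightrag/prompt.py a module-level dict of prompt fragments already
# exists; it is stubbed here only so the sketch is self-contained.
PROMPTS: dict = {}

# The stock types cover organization/person/geo/event, which is why a
# technical document about component mounting yields no entities.
# Broadening the list lets the extraction prompt target domain concepts:
PROMPTS["DEFAULT_ENTITY_TYPES"] = [
    "organization", "person", "geo", "event",
    "component", "process", "specification",  # hypothetical additions
]
```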

@GaryDean

I came across this issue while processing a large number of files. The warning WARNING:lightrag:Didn't extract any relationships, maybe your LLM is not working only ever occurred with very small files (under 50 bytes) that, in fact, contained no relevant information. The solution was to remove such files.
