Non-reactive chat memory crashes response streaming #1120
Comments
Thanks for reporting. I will have a look next week |
That's a good point. We may have to switch to worker thread when dealing with the memory, or introduce a reactive API. |
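For illustration only, the worker-thread idea can be sketched in plain JDK Java. This is not the Quarkus LangChain4j implementation or a proposed fix: `BlockingMemoryStore`, `InMemoryStore` and `OffloadingMemoryStore` are hypothetical names made up for this sketch, and the store interface is a simplified stand-in rather than LangChain4j's `ChatMemoryStore`. The point is only that each blocking store call is handed to a worker pool and exposed as a `CompletableFuture`, so the calling thread never waits on the store.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a blocking chat memory store (NOT the LangChain4j interface). */
interface BlockingMemoryStore {
    List<String> getMessages(Object memoryId);

    void updateMessages(Object memoryId, List<String> messages);
}

/** In-memory implementation, present only to keep the sketch runnable without Redis. */
class InMemoryStore implements BlockingMemoryStore {
    private final Map<Object, List<String>> data = new ConcurrentHashMap<>();

    @Override
    public List<String> getMessages(Object memoryId) {
        return data.getOrDefault(memoryId, List.of());
    }

    @Override
    public void updateMessages(Object memoryId, List<String> messages) {
        data.put(memoryId, List.copyOf(messages));
    }
}

/** Hypothetical wrapper: runs every store call on a worker pool instead of the caller's thread. */
class OffloadingMemoryStore {
    private final BlockingMemoryStore delegate;
    private final ExecutorService workers;

    OffloadingMemoryStore(BlockingMemoryStore delegate, ExecutorService workers) {
        this.delegate = delegate;
        this.workers = workers;
    }

    /** The blocking read happens on a worker thread; the caller gets a future. */
    CompletableFuture<List<String>> getMessages(Object memoryId) {
        return CompletableFuture.supplyAsync(() -> delegate.getMessages(memoryId), workers);
    }

    /** The blocking write happens on a worker thread; the future completes when it is done. */
    CompletableFuture<Void> updateMessages(Object memoryId, List<String> messages) {
        return CompletableFuture.runAsync(() -> delegate.updateMessages(memoryId, messages), workers);
    }
}
```

A caller would create the wrapper with something like `Executors.newFixedThreadPool(2)` and chain on the returned futures instead of calling the store directly. In a real Quarkus application the worker pool and the reactive types would presumably come from the framework rather than from a hand-rolled executor.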
IIRC Quarkus has a reactive Redis client, but I have no idea how to use it while staying compatible with LangChain4j's ChatMemoryStore interface. BTW, the same issue also occurs when using a RetrievalAugmentor, because the embedding store is queried in a blocking fashion. |
Ah, yes, that's a good one. I encountered the same issue last week and, to work around it, did not use reactive in the Switching to a worker thread for handling memory operations seems a good approach. |
We definitely need to fix this as the current situation is not good at all. @florian-h05 can you attach a small sample I can use to test various ideas against? |
I have updated my first message with a ChatMemoryProvider bean, a config and a Docker Compose file for the Redis store. Is that enough, or should I spin up a small sample project repo? BTW, I am in fact not using |
Please, do. Thanks |
@geoand I have a minimal sample project to reproduce the issue: https://github.com/florian-h05/quarkus-langchain4j-1120 |
🙏🏽 |
I'll have a closer look tomorrow, but for the time being any solution we come up with won't be pretty |
The simplest way around it for the time being is to add |
I still receive the same exception, even though Quarkus Dev UI confirms that the request should be executed on a worker thread:
|
It works just fine in this simplified scenario:
diff --git a/src/main/java/com/florianhotze/quarkus/langchain4j/Assistant.java b/src/main/java/com/florianhotze/quarkus/langchain4j/Assistant.java
index c2b81ef..93dd8c9 100644
--- a/src/main/java/com/florianhotze/quarkus/langchain4j/Assistant.java
+++ b/src/main/java/com/florianhotze/quarkus/langchain4j/Assistant.java
@@ -1,13 +1,10 @@
package com.florianhotze.quarkus.langchain4j;
-import dev.langchain4j.service.MemoryId;
-import dev.langchain4j.service.TokenStream;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
+import io.smallrye.mutiny.Multi;
@RegisterAiService
public interface Assistant {
- TokenStream chat(@UserMessage String prompt);
-
- TokenStream chat(@MemoryId Object memoryId, @UserMessage String prompt);
+ Multi<String> chat(@UserMessage String prompt);
}
diff --git a/src/main/java/com/florianhotze/quarkus/langchain4j/RestResource.java b/src/main/java/com/florianhotze/quarkus/langchain4j/RestResource.java
index 2dc030e..e9a355b 100644
--- a/src/main/java/com/florianhotze/quarkus/langchain4j/RestResource.java
+++ b/src/main/java/com/florianhotze/quarkus/langchain4j/RestResource.java
@@ -1,5 +1,6 @@
package com.florianhotze.quarkus.langchain4j;
+import io.smallrye.common.annotation.Blocking;
import io.smallrye.mutiny.Multi;
import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
@@ -37,18 +38,8 @@ class RestResource {
@Content(
mediaType = MediaType.SERVER_SENT_EVENTS,
schema = @Schema(implementation = String.class)))
+ @Blocking
public Multi<String> streamingPrompt(String prompt) {
- Multi<String> sourceMulti =
- Multi.createFrom()
- .emitter(
- emitter ->
- assistant
- .chat(prompt)
- .onNext(emitter::emit)
- .onError(emitter::fail)
- .onComplete(response -> emitter.complete())
- .start());
-
- return RestMulti.fromMultiData(sourceMulti).withDemand(1).status(200).build();
+ return RestMulti.fromMultiData(assistant.chat(prompt)).withDemand(1).status(200).build();
}
}
diff --git a/src/main/resources/application.properties b/src/main/resources/application.properties
index 85ce614..633366c 100644
--- a/src/main/resources/application.properties
+++ b/src/main/resources/application.properties
@@ -1,2 +1,2 @@
-quarkus.langchain4j.memorystore.redis.client-name=chat-memory
-quarkus.redis.chat-memory.hosts=redis://localhost:6379/0
+#quarkus.langchain4j.memorystore.redis.client-name=chat-memory
+#quarkus.redis.chat-memory.hosts=redis://localhost:6379/0 |
I think it's because you added the |
Yes, I know. That's my point: for the time being, adding @Blocking on the SSE endpoint should get around the problem |
I have applied your diff in the reproducer repo, but it did not fix the issue for me. I don't understand why not. |
Thanks for pushing your changes. |
When using response streaming in a REST resource, I always receive the following exception when using quarkus-langchain4j-memory-store-redis:
A full reproduction project can be found at https://github.com/florian-h05/quarkus-langchain4j-1120.
How can I fix that? Is there a way to have a reactive chat memory store?
I am using Quarkus 3.17.0 with Quarkus LangChain4j 0.22.0.