-
Notifications
You must be signed in to change notification settings - Fork 877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add afterCompletion
callback option for runTools
to enable easily building multi-model / multi-agent flows
#1064
base: master
Are you sure you want to change the base?
Conversation
@@ -119,19 +120,19 @@ export class Completions extends APIResource { | |||
runTools< | |||
Params extends ChatCompletionToolRunnerParams<any>, | |||
ParsedT = ExtractParsedContentFromParams<Params>, | |||
>(body: Params, options?: Core.RequestOptions): ChatCompletionRunner<ParsedT>; | |||
>(body: Params, options?: RunnerOptions): ChatCompletionRunner<ParsedT>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type has been wrong such that it causes a TS error when you try to pass maxCompletions
. Will Stainless auto-gen remove my change? If so, what's the right way to fix this? It's been driving me nuts 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for fixing! No the generator will leave this change in :)
Would you mind opening a separate PR with this change so we can get it merged faster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! Will update now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afterCompletion
callback option for runTools
to enable easily building multi-model / multi-agent flowsafterCompletion
callback option for runTools
to enable easily building multi-model / multi-agent flows
Thanks so much for the detailed PR! I'll take this back to the team. |
afterCompletion: async () => { | ||
if (!shouldInjectMessage) { | ||
runner._addMessage({ | ||
role: 'system', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is just an example, but I think we'd want this to be an assistant message also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm a bit unclear on this: I've been using system
for these kinds of messages where I "inject" important context or attempt to give the model a strong nudge to go in a different direction or update its instructions partway through a chat, and that seems to work well for me. (I also display all assistant
messages in my FE by default and wouldn't want to display this message, but I could find a way around that if needed.) Could I potentially get better results in such cases by using the assistant
role and changing up how I word the message content accordingly? I'm consistently getting the model to respond intelligently to inline web research using my current approach, but I'm curious and honestly I've probably neglected the assistant
role a bit when it comes to manually inserted messages!
I think this is an interesting idea, to build in the ability to incrementally add to the context window into the API. It's certainly possible to do this with the API alone today, but a bit more manual. I think we'd need to workshop the API design a bit, but you could see a feature like this being a nice DX optimization. To be transparent, it's unlikely we'll be able to analyze this in depth until later in the year, but happy to keep this conversation going. You could also build this as a standalone module on npm that uses the SDK as a dependency - might be an easier way to iterate on the DX independently? |
Yes, good call to iterate on this as a separate npm package in the interim! I can just pass in my instance of Also yes, totally possible by using the API directly, but I've become addicted to Anyhow, |
Changes being requested
I'd like to propose adding an
afterCompletion
hook as an option torunTools
. This enables building powerful multi-model and multi-agent functionality easily by providing a simple entrypoint at which you can analyzerunner.messages
and optionally carry out a separate LLM flow and then inject one or more messages (or modify messages) in order to, for example, inject some targeted web research to help your agent overcome a problem it's encountering.I've updated tests, added an example, and ensured everything is correctly typed.
My loom linked below shows how I'm using this to great effect in Otto Engineer, an autonomous coding system.
Additional context & links
Note that I tried to get this to work via tool definitions and prompting but got very poor results. I could almost never get the model to perform web research, and when I did, the prompt and tools had to be shaped so aggressively around this goal that it degraded performance of everything else. So I started thinking of ways to decompose this into separate problems and entirely remove web research from the prompt/tools so the model would not have to think about it and could focus purely on the already complex task of implementing and testing code changes.
I came up with an idea to periodically inspect the chat thread during a runTools call, and I initially attempted to do it via tricky Promise resolution combined with
runner.abort
+ starting a newrunTools
call and modifying messages, but it was painfully complex and near impossible to correctly update the chat messages without accidentally injecting my web research message in the wrong place (prior to tool call outputs). (Controlling promises via the emitted events is hazardous as, of course, they can happen out of sequence, and it's hard to make sure you've captured all in progress tool calls / messages before injecting a message.)Here's a demo loom where I show how I use this in my app.
Thank you
I absolutely love having static types around your API and all the goodness this library gives me! I get a TON of mileage out of
runTools
and want to contribute and help this library thrive! Thank you 🙏🏻