
[FR]: "Run/Evaluate - Compare - Edit Prompt" UI loop #871

Open
StoyanStAtanasov opened this issue Dec 11, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@StoyanStAtanasov

StoyanStAtanasov commented Dec 11, 2024

Proposal summary

A way to start a Python script that waits until the UI triggers an experiment: a wait/callback function on the Python Opik() client that starts the experiment from the UI after the prompt has been edited, e.g. client.wait_on(my_evaluation_fn), or an Evaluation class with a run function.

A possible place for the UI is the experiments page: a Run button and a Change Prompt button.
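A minimal sketch of what the proposed client-side shape could look like. This is illustrative only: the `Evaluation` class, its `run` method, and the `score_prompt` task are hypothetical and not part of the current Opik SDK.

```python
from typing import Callable

# Hypothetical sketch only: no such class exists in the Opik SDK today.
class Evaluation:
    """Wraps a user-defined evaluation task so a UI trigger (or a local
    caller) can re-run it with an edited prompt."""

    def __init__(self, task: Callable[[str], dict]):
        self.task = task

    def run(self, prompt: str) -> dict:
        # The UI would pass the edited prompt here before each run.
        return self.task(prompt)

def score_prompt(prompt: str) -> dict:
    # Stand-in for a real evaluation: LLM calls, metrics, scoring, etc.
    return {"prompt": prompt, "score": len(prompt)}

evaluation = Evaluation(score_prompt)
result = evaluation.run("Summarize the ticket in one sentence.")
```

Each press of the Run button in the UI would map to one `evaluation.run(edited_prompt)` call.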

Motivation

The loop of run/evaluate, then compare with previous runs, then edit the prompt is the core of the process. Currently there is no way to trigger this loop from the UI.

@StoyanStAtanasov StoyanStAtanasov added the enhancement New feature or request label Dec 11, 2024
@jverre
Collaborator

jverre commented Dec 11, 2024

We are actually planning to build functionality to run the entire evaluation through the UI, which I think might be an even nicer experience.

However, to achieve this, we are planning to support prompt templates being evaluated against a dataset and scored using either one of Opik's metrics or a custom one. Would that work for your use case, or is your evaluation task more complex than a single LLM call?

@StoyanStAtanasov
Author

Your plan will work for a lot of cases, and that's how we started, but now we are looking into more complicated scenarios involving DB queries and multiple LLM calls. You could do both. My proposal is simple to implement: the code will wait and poll (HTTP), or wait for a message (WebSocket), and then execute a callback.
Of course there are many ways to implement LLM application functionality, but that will take time; in the meantime you could just augment your API and add a UI trigger.
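The poll-and-callback idea above can be sketched with only the standard library. Everything here is an assumption for illustration: `wait_on`, the `poll_trigger` hook (which would stand in for an HTTP GET against a run-queue endpoint), and the convention that the trigger returns the edited prompt, or None when no run is pending.

```python
import time
from typing import Callable, Optional

def wait_on(evaluation_fn: Callable[[str], None],
            poll_trigger: Callable[[], Optional[str]],
            interval: float = 5.0,
            max_polls: int = 10) -> int:
    """Hypothetical sketch: poll until the UI signals a run
    (poll_trigger returns the edited prompt), then execute the
    callback. Returns the number of runs executed."""
    runs = 0
    for _ in range(max_polls):
        # In a real client this would be an HTTP GET (or a WebSocket
        # receive) against an Opik endpoint; here it is injected.
        prompt = poll_trigger()
        if prompt is not None:
            evaluation_fn(prompt)
            runs += 1
        time.sleep(interval)
    return runs
```

A WebSocket variant would replace the polling loop with a blocking receive, but the callback contract stays the same.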

@jverre
Collaborator

jverre commented Dec 12, 2024

Good point @StoyanStAtanasov

@alexkuzmik This could be a nice idea, what do you think ?
