
[FR]: "Run/Evaluate - Compare - Edit Prompt" UI loop #871

Open
StoyanStAtanasov opened this issue Dec 11, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@StoyanStAtanasov

StoyanStAtanasov commented Dec 11, 2024

Proposal summary

A way to start a Python script that waits until the UI triggers an experiment: a wait/callback function on the Python Opik() client that starts the experiment from the UI after the prompt has been edited, e.g. client.wait_on(my_evaluation_fn), or an Evaluation class with a run function.

A possible place for the UI is the experiments page: a Run button and a Change Prompt button.
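A minimal sketch of what the proposed client-side shape could look like. This is illustrative only: the `Evaluation` class, its `run` method, and the `score_prompt` task are hypothetical and not part of the current Opik SDK.

```python
from typing import Callable

# Hypothetical sketch only: no such class exists in the Opik SDK today.
class Evaluation:
    """Wraps a user-defined evaluation task so a UI trigger (or a local
    caller) can re-run it with an edited prompt."""

    def __init__(self, task: Callable[[str], dict]):
        self.task = task

    def run(self, prompt: str) -> dict:
        # The UI would pass the edited prompt here before each run.
        return self.task(prompt)

def score_prompt(prompt: str) -> dict:
    # Stand-in for a real evaluation: LLM calls, metrics, scoring, etc.
    return {"prompt": prompt, "score": len(prompt)}

evaluation = Evaluation(score_prompt)
result = evaluation.run("Summarize the ticket in one sentence.")
```

Each press of the Run button in the UI would map to one `evaluation.run(edited_prompt)` call.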

Motivation

The loop of run/evaluate, then compare with previous runs, then edit the prompt is the core of the process. Currently there is no way to trigger this loop from the UI.

@StoyanStAtanasov StoyanStAtanasov added the enhancement New feature or request label Dec 11, 2024
@jverre
Collaborator

jverre commented Dec 11, 2024

We are actually planning to build functionality to run the entire evaluation through the UI, which I think might be an even nicer experience.

However, to achieve this, we are planning to support prompt templates being evaluated against a dataset and scored using either one of Opik's metrics or a custom one. Would that work for your use case, or is your evaluation task more complex than a single LLM call?

@StoyanStAtanasov
Author

Your plan will work for a lot of cases, and that's how we started, but now we are looking into more complicated scenarios involving DB queries and multiple LLM calls. You could do both. My proposal is simple to implement: the code will wait and poll (HTTP), or wait for a message (WebSocket), and then execute a callback.
Of course there are many ways to implement LLM application functionality, but that will take time; in the meantime you could just augment your API and add a UI trigger.
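The poll-and-callback idea above can be sketched with only the standard library. Everything here is an assumption for illustration: `wait_on`, the `poll_trigger` hook (which would stand in for an HTTP GET against a run-queue endpoint), and the convention that the trigger returns the edited prompt, or None when no run is pending.

```python
import time
from typing import Callable, Optional

def wait_on(evaluation_fn: Callable[[str], None],
            poll_trigger: Callable[[], Optional[str]],
            interval: float = 5.0,
            max_polls: int = 10) -> int:
    """Hypothetical sketch: poll until the UI signals a run
    (poll_trigger returns the edited prompt), then execute the
    callback. Returns the number of runs executed."""
    runs = 0
    for _ in range(max_polls):
        # In a real client this would be an HTTP GET (or a WebSocket
        # receive) against an Opik endpoint; here it is injected.
        prompt = poll_trigger()
        if prompt is not None:
            evaluation_fn(prompt)
            runs += 1
        time.sleep(interval)
    return runs
```

A WebSocket variant would replace the polling loop with a blocking receive, but the callback contract stays the same.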

@jverre
Collaborator

jverre commented Dec 12, 2024

Good point @StoyanStAtanasov

@alexkuzmik This could be a nice idea, what do you think ?
