[FR]: Enable Opik to display additional media formats, including audio, PDF, and video. #567

pleomax0730 · 2024-11-06T01:11:53Z

Proposal summary

Feature Request

Enable Opik to display additional media formats, including audio, PDF, and video.

Background

Opik currently supports only image display, which limits its flexibility for monitoring, testing, and evaluating multimodal LLM applications that may involve other data formats. Expanding support for audio, PDF, and video would allow users to fully leverage Opik’s capabilities across a broader range of use cases.

Proposed Use Cases:

Audio Analysis: Track and evaluate audio-based LLM applications, such as voice transcription, voicebots, and sentiment analysis.
PDF Document Evaluation: Facilitate assessment of document parsing models, PDF summarization, and question-answering systems on document data.
Video Content Monitoring: Enhance capabilities for video-based LLM applications, such as video content analysis, summarization, and media captioning.

Benefits:

Enhanced Multimodal Support: Broadens Opik’s applicability to multimodal applications, aligning with the needs of teams working across diverse LLM applications.
Improved Traceability: Extends Opik’s tracing and feedback logging for all media types, ensuring comprehensive monitoring for complex, cross-modal projects.
Unified Platform: Allows users to manage all media formats under a single platform, streamlining workflows.

Summary

Adding support for audio, PDF, and video display will make Opik a more versatile platform, suitable for a wide range of LLM applications beyond text and images. This enhancement will empower users to develop, evaluate, and monitor their applications seamlessly across all media types.

Motivation

Many existing solutions for LLM evaluation and monitoring are limited to text and image formats, with little to no support for other media types like audio, PDFs, or video. This lack of multimodal support forces teams to use multiple tools or rely on custom workarounds, creating friction in their workflows and hindering a comprehensive evaluation process.

By introducing audio, PDF, and video support, Opik could become the first open-source platform to offer complete multimodal monitoring and evaluation capabilities. This would make Opik highly attractive to teams working on complex applications that require seamless integration of various data formats, such as multimedia retrieval, interactive voice systems, and document processing pipelines.

Competitive Advantage:

Leading the way with these features would position Opik as a go-to solution for multimodal LLM applications, setting it apart from other evaluation and monitoring tools. This could significantly increase Opik’s user base by attracting organizations and researchers who need comprehensive, media-agnostic monitoring for their LLM projects.

jverre · 2024-11-06T10:52:05Z

Hi @pleomax0730
Thanks for the detailed request! I really like this idea. We could introduce the concept of an attachment for a trace or span that allows you to log any additional data. Depending on the data type, we could then introduce ways to view the data in the UI.

Is there a specific type of data you would like us to support first? We have gotten requests for better PDF support, so that could be a good candidate to start with.
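
To make the attachment idea above concrete, here is a minimal sketch in Python. It is not the Opik SDK: `Attachment`, `Span`, `attach`, and `pick_viewer` are hypothetical names chosen for illustration, and the viewer names are placeholders. It only shows how a span could carry attachments and how the data type could select a viewer in the UI.

```python
import mimetypes
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attachment:
    """Hypothetical attachment record; not part of the Opik SDK."""
    file_name: str
    mime_type: str


@dataclass
class Span:
    """Hypothetical stand-in for a trace or span that can carry attachments."""
    name: str
    attachments: List[Attachment] = field(default_factory=list)

    def attach(self, file_name: str, mime_type: Optional[str] = None) -> Attachment:
        # Guess the type from the file extension when none is given,
        # falling back to a generic binary type.
        resolved = (
            mime_type
            or mimetypes.guess_type(file_name)[0]
            or "application/octet-stream"
        )
        attachment = Attachment(file_name=file_name, mime_type=resolved)
        self.attachments.append(attachment)
        return attachment


def pick_viewer(attachment: Attachment) -> str:
    """Map a MIME type to a (hypothetical) UI viewer name."""
    if attachment.mime_type == "application/pdf":
        return "pdf"
    top_level = attachment.mime_type.split("/", 1)[0]
    if top_level in ("image", "audio", "video"):
        return top_level
    # Unknown types are offered as a plain download instead of a preview.
    return "download"


span = Span(name="summarize-document")
span.attach("report.pdf", mime_type="application/pdf")
span.attach("call.wav", mime_type="audio/wav")
viewers = [pick_viewer(a) for a in span.attachments]
```

Keying the viewer on the MIME type rather than the file extension would let new formats be added one at a time, which fits an order such as PDF first, then audio, then video.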

pleomax0730 · 2024-11-06T12:10:07Z

@jverre Thanks for your reply to my feature request! In terms of the current community, I think the best order might be PDF, then audio, and finally video.

jverre · 2024-11-07T11:46:45Z

Makes sense, I'll keep this ticket open. It's quite a big feature, so we might not get to it straight away.

AHB102 · 2024-12-01T13:29:37Z

@jverre @pleomax0730 I'd like to contribute to the Documents part of this issue. Could you please provide some guidance on the specific tasks or features that need to be implemented? Would it involve tasks like document summarization or extraction of key information?

pleomax0730 · 2024-12-01T14:11:06Z

> @jverre @pleomax0730 I'd like to contribute to the Documents part of this issue. Could you please provide some guidance on the specific tasks or features that need to be implemented? Would it involve tasks like document summarization or extraction of key information?

Hi @AHB102, the way Langfuse tracks audio might be a good reference for implementing this feature. No summarization or extraction is needed; the goal is only to display the media or data as a reference attached to the trace.

For example, see the audio preview in the GENERATION section.

jverre · 2024-12-04T11:19:19Z

Hi @pleomax0730 @AHB102
We are starting to think about this feature, and I went ahead and wrote a short document describing how it would work in the SDK and in the frontend (FE): https://cometml.notion.site/Add-support-for-attachments-1527124010a38025a600cb7ea20ecacf

It's a pretty big initiative, so feel free to add comments here in case we have missed anything relevant.

AHB102 · 2024-12-06T06:02:25Z

@jverre Hello 🖐️, I read the doc and think starting with the SDK changes for docs sounds like a good plan. However, I’m still relatively new to software development, so I’m a bit lost in the codebase. Any specific docs or resources you’d recommend to help me get started?

jverre · 2024-12-09T09:46:56Z

@AHB102 This is a pretty big feature that touches the Python SDK, the backend, and the UI, so it might be a bit tricky as a first issue.

I'll create a couple of issues later today that are smaller in scope and tag them as good first issue. I recommend tackling one of those.

AHB102 · 2024-12-09T11:29:15Z

@jverre Yeah agreed, I'll definitely look into good first issues.

pleomax0730 added the enhancement New feature or request label Nov 6, 2024
