Idea: postprocess results with high compression ratio #411

Open
dgoryeo opened this issue Nov 2, 2024 · 2 comments

dgoryeo commented Nov 2, 2024

Hi @jianfch ,

I came across this PR suggestion in the main Whisper community: Add compression_ratio_hallucination_threshold.
I thought maybe stable-ts could do this as part of its enhancement/refinement, for example by redoing those segments or discarding them when they're above the threshold.

Just a thought.

Thanks

jianfch (Owner) commented Nov 3, 2024

The redo is already handled by compression_ratio_threshold.

compression_ratio_threshold : float, default 2.4
If the gzip compression ratio is above this value, treat as failed.

if (
    compression_ratio_threshold is not None
    and decode_result.compression_ratio > compression_ratio_threshold
):
    needs_fallback = True  # too repetitive
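
For context on what that number measures, here is a minimal sketch of a compression ratio check. It assumes the ratio is the UTF-8 byte length of the text divided by its zlib-compressed length, which is how I understand upstream Whisper computes it; the sample strings are made up for illustration.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw UTF-8 size divided by zlib-compressed size; higher means more repetitive."""
    text_bytes = text.encode("utf-8")
    return len(text_bytes) / len(zlib.compress(text_bytes))


# Illustrative inputs (not taken from a real transcription).
normal = "The quick brown fox jumps over the lazy dog."
looping = "thank you for watching " * 40

print(round(compression_ratio(normal), 2))
print(round(compression_ratio(looping), 2))
```

Text that loops on the same phrase compresses very well, so its ratio lands far above the default 2.4, while an ordinary sentence stays well below it.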

Currently, there's no parameter that directly removes those segments, but you can do something like this:

# iterate in reverse so removals don't shift the segments still to be visited
for seg in reversed(result):
    if seg.compression_ratio > 3.0:
        result.remove_segment(seg)
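
The same filtering idea can be tried without a model or audio file. The sketch below uses a hypothetical `FakeSegment` stand-in (not a stable-ts class) and a hypothetical helper `drop_repetitive`; it also assumes a segment might have no recorded ratio and keeps it in that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeSegment:
    """Hypothetical stand-in for a segment; only the fields used here."""
    text: str
    compression_ratio: Optional[float]


def drop_repetitive(segments: List[FakeSegment], threshold: float = 3.0) -> List[FakeSegment]:
    """Return the segments whose compression ratio is at or below the threshold."""
    kept = []
    for seg in segments:
        ratio = seg.compression_ratio
        # Assumption: keep segments with no recorded ratio rather than guess.
        if ratio is None or ratio <= threshold:
            kept.append(seg)
    return kept


segments = [
    FakeSegment("Hello there.", 0.9),
    FakeSegment("la la la la la la la la", 4.2),
    FakeSegment("No ratio recorded.", None),
]
kept = drop_repetitive(segments)
print([s.text for s in kept])
```

Building a new list sidesteps the mutate-while-iterating problem entirely, at the cost of not modifying the result object in place.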

jianfch (Owner) commented Nov 27, 2024

After commit 08421e2, this can be done with one line or with just an argument:

result.custom_operation('compression_ratio', '>', 3.0, 'remove', False)
# or
result.regroup('co=compression ratio+>+3.0+remove+0')

Those segments should be removed before other regroup methods move things around:

result = model.transcribe(..., regroup='co=compression ratio+>+3.0+remove+0')
# or to follow it up with default regrouping
result = model.transcribe(..., regroup='co=compression ratio+>+3.0+remove+0_da')

For CLI: --regroup "co=compression ratio+>+3.0+remove+0_da"
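
If the threshold needs to vary, the regroup string can be assembled rather than typed by hand. This is only a sketch: `build_regroup` is a hypothetical helper, not part of stable-ts, and it simply reproduces the string format shown above.

```python
def build_regroup(threshold: float = 3.0, follow_with_default: bool = False) -> str:
    """Build the regroup string that removes segments above a compression ratio.

    Hypothetical helper; it only mirrors the format used in this thread.
    """
    algo = f"co=compression ratio+>+{float(threshold)}+remove+0"
    if follow_with_default:
        # "_da" chains the default regrouping after the removal step.
        algo += "_da"
    return algo


print(build_regroup())
print(build_regroup(3.0, follow_with_default=True))
```

The returned string can then be passed as the `regroup` argument or to `--regroup` on the command line.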
