Any options to prevent "flickering" #355
There are no options that directly extend timestamps, but you can prevent "flickering" with regrouping methods. However, if you want to preserve the segments as they are and only change the timestamps, you'll need something like this:

    import stable_whisper

    model = stable_whisper.load_model('base')
    result = model.transcribe('audio.mp3')

    # Close gaps of 100 ms or less by extending each segment to the start of the next one.
    for i, segment in enumerate(result):
        if i+1 == len(result):
            break  # the last segment has no successor
        next_start = result[i+1].start
        if next_start - segment.end <= 0.100:
            segment.end = next_start
Likewise, there are no options for this, because all the options in stable-ts are centered around squeezing each timestamp to its word. So something like the script above should do.
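For the padding case (starting segments slightly earlier and ending them slightly later), here is a minimal sketch of one way to do it. It is not part of stable-ts: `Seg`, `pad_segments`, `lead`, and `tail` are hypothetical names, and the sketch assumes only that each segment has writable `start` and `end` attributes in seconds. Padding is capped at the midpoint of the silence between neighbors so that segments never overlap.

```python
from dataclasses import dataclass


@dataclass
class Seg:
    """Hypothetical stand-in for a subtitle segment (times in seconds)."""
    start: float
    end: float
    text: str = ""


def pad_segments(segments, lead=0.2, tail=0.3):
    """Extend each segment into the surrounding silence without overlaps.

    lead: seconds to show a segment before its original start.
    tail: seconds to keep a segment after its original end.
    """
    # Snapshot the original boundaries so earlier edits do not affect later ones.
    starts = [s.start for s in segments]
    ends = [s.end for s in segments]
    last = len(segments) - 1

    for i, seg in enumerate(segments):
        # Earliest allowed start: midpoint of the gap to the previous segment.
        earliest = (ends[i - 1] + starts[i]) / 2 if i > 0 else 0.0
        # Never move a start later than it originally was.
        seg.start = min(starts[i], max(starts[i] - lead, earliest))

        if i < last:
            # Latest allowed end: midpoint of the gap to the next segment.
            latest = (ends[i] + starts[i + 1]) / 2
            # Never move an end earlier than it originally was.
            seg.end = max(ends[i], min(ends[i] + tail, latest))
        else:
            seg.end = ends[i] + tail
    return segments


if __name__ == "__main__":
    demo = [Seg(1.0, 2.0, "a"), Seg(2.2, 3.0, "b"), Seg(5.0, 6.0, "c")]
    for seg in pad_segments(demo):
        print(f"{seg.start:.2f} -> {seg.end:.2f}  {seg.text}")
```

If the segment objects in your result expose `start` and `end` the same way as in the script above, the same function could be tried on them directly, but that is an assumption to verify against your installed version.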
@jianfch Thanks a lot for the answer and an example!
Still, I think stable-ts itself would really benefit from having this functionality, since end users need it. This library and my script helped me a lot to get a good subtitle draft (much better than YouTube's auto-subs). Unfortunately, I still had to manually adjust many timings. Can you suggest the best settings (I guess, |
Note that |
Let's say one segment ends at 4.7s and the next one starts at 4.8s. The first subtitle then disappears slightly before the next one appears, so no subtitle is shown for a brief moment: "flickering". When consecutive segments are "close enough", I would prefer to hold the previous segment a bit longer and/or start the following segment a bit earlier, so that one subtitle is swapped directly for the next. Is this possible?
Related question: is it possible to begin all segments (which do have some space before/after them) a bit earlier and to end them a bit later, thus making the timings less exact, but allowing the viewer more reading time?
P.S. Thanks for the library! It works great.
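The gap-closing behavior asked about here can also be sketched without stable-ts, on plain data, which makes the 4.7s/4.8s example easy to check. This is only an illustration: `Cue`, `close_small_gaps`, `max_gap`, and `split` are hypothetical names, not library API. With `split=True` the gap is shared between both neighbors (the "and/or start the following segment a bit earlier" variant); otherwise only the earlier cue is extended.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical stand-in for a subtitle cue (times in seconds)."""
    start: float
    end: float


def close_small_gaps(cues, max_gap=0.1, split=False):
    """Remove gaps of at most max_gap seconds between consecutive cues."""
    eps = 1e-9  # tolerance for floating-point rounding in the gap comparison
    for cur, nxt in zip(cues, cues[1:]):
        gap = nxt.start - cur.end
        if 0 < gap <= max_gap + eps:
            if split:
                # Meet in the middle: hold the earlier cue longer
                # and start the later cue earlier.
                mid = cur.end + gap / 2
                cur.end = mid
                nxt.start = mid
            else:
                # Hold the earlier cue until the later one starts.
                cur.end = nxt.start
    return cues


if __name__ == "__main__":
    cues = [Cue(4.0, 4.7), Cue(4.8, 6.0), Cue(7.0, 8.0)]
    for cue in close_small_gaps(cues):
        print(cue)
```

Larger gaps, such as the one second of silence before the last cue in the demo, are left untouched, so real pauses in speech still clear the screen.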