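import stable_whisper
import webvtt

model = stable_whisper.load_model('small')  # load the 'small' model
result = model.transcribe(file)
result.to_srt_vtt('audio.vtt', False, True)  # write the transcript to a VTT file
for caption in webvtt.read('audio.vtt'):
    # print each caption as: start text end
    print(caption.start + " " + caption.text + " " + caption.end)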
With the code above, transcription skips parts of the audio, and the skipped parts differ from file to file. For example, the output jumps from "00:00:30.920 yourself 00:00:32.440" straight to "00:01:00.000 too 00:01:00.200". Is there any way to fix this?
Try a higher value for no_speech_threshold (default: 0.6). Or set it to None to disable all skipping triggered by this threshold (do this only when the audio has no non-speech gaps longer than 30 seconds, or it will hallucinate for those gaps).
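For reference, a minimal sketch of passing that option. It assumes the model's transcribe method accepts no_speech_threshold as a keyword argument, as the suggestion above implies; transcribe_with_threshold is a hypothetical helper name used only for illustration, not part of stable_whisper.

```python
def transcribe_with_threshold(model, audio_path, no_speech_threshold=None):
    """Transcribe audio_path with an explicit no_speech_threshold.

    model is assumed to be the object returned by stable_whisper.load_model().
    Passing None disables skipping triggered by this threshold; a float such
    as 0.8 raises the bar above the 0.6 default mentioned above.
    """
    return model.transcribe(audio_path, no_speech_threshold=no_speech_threshold)


# Hypothetical usage (requires stable_whisper and an audio file):
# import stable_whisper
# model = stable_whisper.load_model('small')
# result = transcribe_with_threshold(model, 'song.mp3', no_speech_threshold=None)
# result.to_srt_vtt('audio.vtt', False, True)
```

Taking the model as an argument keeps the helper independent of how the model was loaded, so the same call can be retried with several threshold values without reloading it.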
Hi, I really appreciate your feedback; however, it still does not work even when I set no_speech_threshold to None. For another song that I'm working on, it skips from "00:00:01.740 people 00:00:02.160" to "00:00:31.000 Sometimes 00:00:31.500" when there's around 10 seconds of pure music and 20 seconds of music + vocals. Is there a way to work around that?
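To pin down exactly where the skips happen in a given file, something like the sketch below might help. It only does arithmetic on the caption timestamps and makes no assumptions about stable_whisper itself; to_ms, find_gaps, and the 5-second default are hypothetical choices for illustration.

```python
def to_ms(timestamp):
    """Convert a WebVTT timestamp such as '00:00:30.920' to milliseconds."""
    hours, minutes, seconds = timestamp.split(":")
    whole, _, frac = seconds.partition(".")
    millis = int(frac.ljust(3, "0")[:3])  # pad or trim the fraction to 3 digits
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 1000 + millis


def find_gaps(captions, min_gap_ms=5000):
    """Return (previous_end, next_start, gap_ms) for every gap >= min_gap_ms.

    captions is a list of (start, end) timestamp strings in playback order.
    """
    gaps = []
    for previous, current in zip(captions, captions[1:]):
        gap = to_ms(current[0]) - to_ms(previous[1])
        if gap >= min_gap_ms:
            gaps.append((previous[1], current[0], gap))
    return gaps


# The two jumps reported in this thread:
first = [("00:00:30.920", "00:00:32.440"), ("00:01:00.000", "00:01:00.200")]
second = [("00:00:01.740", "00:00:02.160"), ("00:00:31.000", "00:00:31.500")]
print(find_gaps(first))   # [('00:00:32.440', '00:01:00.000', 27560)]
print(find_gaps(second))  # [('00:00:02.160', '00:00:31.000', 28840)]
```

With webvtt, the input list could be built as [(c.start, c.end) for c in webvtt.read('audio.vtt')]. Both reported jumps resume exactly on a 30-second boundary (00:01:00.000 and roughly 00:00:30), which is worth including when reporting the problem.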