How accurate are our subtitle timestamps as compared to whisperX and Whisper timestamped? #376
Replies: 2 comments
-
The approach they are referring to was used in version 1.X. Stable-ts switched to Dynamic Time Warping in 2.X (not long after Whisper introduced it), which is more reliable and accurate than the old technique. I don't know how Stable-ts compares to WhisperX, but it generally produces better results than default Whisper because it is essentially Whisper with additional preprocessing and postprocessing aimed at improving the output. Ideally, you would benchmark them to quantify the difference and to check whether one is actually more accurate than the other.
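To make the benchmarking suggestion concrete, here is a minimal sketch of how word-level timestamps from any tool could be scored against a hand-aligned reference. It does not call Stable-ts, WhisperX, or whisper-timestamped. The function names `timestamp_errors` and `summarize`, the `(word, start, end)` tuple layout, and the 0.1 s tolerance are all assumptions made for illustration, and it assumes both lists contain the same words in the same order.

```python
from statistics import mean, median


def timestamp_errors(reference, predicted):
    """Return per-word (start_error, end_error) pairs in seconds.

    Both arguments are lists of (word, start, end) tuples.
    Assumption: the two lists contain the same words in the same order.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    errors = []
    for (ref_word, ref_start, ref_end), (word, start, end) in zip(reference, predicted):
        if ref_word.lower() != word.lower():
            raise ValueError(f"word mismatch: {ref_word!r} vs {word!r}")
        errors.append((abs(start - ref_start), abs(end - ref_end)))
    return errors


def summarize(errors, tolerance=0.1):
    """Aggregate the errors; `tolerance` (seconds) is an arbitrary threshold."""
    starts = [start_error for start_error, _ in errors]
    ends = [end_error for _, end_error in errors]
    return {
        "mean_start_error": mean(starts),
        "median_start_error": median(starts),
        "mean_end_error": mean(ends),
        # Fraction of words whose start time is within the tolerance.
        "within_tolerance": sum(s <= tolerance for s in starts) / len(starts),
    }


if __name__ == "__main__":
    # Made-up timings for illustration only, not real measurements.
    reference = [("hello", 0.0, 0.5), ("world", 0.5, 1.0)]
    predicted = [("hello", 0.0, 0.5), ("world", 0.75, 1.0)]
    print(summarize(timestamp_errors(reference, predicted)))
```

Running the same scoring over the output of each tool on the same audio would give comparable numbers. In practice the transcripts will differ between tools (numerals, for example), so a text alignment step would be needed before this comparison applies.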
-
Just from eyeballing it, WhisperX seemed to produce much better results than Stable-ts.
-
I need really high accuracy in subtitles. I tried using the Montreal Forced Aligner, but it was taking way too much time. I really liked WhisperX, but its inability to handle numerals is disappointing. However, I really liked its timestamp accuracy.
The whisper-timestamped notes mention that you use an inaccurate method to extract timestamps. How about using a phoneme-based aligner? Or the Montreal Forced Aligner itself? Timestamp accuracy is really needed.