Using OpenAI's Whisper to generate transcripts of Twitter-embedded videos. Since this is built on youtube-dl, it also works for YouTube videos.
It makes the most sense to run this in Google Colab, though you can download it and run it locally if you really want to. You don't need an Nvidia card to query the model, but running it on your CPU will be very slow. Highly recommend using Colab unless you have an Nvidia GPU with CUDA installed.
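
For reference, here is a minimal sketch of the overall flow (download the audio with youtube-dl, then transcribe it with Whisper). It assumes the `youtube_dl` and `openai-whisper` Python packages plus ffmpeg are available (Colab ships with ffmpeg); the URL, output filename, and model size are placeholders.

```python
import whisper
import youtube_dl

VIDEO_URL = "https://twitter.com/someuser/status/1234567890"  # placeholder URL

# Download only the audio track and convert it to mp3 (requires ffmpeg).
ydl_opts = {
    "format": "bestaudio/best",
    "outtmpl": "audio.%(ext)s",
    "postprocessors": [{
        "key": "FFmpegExtractAudio",
        "preferredcodec": "mp3",
    }],
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download([VIDEO_URL])

# Load a Whisper model and transcribe the downloaded audio.
# "base" is a small model; larger ones ("small", "medium", "large") are more
# accurate but noticeably slower, especially on CPU.
model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])
```

Whisper automatically uses the GPU when CUDA is available, which is why a Colab GPU runtime (or a local CUDA setup) makes such a large difference in transcription speed.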