- Use OpenAI Whisper to generate subtitles for videos, and use DeepL to translate them.
- Use Silero VAD to detect voice activity, which speeds up transcription and helps avoid Whisper hallucinations.
- Supported languages for speech transcription: https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L113
- Supported languages for translation: https://www.deepl.com/docs-api/translating-text/request/
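To illustrate why voice activity detection reduces the amount of audio handed to Whisper, here is a minimal sketch. It assumes the VAD step has already produced a list of (start, end) speech timestamps in seconds. `merge_speech_segments`, `speech_duration`, and the `max_gap` parameter are hypothetical names chosen for illustration; they are not part of WhisperSRT or Silero VAD, and the tool's actual logic may differ.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def merge_speech_segments(segments: List[Segment], max_gap: float = 0.5) -> List[Segment]:
    """Merge speech segments separated by silence shorter than max_gap seconds."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def speech_duration(segments: List[Segment]) -> float:
    """Total seconds of audio that would actually be transcribed."""
    return sum(end - start for start, end in segments)


# Hypothetical VAD output: two nearby utterances, then a long silence.
speech = merge_speech_segments([(0.0, 1.0), (1.2, 2.0), (5.0, 6.0)])
print(speech, speech_duration(speech))
```

Only the merged speech regions would be transcribed, so long silent stretches, where hallucinated text tends to appear, are skipped entirely.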
git clone https://github.com/rufuszhu/WhisperSRT.git
cd WhisperSRT
pip install .
Install ffmpeg
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
# on Arch Linux
sudo pacman -S ffmpeg
# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
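# to run on an NVIDIA GPU, reinstall PyTorch with CUDA 11.7 support
pip uninstall torch
pip cache purge
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
# check that PyTorch can see the GPU (prints True if CUDA is available)
python -c "import torch; print(torch.cuda.is_available())"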
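The same CUDA check can be wrapped in a small helper that falls back to the CPU. This is a minimal sketch; `pick_device` is a hypothetical name and not a function provided by WhisperSRT.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build most likely lacks CUDA support.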
Get a DeepL API key at https://www.deepl.com/pro#developer (it's free).
Save your API key in an environment variable named DEEPL_API_KEY.
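As a sketch of how a script might read that variable and fail early when it is missing, consider the helper below. `get_deepl_api_key` is a hypothetical name for illustration and is not taken from the WhisperSRT source.

```python
import os


def get_deepl_api_key() -> str:
    """Read the DeepL key from the DEEPL_API_KEY environment variable."""
    key = os.environ.get("DEEPL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DEEPL_API_KEY is not set; export it before translating.")
    return key
```

Checking for the key before transcription starts avoids finishing a long Whisper run only to fail at the translation step.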
The following command generates Chinese subtitles for the video file video.mp4, which contains Korean speech, and saves the result to video.srt:
whisperSrt path/to/video.mp4 -t --lang=ko --dest=zh
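For readers unfamiliar with the .srt format that the command writes, the sketch below shows how timed text segments map to SRT cues. It assumes each segment is a (start, end, text) tuple with times in seconds; `format_timestamp` and `to_srt` are hypothetical helpers, not WhisperSRT functions.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    # Each block ends with a newline, so joining with "\n" leaves a blank line between cues.
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello"), (2.0, 3.25, "World")]))
```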
The same command also works on a folder: it generates subtitles for every video file in the folder, including subfolders:
whisperSrt path/to/folder -t --lang=ko --dest=zh
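A recursive search of this kind can be sketched with `pathlib`. The extension set below is an assumption for illustration; the list of formats WhisperSRT actually recognises may differ, and `find_videos` is a hypothetical name.

```python
from pathlib import Path
from typing import List

# Assumed set of extensions; the real tool may recognise a different list.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi", ".webm"}


def find_videos(root: str) -> List[Path]:
    """Return video files under root, descending into subfolders."""
    base = Path(root)
    if base.is_file():
        # A single file is accepted too, mirroring the single-video command.
        return [base] if base.suffix.lower() in VIDEO_EXTENSIONS else []
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
```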
The following command translates an existing Chinese srt file into an English srt file:
whisperSrt path/to/video.srt -tr --lang=zh --dest=en
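Translating an existing subtitle file amounts to rewriting only the text of each cue while keeping its index and timing. The sketch below assumes cues are separated by blank lines and takes the translator as a plain callable; `translate_srt` is a hypothetical helper, and the `translate` argument stands in for the DeepL call, which is not shown here.

```python
from typing import Callable


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate subtitle text while leaving indices and timestamps untouched."""
    out_blocks = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 3:
            out_blocks.append(block)  # malformed or empty cue: keep as-is
            continue
        index, timing, text = lines[0], lines[1], "\n".join(lines[2:])
        out_blocks.append(f"{index}\n{timing}\n{translate(text)}")
    return "\n\n".join(out_blocks) + "\n"


# Stand-in translator for demonstration; a real one would call a translation service.
sample = "1\n00:00:00,000 --> 00:00:01,000\nhello\n"
print(translate_srt(sample, str.upper))
```

Passing the translator as a callable keeps the parsing logic independent of any particular translation backend.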
Use whisperSrt -h to see all available options.
- Use ChatGPT to translate
- Add UI for macOS