Download subtitles from YouTube as plain text.
pip install justsubs
justsubs gBnLl3QBOdM --list
justsubs gBnLl3QBOdM > sarno.txt
justsubs --help
- Determine which captions or subtitles are available for a video;
- Download the VTT file;
- Extract text from the VTT file [1].
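The last step, turning a VTT file into plain text, can be illustrated with a minimal sketch. This is not the code justsubs uses; the function name `vtt_to_text` and the regular expressions are assumptions made for illustration. The sketch skips the header and NOTE/STYLE/REGION blocks, drops cue timing lines, strips inline tags, and collapses consecutive repeated lines, which are common in auto-generated captions. It does not handle optional cue identifiers, which would be kept as text.

```python
import re

# Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000 align:start" (hours are optional).
TIMING = re.compile(
    r"^(\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)
# Inline markup such as <c>, </c> or word-level timestamps like <00:00:01.500>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Hypothetical helper: return caption text from a WebVTT string."""
    out = []
    at_block_start = True   # True right after a blank line or at file start
    skipping = False        # True while inside a header/NOTE/STYLE/REGION block
    for raw in vtt.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current block.
            at_block_start = True
            skipping = False
            continue
        if skipping:
            continue
        if at_block_start and line.startswith(("WEBVTT", "NOTE", "STYLE", "REGION")):
            # Skip the whole metadata block up to the next blank line.
            skipping = True
            continue
        at_block_start = False
        if TIMING.match(line):
            continue
        text = TAG.sub("", line).strip()
        # Drop a line that repeats the previously kept line.
        if text and (not out or out[-1] != text):
            out.append(text)
    return "\n".join(out)
```

Calling `vtt_to_text(open("captions.vtt", encoding="utf-8").read())` on a downloaded file (the file name here is only an example) would return the caption text with one caption line per output line.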
pip install justsubs
Latest:
git clone https://github.com/epogrebnyak/justsubs.git
cd justsubs
pip install -e .
from justsubs import Video
video = Video("KzWS7gJX5Z8")
video.list_subs()
From the output above you may need a language identifier like en-uYU-mmqFLq8; the default is en.
subtitles = Video("KzWS7gJX5Z8").vtt(language="en-uYU-mmqFLq8")
subtitles.download()
print(subtitles.text()[:500])
from justsubs import get_text
text = get_text(video_id="KzWS7gJX5Z8", language="en-uYU-mmqFLq8")
print(text[:500])
- A popular package for YouTube subtitles is https://github.com/jdepoix/youtube-transcript-api.
- Whisper lets you run speech recognition locally.
Footnotes
1. VTT conversion is based on a gist by glasslion.