
Netflix-level subtitle cutting, translation, alignment, and even dubbing - a one-click, fully automated AI video subtitle team


chaoqunxie/VideoLingo

 
 


Connect the World, Frame by Frame


🌟 Overview

VideoLingo is an all-in-one video translation, localization, and dubbing tool aimed at generating Netflix-quality subtitles. It eliminates stiff machine translations and multi-line subtitles while adding high-quality dubbing, enabling global knowledge sharing across language barriers. With an intuitive Streamlit interface, you can transform a video link into a localized video with high-quality bilingual subtitles and dubbing in just a few clicks.

Key features:

  • 🎥 YouTube video download via yt-dlp

  • 🎙️ Word-level subtitle recognition with WhisperX

  • 📝 NLP and GPT-based subtitle segmentation

  • 📚 GPT-generated terminology for coherent translation

  • 🔄 2-step translation process rivaling professional quality

  • ✅ Netflix-standard single-line subtitles only

  • 🗣️ Dubbing alignment (e.g., GPT-SoVITS)

  • 🚀 One-click startup and output in Streamlit

  • 📝 Detailed logging with progress resumption

  • 🌐 Comprehensive multi-language support

Difference from similar projects: Single-line subtitles only, superior translation quality
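
To make the single-line constraint concrete, here is a minimal sketch of packing a sentence into subtitle lines that each stay under a character limit. It is not VideoLingo's segmentation method (the project lists NLP and GPT-based segmentation); it is a plain greedy word packer. The function name `split_single_line` and the 42-character default are assumptions chosen for illustration.

```python
MAX_CHARS = 42  # assumption: an illustrative per-line limit, not a VideoLingo setting


def split_single_line(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack words into lines of at most max_chars characters.

    A single word longer than max_chars is kept whole on its own line.
    """
    lines: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars or not current:
            # The word still fits, or the line is empty and must take it anyway.
            current = candidate
        else:
            # The word does not fit: close the current line and start a new one.
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    sentence = "VideoLingo turns a video link into a localized video with bilingual subtitles"
    for line in split_single_line(sentence, max_chars=30):
        print(f"{len(line):2d} | {line}")
```

Each returned line would then be shown as its own subtitle cue, so the viewer never sees a two-line block. A real segmenter would also prefer to break at clause boundaries, which this sketch ignores.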

🎥 Demo

  • Russian Translation (video: ru_demo.mp4)

  • GPT-SoVITS (video: sovits.mp4)

  • OAITTS (video: OAITTS.mp4)

Language Support:

Current input language support and examples:

Input Language | Support Level | Translation Demo
English | 🤩 | English to Chinese
Russian | 😊 | Russian to Chinese
French | 🤩 | French to Japanese
German | 🤩 | German to Chinese
Italian | 🤩 | Italian to Chinese
Spanish | 🤩 | Spanish to Chinese
Japanese | 😐 | Japanese to Chinese
Chinese* | 🤩 | Chinese to English

*Chinese requires a separately configured whisperX model and is only available with a local source code installation. See the installation documentation for the configuration steps, and be sure to set the transcription language to zh in the web page sidebar.

Translation language support depends on the capabilities of the large language model used, while dubbing language depends on the chosen TTS method.

🚀 Quick Start

Online Experience

Try VideoLingo in Colab in just 5 minutes: Open In Colab

Local Installation

VideoLingo offers two local installation methods: One-click Simple Package and Source Code Installation. Please refer to the installation documentation: English | Simplified Chinese

Docker Installation

VideoLingo provides a Dockerfile for Docker installation. Please refer to the installation documentation: English | Simplified Chinese

🏭 Batch Mode

Usage instructions: English | Simplified Chinese

⚠️ Current Limitations

  1. UVR5 voice separation is resource-intensive and slow. It is recommended only on devices with more than 16GB of RAM and 8GB of VRAM. Note: for videos with loud background music, skipping voice separation before whisper may cause adjacent words to stick together in the word-level subtitles, which leads to errors in the final alignment step.

  2. Dubbing quality may be imperfect because source and target languages differ in structure and in how much information each morpheme carries. For best results, choose a TTS whose speech rate is close to the original video's, taking the video's pace and content into account. The best practice is to train GPT-SoVITS on the original video's voice, then use "Mode 3: Use every reference audio" for dubbing. This gives the most consistent voice, speech rate, and tone. See the demo for the effect.

  3. When a video contains multiple languages, transcription retains only the main language. This is because whisperX uses a single-language model for forced alignment of word-level subtitles, so speech in languages that model does not recognize is dropped.

  4. Multi-character separate dubbing is currently unavailable. While whisperX has VAD potential, specific development is needed, and this feature is not yet implemented.
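
The speech-rate mismatch in limitation 2 can be pictured with a little arithmetic: if a dubbed line runs longer than the time slot of the original subtitle, it has to be played faster to fit. The sketch below is only an illustration of that idea, not VideoLingo's dubbing alignment; the helper name `speed_factor` and the 1.5x cap are hypothetical choices.

```python
def speed_factor(dub_seconds: float, slot_seconds: float, max_factor: float = 1.5) -> float:
    """Return the playback speed-up needed to fit a dubbed clip into its slot.

    1.0 means the clip already fits. The result is capped at max_factor
    (an assumed limit beyond which speech would sound unnatural).
    """
    if dub_seconds <= 0 or slot_seconds <= 0:
        raise ValueError("durations must be positive")
    needed = dub_seconds / slot_seconds
    # Never slow the clip down, and never exceed the cap.
    return min(max(needed, 1.0), max_factor)


if __name__ == "__main__":
    # (dubbed duration, original subtitle slot) in seconds, made-up values
    for dub, slot in [(1.0, 2.0), (2.4, 2.0), (5.0, 2.0)]:
        print(f"dub={dub}s slot={slot}s -> speed x{speed_factor(dub, slot):.2f}")
```

When the required factor hits the cap, the dubbed line still overflows its slot, which is one reason a TTS voice with a speech rate close to the original tends to give better results.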

🚗 Roadmap

  • VAD to distinguish speakers, multi-character dubbing
  • Customizable translation styles
  • User terminology glossary
  • Provide commercial services
  • Lip sync for dubbed videos

📄 License

This project is licensed under the Apache 2.0 License. When using this project, please follow these rules:

  1. When publishing works, it is recommended (not mandatory) to credit VideoLingo for subtitle generation.
  2. Follow the terms of the large language models and TTS used for proper attribution.
  3. If you copy the code, please include a full copy of the Apache 2.0 License.

We sincerely thank the following open-source projects for their contributions, which provided important support for the development of VideoLingo:

📬 Contact Us


If you find VideoLingo helpful, please give us a ⭐️!
