Hot fix for missed mp3 files issue.
Update faster-whisper to the latest version. Add models.py for model info.
- Fixed issue #60.
- Change default parameters for new faster-whisper.
- Update installation guide in README.
Code refactoring, documentation updates, and minor bug fixes.
- Fixed issue #54.
- Changed the default LLM model to `gpt-4o-mini`.
- Minor improvements on prompts.
- Improve post-optimizations for subtitles.
Minor bugfixes and improvements.
- Fix default check_format returning False issue.
- Fix edge cases for srt file generation.
- Prepare translation evaluator for benchmarking.
This update adds Gemini model support for translation.
- Support Gemini as translation engine.
- Extract the `check_format` section into a standalone `validators.py` module.
- Add stop_sequences support for Chatbot.
- Use stop sequences when building Context.
- Also remove generated .wav files from videos if `clear_temp=True`.
- Minor improvements on Context Reviewer prompt.
Minor bugfixes and improvements.
- Improve timestamp accuracy.
- Fix edge cases for transcription.
- Add glossary support for domain-specific translation.
- Add `retry_model` args for retrying translation with different models.
- Introduce `provider: model_name` for arbitrary model routing.
- Extend each suitable sentence by 0.5s.
- Reduce noise suppression chunk-size for faster processing.
- Enhance translation workflow by Context Reviewer Agent.
- Add custom endpoint (base_url) support for OpenAI & Anthropic.
- Generate bilingual subtitles.
- Fix dep issues from ctranslate2 and streamlit-related packages.
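
To illustrate the `provider: model_name` routing form listed above, here is a minimal sketch of how such a string might be split into its two parts. The function name `parse_model_route` and the `default_provider` argument are hypothetical and are not taken from the project's code.

```python
from typing import Optional, Tuple


def parse_model_route(
    spec: str, default_provider: Optional[str] = None
) -> Tuple[Optional[str], str]:
    """Split a 'provider: model_name' string into (provider, model_name).

    Hypothetical helper for illustration only; not the project's actual API.
    """
    # Split on the first colon only, so model names that themselves
    # contain a colon stay intact.
    provider, sep, model = spec.partition(":")
    if not sep:
        # No explicit provider was given; fall back to the caller's default.
        return default_provider, spec.strip()
    return provider.strip(), model.strip()
```

Splitting on the first colon only is a design assumption; it keeps a model name such as `llama3:8b` whole when a provider prefix is present.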
Add basic GUI support via streamlit.
- Add clear_temp_folder args.
This update adds Claude model support for translation.
This update improves the translation quality.
- Update faster-whisper version to 1.0.0.
- Set hallucination_silence_threshold to 2, which alleviates the hallucination issue.
- Add proxy argument.
This update fixes minor issues.
- Remove watermark in srt and lrc.
- Improve logging.
This update adds minor features.
- Bilingual subtitle support (Beta).
- Improve translation prompt.
This update fixes minor issues.
- Fix an issue that prevented the use of whisper-large-v3.
- Remove tags from translation texts.
This update introduces new features.
- Resume from previous translation.
- Add atomic translation for src-trans inconsistency.
- Update default whisper model to `whisper-large-v3`.
This update improves preprocessing efficiency and includes minor changes.
- Introduce multiprocessing for loudness normalization.
- Fix the .srt generation issue for video input.
- Add preprocess options, which users can tune.
This update addresses minor issues.
- Split audio during noise suppression to avoid out-of-memory.
- Improve translation prompt.
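
The chunked noise suppression listed above can be sketched as follows: processing a long signal one fixed-size piece at a time bounds peak memory use. This is a generic illustration, not the project's implementation; `iter_chunks`, `process_in_chunks`, and the `denoise` callable are hypothetical names.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def iter_chunks(
    samples: Sequence[float], chunk_size: int
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, chunk) pairs that together cover the whole input."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(samples), chunk_size):
        yield start, samples[start:start + chunk_size]


def process_in_chunks(
    samples: Sequence[float],
    chunk_size: int,
    denoise: Callable[[Sequence[float]], Sequence[float]],
) -> List[float]:
    """Apply `denoise` (a stand-in for the real model) chunk by chunk."""
    output: List[float] = []
    for _, chunk in iter_chunks(samples, chunk_size):
        # Only one chunk is handed to the model at a time.
        output.extend(denoise(chunk))
    return output
```

A smaller chunk size lowers peak memory at the cost of more model invocations.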
This update adds a preprocessor to enhance input audio (loudness normalization & noise suppression).
- Loudness Normalization from ffmpeg-normalize
- Noise Suppression from DeepFilterNet
- Now all the intermediate files are saved in `./path/to/audio/preprocess`.
This update switches the underlying transcription model from whisperx to faster-whisper, which enables VAD parameter tuning.
- Switch whisperx back to faster-whisper for VAD parameter tuning.
- Update translation prompt from https://github.com/machinewrapped/gpt-subtrans/commit/82bd2ca0d868f209d0e0c5f7c04255523daabe3c.
- Change the default parameters of `faster-whisper` for consistent transcription.
Emergency bugfix release.
This update adds input video support and introduces context configuration.
- Add `word_align` and `sentence_split` for non-word-boundary languages to split long text into sentences.
- Add text normalization to help match sentences.
- Add skip-translation support.
- Use `pathlib` to handle paths.
- Improve timeline accuracy.
This update adds input video support and introduces context configuration.
- Add input video support.
- Add context configuration for inputs.
- Add test suites to CI.
- Add language detection for translated content.
- Improve prompt by adding background info.
- Update punctuator model.
- Replace `opencc` with the more lightweight `zhconv`.
This update improves the timeline consistency of translated subtitles. Thanks gpt-subtrans!
- Fix misaligned timeline issue by improving translation prompt.
- Add output srt format support.
- Add configurable temperature and top_p parameters for GPTBot.
- Report total OpenAI translation fee for multiple audios.
- Improve repeat-checking algorithm.
This update enhances the efficiency of processing multiple audio files.
- Implementation of a producer-consumer model to process multiple audio files.
- Update logger with colored format.
- Minor parameter modification that makes the timeline of translation more intuitive.
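
The producer-consumer model listed above can be sketched with a queue that hands results from one stage to the next, so the second stage starts on the first file while the first stage is still working on later ones. This is a generic illustration using the standard library; `run_pipeline`, `transcribe`, and `translate` are hypothetical names, not the project's API.

```python
import queue
import threading

# Unique marker telling the consumer that no more items will arrive.
_SENTINEL = object()


def run_pipeline(paths, transcribe, translate):
    """Transcribe files in a producer thread while the caller translates."""
    handoff = queue.Queue()
    results = []

    def producer():
        try:
            for path in paths:
                handoff.put((path, transcribe(path)))
        finally:
            # Always signal completion, even if transcription fails,
            # so the consumer loop below cannot block forever.
            handoff.put(_SENTINEL)

    worker = threading.Thread(target=producer)
    worker.start()

    # Consumer: handle each transcript as soon as it becomes available.
    while True:
        item = handoff.get()
        if item is _SENTINEL:
            break
        path, transcript = item
        results.append((path, translate(transcript)))

    worker.join()
    return results
```

With a single producer and a FIFO queue, results come back in the same order as the input paths.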
This update significantly improves translation quality, but at the cost of slower translation speed.
- Use multi-step prompt for translation.
- Update the default model to `gpt-3.5-turbo-16k`.
- Automatically fix json encoder error using GPT.
- Calculate the accurate price for OpenAI API requests.
This update greatly improves the quality of transcription (both in time-alignment and text-quality).
- Use `whisperx` to improve transcription accuracy.
- Add Traditional Chinese to Mandarin optimization when `target_lang=zh-cn`.
- Update build tool to poetry.
- Use async calls to communicate with the OpenAI API.
- Abstract the GPT communication module as `GPTBot`.
- Add fee limit for GPTBot.