Pulse · m-bain/whisperX · GitHub

February 17, 2025 – February 24, 2025

Overview

2 Active pull requests

29 Active issues

1 Pull request merged by 1 person

Added Phoneme-Based ASR Model for Tagalog
#1067 merged Feb 23, 2025

1 Pull request opened by 1 person

Add jr, sr, and ph.d to punkt abbreviations
#1053 opened Feb 18, 2025

26 Issues closed by 1 person

RuntimeError: Model has been downloaded but the SHA256 checksum does not not match. Please retry loading the model
#689 closed Feb 19, 2025
Model has been downloaded but the SHA256 checksum does not not match
#389 closed Feb 19, 2025
New English ASR model Distil-Whisper
#550 closed Feb 19, 2025
Support for stdin and stdout
#403 closed Feb 19, 2025
simple split audio file example using whisperx
#437 closed Feb 19, 2025
Is it possible to use insanely-fast-whisper instead of faster-whisper?
#614 closed Feb 19, 2025
Distil-Whisper support?
#558 closed Feb 19, 2025
How do I create a huggingface space of the current version of whisperx?
#631 closed Feb 19, 2025
Where are the example/sample wav files?
#325 closed Feb 19, 2025
Always crashes at alignment stage
#198 closed Feb 19, 2025
Hallucination causes failure to align - uncleaned input in whisper dataset
#230 closed Feb 19, 2025
Linux install issues
#242 closed Feb 19, 2025
Add Arabic models
#247 closed Feb 19, 2025
Somewhat unclear instructions (Readme.md) regarding alignment model size
#256 closed Feb 19, 2025
Alignment Issue?
#184 closed Feb 19, 2025
huggingface error
#261 closed Feb 19, 2025
Sentence based timestamps
#142 closed Feb 19, 2025
out of sync for part of the file
#151 closed Feb 19, 2025
FileNotFoundError
#925 closed Feb 19, 2025
WhisperX can Generate the N-best (top few) hypotheses?
#921 closed Feb 19, 2025
Using Whisperx using Nvidia Triton
#938 closed Feb 19, 2025
Are there any Fork Projects currently in progress?
#934 closed Feb 19, 2025
Question about Min-Cut Operation and Silero Model Handling of Long Segments
#1015 closed Feb 19, 2025
Why perform speaker diarization at the end
#1043 closed Feb 19, 2025
whisperX vs fasterWhisper
#1037 closed Feb 19, 2025
Can't transcribe the mp3 or wav audio in the format given.
#1052 closed Feb 19, 2025

3 Issues opened by 3 people

Feature Request: Add token-based authentication support for downloading the base model in whisperx.load_model
#1068 opened Feb 24, 2025
Comparing WhisperX and Faster-Whisper on RunPod: Speed, Accuracy, and Optimization
#1066 opened Feb 22, 2025
Real Time Diarization for Streaming Audio Chunks in Custom ASR Pipeline
#1065 opened Feb 21, 2025

12 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

make sure the leading and trailing word boundary exists.
#1019 commented on Feb 21, 2025 • 4 new comments
pip install whisperx results in installation of torch >2.0.0
#1051 commented on Feb 18, 2025 • 0 new comments
New warnings in v3.3.1 acceptable?
#998 commented on Feb 19, 2025 • 0 new comments
Feature Request: Whisper Tensorrt-llm backend support
#624 commented on Feb 19, 2025 • 0 new comments
tensors used as indices must be long, int, byte or bool tensors
#1048 commented on Feb 19, 2025 • 0 new comments
cpu utilisation maxes at 50% (conda?)
#890 commented on Feb 20, 2025 • 0 new comments
OOM with --diarize on long audio (4 hours)
#951 commented on Feb 22, 2025 • 0 new comments
Can't get cuda to work.
#983 commented on Feb 22, 2025 • 0 new comments
Diarization too slow
#274 commented on Feb 22, 2025 • 0 new comments
Could not load library libcudnn_ops_infer.so.8. on newest version.
#967 commented on Feb 24, 2025 • 0 new comments
Force Alignment with original text
#1009 commented on Feb 24, 2025 • 0 new comments
Speaker embeddings
#952 commented on Feb 19, 2025 • 0 new comments