SidoShiro/Speech2TextCLI

Transcribe Large Audiofile with Whisper

Setup

  1. Requires Python 3.10 and FFmpeg installed
  2. Recommended: create a venv and activate it
    python -m venv venv
    
    # Linux
    . venv/bin/activate
    # Windows
    . venv/Scripts/activate
  3. Install requirements
    # Make sure (venv) appears in the terminal prompt
    pip install -r requirements.txt
  4. If the ffmpeg install fails: uninstall and reinstall ffmpeg-python
    pip uninstall ffmpeg
    pip uninstall ffmpeg-python
    
    pip install ffmpeg-python
    

Run

With the venv activated:

Usage:

python app.py stuff.mp3

The result will be stored in stuff_transcribe_result.txt

Run with model param

  • Whisper model sizes are: "tiny", "small", "base", "medium", "large"
  • The model name is passed as the second argument
  • By default the large model is used

Run on tiny model:

python app.py stuff.mp3 tiny

Run on base model:

python app.py stuff.mp3 base
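The argument handling described above might look like the following sketch. The function and variable names are assumptions for illustration, not the repository's actual code:

```python
import sys

# Model sizes accepted by Whisper, per the list above
VALID_MODELS = {"tiny", "small", "base", "medium", "large"}

def parse_args(argv):
    """Return (audio_path, model_size) from CLI arguments like sys.argv."""
    if len(argv) < 2:
        raise SystemExit("usage: python app.py <audio.mp3> [model]")
    audio_path = argv[1]
    # The model is the optional second argument; "large" is the default
    model = argv[2] if len(argv) > 2 else "large"
    if model not in VALID_MODELS:
        raise SystemExit(f"unknown model {model!r}, expected one of {sorted(VALID_MODELS)}")
    return audio_path, model

if __name__ == "__main__":
    path, model = parse_args(sys.argv)
    print(f"transcribing {path} with the {model} model")
```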

Logic

  • Moviepy chunks the mp3 audio file into smaller audio chunks
  • The audio chunks are stored in ./tmp_chunks_audio_speach2text/
  • For each audio chunk (in order), run the Whisper model (60 seconds maximum per chunk)
  • Append all text results
  • Write the result to stuff_transcribe_result.txt
  • Print the result to stdout

Once the process is done, you can clean the tmp_chunks_audio_speach2text/ folder
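The chunking step above amounts to splitting the clip's duration into consecutive 60-second windows, which each get written to the tmp folder and transcribed in order. A minimal sketch of that boundary computation (the function name is an assumption, not the repository's code):

```python
def chunk_intervals(duration, chunk_len=60):
    """Return ordered (start, end) pairs, in seconds, covering `duration`
    with windows of at most `chunk_len` seconds."""
    intervals = []
    start = 0
    while start < duration:
        end = min(start + chunk_len, duration)
        intervals.append((start, end))
        start = end
    return intervals

# A 150-second file yields three chunks; the last one is shorter
print(chunk_intervals(150))  # [(0, 60), (60, 120), (120, 150)]
```

Each interval would then be extracted with moviepy and passed to the Whisper model, with the text results appended in the same order.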

About

Using Whisper from OpenAI
