Real Time Whisper Transcription

This is a demo of real-time speech-to-text with OpenAI's Whisper model. It works by continuously recording audio in a thread and concatenating the raw bytes across successive recordings.
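The record-and-concatenate idea can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in transcribe_demo.py: the names fake_recorder, drain and pcm16_to_floats are hypothetical, the microphone is replaced by a fixed list of chunks, and 16-bit PCM input is an assumption.

```python
import queue
import threading
from array import array


def fake_recorder(chunks, out_queue):
    """Hypothetical stand-in for a microphone callback: push raw PCM chunks onto the queue."""
    for chunk in chunks:
        out_queue.put(chunk)


def drain(out_queue):
    """Concatenate every chunk currently waiting in the queue into one bytes object."""
    parts = []
    while True:
        try:
            parts.append(out_queue.get_nowait())
        except queue.Empty:
            break
    return b"".join(parts)


def pcm16_to_floats(raw):
    """Convert 16-bit signed PCM bytes (assumed format) to floats in [-1.0, 1.0)."""
    samples = array("h")
    samples.frombytes(raw)
    return [s / 32768.0 for s in samples]


# Two short "recordings" produced by a worker thread.
data_queue = queue.Queue()
chunks = [
    array("h", [0, 16384]).tobytes(),
    array("h", [-32768, 8192]).tobytes(),
]
worker = threading.Thread(target=fake_recorder, args=(chunks, data_queue))
worker.start()
worker.join()

# The main loop keeps appending new bytes to the audio gathered so far.
phrase_bytes = b""
phrase_bytes += drain(data_queue)

audio = pcm16_to_floats(phrase_bytes)
print(audio)  # [0.0, 0.5, -1.0, 0.25]
```

In a real setup the recorder thread would be fed by an audio library, and the float samples would be handed to the model each time new audio arrives.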

To install dependencies simply run

pip install -r requirements.txt

in an environment of your choosing.

Whisper also requires the command-line tool ffmpeg to be installed on your system, which is available from most package managers:

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg

For more information on Whisper please see https://github.com/openai/whisper

This fork adds automatic typing: transcribed text is typed into whichever text box currently has focus (for example, in a browser). It works reasonably well at the configured sample rate and buffer sizes. Please note that Whisper was not designed for real-time use, but it performs well enough to be extremely useful.
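One way to keep a focused text box in sync with a transcript that keeps changing is to type only the difference between what has already been typed and the latest result. The sketch below is an assumption about how this could be done, not a description of this fork's code: diff_for_typing and update_text_box are hypothetical names, and the two callbacks stand in for whatever keyboard-automation library is actually used.

```python
from typing import Callable, Tuple


def diff_for_typing(typed: str, latest: str) -> Tuple[int, str]:
    """Return (backspaces, text_to_type) that turns `typed` into `latest`."""
    common = 0
    for a, b in zip(typed, latest):
        if a != b:
            break
        common += 1
    return len(typed) - common, latest[common:]


def update_text_box(
    typed: str,
    latest: str,
    press_backspace: Callable[[], None],
    type_text: Callable[[str], None],
) -> str:
    """Apply the difference through the injected (hypothetical) keyboard callbacks."""
    backspaces, suffix = diff_for_typing(typed, latest)
    for _ in range(backspaces):
        press_backspace()
    if suffix:
        type_text(suffix)
    return latest


# Record the key events instead of sending them to a real window.
events = []
state = ""
state = update_text_box(state, "hello word", lambda: events.append("<BS>"), events.append)
state = update_text_box(state, "hello world", lambda: events.append("<BS>"), events.append)
print(events)  # ['hello word', '<BS>', 'ld']
```

Typing only the changed suffix avoids retyping the whole phrase each time the model revises its output.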

The code in this repository is public domain.

