docker-whisperX

This is the Docker image for WhisperX: Automatic Speech Recognition with Word-Level Timestamps (and Speaker Diarization)

Get the Dockerfile at GitHub, or pull the image from ghcr.io.

Available Image Tags

Warning

Due to the excessively large file sizes (40GB+), continuous integration cannot be set up for these images. As a result, they will not update automatically.
Please build them manually if they are outdated.

The image tags are formatted as WHISPER_MODEL-LANG, for example, tiny-en, base-de, or large-v2-zh.
Please note that not all combinations have been uploaded.

You can find all available tags at ghcr.io.

There is also a no_model tag, also published as latest, that does not include any pre-downloaded models.
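The tag convention can be sketched in shell as follows. This is only an illustration: `image_tag` is a hypothetical helper that is not part of this repository, and `IMAGE` is a placeholder, since the full image path is not spelled out here and should be taken from the ghcr.io listing.

```shell
#!/bin/sh
# Compose an image tag following the WHISPER_MODEL-LANG convention.
# image_tag is a hypothetical helper, not part of this repository.
image_tag() {
    model="$1"   # e.g. tiny, base, large-v2
    lang="$2"    # e.g. en, de, zh
    printf '%s-%s\n' "$model" "$lang"
}

# IMAGE is a placeholder (assumption): replace it with the image path listed on ghcr.io.
IMAGE="${IMAGE:-ghcr.io/OWNER/IMAGE}"

# Print the pull command instead of running it, so the sketch has no side effects.
echo "docker pull ${IMAGE}:$(image_tag large-v2 zh)"
```

Once `IMAGE` points at the real image path, the printed command can be run as-is, provided that the tag has actually been uploaded.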

Building the Docker Image

Important

Clone the Git repository recursively to include submodules:
git clone --recursive https://github.com/jim60105/docker-whisperX.git

Build Arguments

The Dockerfile builds an image that contains the pre-downloaded models. It accepts two build arguments: LANG and WHISPER_MODEL.

LANG: The language to transcribe. The default is en. See here for supported languages.
WHISPER_MODEL: The model name. The default is base. See faster-whisper for supported models.

Build Command

For example, to build the image with the ja language and the large-v2 model:

docker build --build-arg LANG=ja --build-arg WHISPER_MODEL=large-v2 -t whisperx:large-v2-ja .
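Since the prebuilt images are not refreshed automatically, it may be convenient to generate the build command for several model and language pairs at once. The following is a minimal sketch: `build_cmd` is a hypothetical helper, the pairs in the loop are examples only, and the commands are printed rather than executed. The shell variable is named `lang` rather than `LANG` on purpose, because `LANG` also controls the shell's locale.

```shell
#!/bin/sh
# Print (not run) one build command per model/language pair.
# build_cmd is a hypothetical helper; its flags mirror the build command above.
build_cmd() {
    model="$1"
    lang="$2"
    echo "docker build --build-arg LANG=${lang} --build-arg WHISPER_MODEL=${model} -t whisperx:${model}-${lang} ."
}

# Example pairs only; edit the list to match the images you need.
for pair in "base en" "large-v2 ja"; do
    # Split "model lang" into two positional parameters.
    set -- $pair
    build_cmd "$1" "$2"
done
```

Piping the output to `sh` from the repository root would run the builds one after another.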

Usage Command

Mount the current directory as /app and run WhisperX with additional input arguments:

docker run --gpus all -it -v ".:/app" whisperx:large-v2-ja -- --output_format srt audio.mp3

Note

Remember to put -- before the WhisperX arguments.
The --model and --language arguments are already set in the Dockerfile, so there is no need to specify them.
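To transcribe several files with the same image, the usage command can be wrapped in a loop. This is a sketch under assumptions: `run_cmd` is a hypothetical helper, `interview.mp3` is a made-up file name, and the commands are printed rather than executed. File names containing spaces would need extra quoting.

```shell
#!/bin/sh
# Print (not run) a docker run command for one audio file.
# run_cmd is a hypothetical helper; its flags mirror the usage command above.
run_cmd() {
    image="$1"
    file="$2"
    # Everything after "--" is passed through to WhisperX as its own arguments.
    echo "docker run --gpus all -it -v \".:/app\" ${image} -- --output_format srt ${file}"
}

# audio.mp3 comes from the example above; interview.mp3 is illustrative.
for f in audio.mp3 interview.mp3; do
    run_cmd whisperx:large-v2-ja "$f"
done
```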

LICENSE

The main program, WhisperX, is distributed under the BSD-4 license.
Please refer to the git submodules for their respective source code licenses.

The Dockerfile from this repository is licensed under MIT.
