This tool uses Mozilla DeepSpeech as an example to demonstrate some of the generic components of a speech-audio annotation tool, including an annotation and model inference/training loop that semi-automates the annotation process.
This repo contains the back-end component of the annotation system; the front-end component, speech-audio-annotation-ui, is also required to use the tool.
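The annotation and inference loop mentioned above can be sketched roughly as follows. This is an illustrative outline only: `transcribe` and `review` are hypothetical placeholders standing in for model inference (e.g. DeepSpeech) and the human-review step, not functions from this repo.

```python
def transcribe(audio_path):
    # Placeholder for a real model-inference call that proposes a transcript.
    return "proposed transcript for " + audio_path

def annotate(audio_paths, review):
    """Pre-fill each clip with a model transcript, then let a human review it."""
    corrected = []
    for path in audio_paths:
        draft = transcribe(path)      # model proposes a transcript
        final = review(path, draft)   # human corrects (or accepts) the draft
        corrected.append((path, final))
    # The corrected (audio, transcript) pairs can then feed back into training.
    return corrected

pairs = annotate(["clip1.wav"], review=lambda p, d: d.upper())
print(pairs)  # → [('clip1.wav', 'PROPOSED TRANSCRIPT FOR CLIP1.WAV')]
```

Each pass through the loop should shrink the amount of manual transcription needed, since the reviewer only corrects the model's drafts.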
- Clone the project git repo.
- Download and unzip the pre-trained DeepSpeech model to `<project_root_dir>/outputs`:

  ```
  cd <project_root_dir>/outputs
  curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/deepspeech-0.6.1-models.tar.gz
  tar xvf deepspeech-0.6.1-models.tar.gz
  ```
- Clone the DeepSpeech repo to `<project_root_dir>/models/deepspeech` (only required if model training is needed):

  ```
  git clone https://github.com/mozilla/DeepSpeech.git <project_root_dir>/models/deepspeech
  ```
- Build and run `docker-compose` in `<project_root_dir>` to bring up the containers:

  ```
  docker-compose up --build
  ```
- Clone the UI project repo outside the back-end project repo:

  ```
  git clone https://github.com/francesliang/speech-audio-annotation-ui
  ```
- Build and run `docker-compose` in the UI project's root directory to bring up its containers:

  ```
  docker-compose up --build
  ```
- The URL of the UI should be `localhost:3000`.
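DeepSpeech's `stt` API expects 16 kHz, 16-bit mono PCM audio, so clips submitted for inference need to be in that format. The sketch below uses only the Python standard library to read such samples from a WAV file; it writes a tiny synthetic clip first so it is self-contained. The final `Model(...).stt(...)` call is shown only in a comment, since it requires the `deepspeech` package and the downloaded model.

```python
import struct
import wave

def read_pcm16(path):
    """Return the raw 16-bit samples of a mono WAV file as a list of ints."""
    with wave.open(path, "rb") as w:
        assert w.getsampwidth() == 2 and w.getnchannels() == 1
        frames = w.readframes(w.getnframes())
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))

# Write a tiny 16 kHz mono clip so the example runs without external files.
with wave.open("clip.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # 2 bytes per sample = 16-bit PCM
    w.setframerate(16000)
    w.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

samples = read_pcm16("clip.wav")
print(samples)  # → [0, 1000, -1000, 0]
# With the `deepspeech` package installed, these samples (as a numpy int16
# array) would be passed to Model(...).stt(...) to produce a draft transcript.
```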