kaldi-gop

Computes GOP (Goodness of Pronunciation) scores and performs forced alignment, based on Kaldi with nnet3 support. The acoustic model is trained on the LibriSpeech corpus (960 hours of data) using the scripts under kaldi/egs/librispeech.

How to build

Download Kaldi.
Put the folders under src into kaldi/src.
Compile the code the same way Kaldi itself is compiled (see kaldi/src/INSTALL).
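
The steps above could be scripted roughly as follows. This is a minimal sketch, not tooling shipped with this repository: `build_kaldi_gop`, its two arguments, and the `JOBS` variable are hypothetical names. It assumes Kaldi's tools have already been built as described in kaldi/tools/INSTALL, and it uses the `./configure --shared`, `make depend`, `make` sequence from kaldi/src/INSTALL. Check that file for your Kaldi version before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: build_kaldi_gop, its arguments, and JOBS are illustrative names.

build_kaldi_gop() {
  gop_root="$1"      # checkout of this repository (assumed path)
  kaldi_root="$2"    # checkout of Kaldi (assumed path)
  jobs="${JOBS:-4}"  # number of parallel make jobs

  [ -d "$gop_root/src" ] || { echo "missing $gop_root/src" >&2; return 1; }
  [ -d "$kaldi_root/src" ] || { echo "missing $kaldi_root/src" >&2; return 1; }

  # Step 2: put the folders under src into kaldi/src.
  for d in "$gop_root"/src/*/; do
    [ -d "$d" ] || continue
    cp -r "${d%/}" "$kaldi_root/src/" || return 1
  done

  # Step 3: build as described in kaldi/src/INSTALL.
  (
    cd "$kaldi_root/src" || exit 1
    ./configure --shared || exit 1
    make depend -j "$jobs" || exit 1
    make -j "$jobs"
  )
}
```

A call might look like `build_kaldi_gop /path/to/this/repo /path/to/kaldi`. Whether the top-level `make` picks up the copied folders depends on the Makefiles involved, so running `make` inside each copied folder may also be needed.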

Run the example

cd egs/gop-compute
./run.sh --dnn true/false audio_dir data_dir result_dir

See run.sh for the meaning of each argument.

Notes on data preparation

To use this tool, prepare the audio files (.wav) and their corresponding transcripts (.lab) and store them in the following layout:

.
├── ...
├── data_dir
│   ├── speaker1          # speaker ID
│   ├── speaker2
│   └── speaker3
│       ├── utt1.wav      # utterance ID
│       └── utt1.lab
└── ...

Do not use spaces in speaker folder names or utterance file names; use underscores instead. Make sure every speaker has a distinct folder name (speaker ID) and every audio file has a distinct file name (utterance ID).

To-do

Add GPU support
Convert alignment results to a readable format (TextGrid)
Add comparison between GMM and DNN (nnet3)
Add feature extraction script

chenchy/kaldi-dnn-ali-gop
