Speech Model Pre-training for End-to-End Spoken Language Understanding

This repo contains the code for the paper "Speech Model Pre-training for End-to-End Spoken Language Understanding".

If you have any questions about this code or have problems getting it to work, please send me an email at <the email address listed for me in the paper>.

Dependencies

PyTorch, numpy, soundfile, pandas, tqdm, textgrid.py

Usage

First, open the config file for the experiment you want to run (for example, experiments/no_unfreezing.cfg) and set asr_path and/or slu_path to the directories where the LibriSpeech data and/or the Fluent Speech Commands data are stored on your computer.
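If you would rather set these paths from a script than by hand, the following is a minimal sketch. It assumes the .cfg files are INI-style files that Python's configparser can read, which is not confirmed here. The helper name set_data_paths is hypothetical and not part of this repo. It updates asr_path and slu_path in whichever section they appear, so it does not depend on the section names.

```python
import configparser


def set_data_paths(cfg_path, asr_path=None, slu_path=None):
    """Update asr_path / slu_path wherever they appear in an INI-style .cfg file.

    Hypothetical helper (not part of this repo). Assumes the config file can be
    parsed by configparser. Returns the (section, key) pairs that were changed.
    """
    # interpolation=None so that '%' characters in paths are kept literally
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cfg_path):
        raise FileNotFoundError(cfg_path)

    updates = {"asr_path": asr_path, "slu_path": slu_path}
    changed = []
    for section in parser.sections():
        for key, value in updates.items():
            if value is not None and parser.has_option(section, key):
                parser.set(section, key, value)
                changed.append((section, key))

    # Note: configparser does not preserve comments when writing the file back
    with open(cfg_path, "w") as f:
        parser.write(f)
    return changed
```

Because configparser drops comments on write, keep a copy of the original .cfg if it contains notes you want to preserve.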

SLU training: To train the model on Fluent Speech Commands, run the following command:

python main.py --train --config_path=<path to .cfg>
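To train several experiments in a row, you could build one such command per config file. This is only a sketch: the helper train_commands is hypothetical, and it assumes the .cfg files sit directly inside the experiments folder, as experiments/no_unfreezing.cfg does. It uses only the --train and --config_path flags shown above.

```python
import sys
from pathlib import Path


def train_commands(experiments_dir="experiments"):
    """Build one `main.py --train` command per .cfg file in experiments_dir.

    Hypothetical helper (not part of this repo). Each command is returned as an
    argument list suitable for subprocess.run.
    """
    configs = sorted(Path(experiments_dir).glob("*.cfg"))
    return [
        [sys.executable, "main.py", "--train", f"--config_path={cfg}"]
        for cfg in configs
    ]


# Example usage, run from the repo root (left commented out on purpose):
# import subprocess
# for cmd in train_commands():
#     subprocess.run(cmd, check=True)
```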

ASR pre-training: The experiment folders in this repo already contain a pre-trained LibriSpeech model that you can use. LibriSpeech is large (>100 GB uncompressed), so skip this part unless you want to re-run pre-training with different hyperparameters. If you do, first download our LibriSpeech alignments here, put them in a folder called "text", and put the LibriSpeech audio in a folder called "audio". To pre-train the model on LibriSpeech, run the following command:

python main.py --pretrain --config_path=<path to .cfg>
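Since pre-training runs on a large dataset, it can help to confirm the folder layout before starting. The sketch below checks that the "text" and "audio" folders exist. It assumes both folders sit directly under the directory that asr_path points to, which is not stated above. The helper name check_librispeech_layout is hypothetical.

```python
from pathlib import Path


def check_librispeech_layout(asr_path):
    """Return the names of the expected subfolders that are missing.

    Hypothetical helper (not part of this repo). Assumes "text" (alignments)
    and "audio" (LibriSpeech audio) live directly under asr_path.
    """
    root = Path(asr_path)
    return [name for name in ("text", "audio") if not (root / name).is_dir()]
```

An empty list means both folders were found. Otherwise the list names the ones to create or move into place.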

Citation

If you find this repo or our Fluent Speech Commands dataset useful, please cite our paper:

Loren Lugosch, Mirco Ravanelli, Patrick Ignoto, Vikrant Singh Tomar, and Yoshua Bengio, "Speech Model Pre-training for End-to-End Spoken Language Understanding", arXiv, 2019.
