OpenSeq2Seq

OpenSeq2Seq: toolkit for distributed and mixed-precision training of sequence-to-sequence models

OpenSeq2Seq's main goal is to allow researchers to most effectively explore various sequence-to-sequence models. The efficiency is achieved by fully supporting distributed and mixed-precision training. OpenSeq2Seq is built using TensorFlow and provides all the necessary building blocks for training encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.
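
Training runs are driven by Python configuration files passed to run.py. Below is a minimal, illustrative sketch; the model class, import path, and exact parameter names are assumptions here, so consult the documentation for complete, working configs:

    # my_config.py -- illustrative OpenSeq2Seq config sketch (parameter names
    # and import path are assumptions; see the docs for real example configs)
    from open_seq2seq.models import Speech2Text

    base_model = Speech2Text
    base_params = {
        "use_horovod": False,     # set True when launching through Horovod/MPI
        "num_gpus": 1,            # data-parallel GPUs on a single node
        "batch_size_per_gpu": 32,
        "num_epochs": 50,
        "dtype": "mixed",         # mixed-precision training on Volta/Turing GPUs
    }

A config like this would then be passed on the command line, e.g. python run.py --config_file=my_config.py --mode=train_eval.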

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Features

  1. Models for:
    1. Neural Machine Translation
    2. Automatic Speech Recognition
    3. Speech Synthesis
    4. Language Modeling
    5. NLP tasks (sentiment analysis)
  2. Data-parallel distributed training
    1. Multi-GPU
    2. Multi-node
  3. Mixed-precision training for NVIDIA Volta/Turing GPUs

Software Requirements

  1. Python >= 3.5
  2. TensorFlow >= 1.10
  3. CUDA >= 9.0, cuDNN >= 7.0
  4. Horovod >= 0.13 (using Horovod is not required, but is highly recommended for multi-GPU setups; see the launch sketch after this list)
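
With Horovod, data-parallel training is typically launched through MPI with one process per GPU. A minimal sketch, assuming a single node with 4 GPUs and a config that sets "use_horovod": True (the GPU count and config path are illustrative):

    mpirun -np 4 python run.py --config_file=my_config.py --mode=train_eval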

Acknowledgments

The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.

Disclaimer

This is a research project, not an official NVIDIA product.

Related resources

Paper

If you use OpenSeq2Seq, please cite the following paper:

@misc{openseq2seq,
    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
    year={2018},
    eprint={1805.10387},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
