WaveRNN

(Update: Vanilla Tacotron One TTS system just implemented - more coming soon!)

PyTorch implementation of DeepMind's WaveRNN model from Efficient Neural Audio Synthesis.

Installation

Ensure you have:

Python >= 3.6
PyTorch 1 with CUDA

Then install the rest with pip:

pip install -r requirements.txt
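To confirm the two prerequisites above before or after installing, something like the following could be used. It is a minimal sketch and not part of the repository: the helper name check_environment is hypothetical, and the CUDA check only runs if PyTorch can already be imported.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6)):
    """Report whether the prerequisites listed above appear to be met.

    Hypothetical helper, not part of the WaveRNN repository.
    """
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            # Imported lazily so the check also works without PyTorch.
            import torch
        except (ImportError, OSError):
            return report
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print("{}: {}".format(name, value))
```

If cuda_available comes back False, PyTorch is either missing or was installed without a usable CUDA runtime.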

How to Use

Quick Start

If you want to use TTS functionality immediately you can simply use:

python quick_start.py

This generates every sentence in the default sentences.txt file and writes the output to a new 'quick_start' folder, where you can play back the wav files and take a look at the attention plots.

You can also use that script to generate custom TTS sentences and/or use '-u' to generate unbatched (better audio quality):

python quick_start.py -u --input_text "What will happen if I run this command?"
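If you have several custom sentences, the command above can be driven from Python. This is only a sketch: quick_start_command and synthesize_all are hypothetical names, it uses just the '-u' and '--input_text' options shown above, and it assumes it is run from the repository root.

```python
import subprocess
import sys


def quick_start_command(input_text=None, unbatched=False):
    """Build the argument list for quick_start.py.

    Only the -u and --input_text options mentioned above are used.
    """
    cmd = [sys.executable, "quick_start.py"]
    if unbatched:
        cmd.append("-u")
    if input_text is not None:
        cmd += ["--input_text", input_text]
    return cmd


def synthesize_all(sentences, unbatched=True):
    """Run quick_start.py once per sentence, stopping on the first failure."""
    for sentence in sentences:
        subprocess.run(quick_start_command(sentence, unbatched), check=True)
```

For example, synthesize_all(["First sentence.", "Second sentence."]) would call the script twice in unbatched mode.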

Training your own Models

Download the LJSpeech Dataset.

Edit hparams.py, point wav_path to your dataset and run:

python preprocess.py

or use preprocess.py --path to point directly to the dataset
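Before preprocessing, it may be worth checking that the path really contains the dataset. The sketch below is illustrative and not part of the repository: summarize_dataset is a hypothetical helper, and it assumes the standard LJSpeech layout (metadata.csv at the top level, audio files in a wavs/ subfolder) with the path pointing at that top-level folder.

```python
from pathlib import Path


def summarize_dataset(dataset_path):
    """Count .wav files under dataset_path and look for metadata.csv.

    Assumes the LJSpeech layout; illustrative helper, not part of the repo.
    """
    root = Path(dataset_path).expanduser()
    if not root.is_dir():
        raise FileNotFoundError("dataset folder not found: {}".format(root))
    # Search recursively so files inside wavs/ are counted too.
    wav_count = sum(1 for _ in root.rglob("*.wav"))
    has_metadata = (root / "metadata.csv").is_file()
    return {"wav_count": wav_count, "has_metadata": has_metadata}
```

A wav_count of zero or a missing metadata.csv usually means the path points at the wrong folder.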

Here's my recommendation on what order to run things:

1 - Train Tacotron with:

python train_tacotron.py

2 - You can let that finish training, or at any point you can use:

python train_tacotron.py --force_gta

This forces Tacotron to create a GTA (ground-truth-aligned) dataset even if it hasn't finished training.

3 - Train WaveRNN with:

python train_wavernn.py --gta

NB: You can always just run train_wavernn.py without --gta if you're not interested in TTS.

4 - Generate Sentences with both models using:

python gen_tacotron.py wavernn

This generates the default sentences. If you want to generate custom sentences, you can use:

python gen_tacotron.py --input_text "this is whatever you want it to be" wavernn

And finally, you can always use --help on any of those scripts to see what options are available :)
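The order recommended above can also be written down as a small driver script. This is a sketch under assumptions, not a script shipped with the repository: pipeline_commands and run_pipeline are hypothetical names, only the script names and flags shown above are used, and force_gta=True is meant for the case where a partially trained Tacotron already exists and you only want the GTA dataset from step 2.

```python
import subprocess
import sys


def pipeline_commands(input_text=None, force_gta=False):
    """Return the steps above as argument lists, in the recommended order."""
    tacotron = [sys.executable, "train_tacotron.py"]
    if force_gta:
        # Step 2: build the GTA dataset from the current Tacotron state.
        tacotron.append("--force_gta")
    generate = [sys.executable, "gen_tacotron.py"]
    if input_text is not None:
        generate += ["--input_text", input_text]
    generate.append("wavernn")
    return [
        tacotron,
        [sys.executable, "train_wavernn.py", "--gta"],
        generate,
    ]


def run_pipeline(input_text=None, force_gta=False):
    """Run each step in turn from the repository root, stopping on failure."""
    for cmd in pipeline_commands(input_text, force_gta):
        subprocess.run(cmd, check=True)
```

Training takes a long time, so in practice you would likely run the steps by hand; the list form mainly documents the order and the flags.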

Samples

Can be found here.

Pretrained Models

Currently there are two pretrained models available in the /pretrained/ folder:

Both are trained on LJSpeech

WaveRNN (Mixture of Logistics output) trained to 800k steps
Tacotron trained to 180k steps

References

Acknowledgements

https://github.com/keithito/tacotron
https://github.com/r9y9/wavenet_vocoder
Special thanks to GitHub users G-Wang, geneing & erogol

