FastSpeech 2

Implementation of "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Quick Start

Prepare dataset

mkdir -p data/raw/
cd data/raw/
wget https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2
tar -xjf LJSpeech-1.1.tar.bz2
cd ../../
python datasets/tts/lj/prepare.py
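
Before moving on to alignment, it can help to confirm that the archive extracted completely. The sketch below is not part of this repository; `check_ljspeech` is a hypothetical helper that relies only on the public LJSpeech-1.1 layout (a pipe-delimited `metadata.csv` whose first field is the clip id, plus a `wavs/` folder). A complete extraction should report 13,100 metadata rows and no missing clips.

```python
from pathlib import Path


def check_ljspeech(root="data/raw/LJSpeech-1.1"):
    """Return (number of metadata rows, clip ids with no matching wav file).

    Hypothetical helper, not part of this repository.
    """
    root = Path(root)
    metadata = root / "metadata.csv"
    if not metadata.is_file():
        # Nothing extracted yet (or wrong path): report zero rows.
        return 0, []

    rows = 0
    missing = []
    with metadata.open(encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            rows += 1
            # The first pipe-delimited field is the clip id, e.g. LJ001-0001.
            clip_id = line.split("|", 1)[0]
            if not (root / "wavs" / f"{clip_id}.wav").is_file():
                missing.append(clip_id)
    return rows, missing


if __name__ == "__main__":
    rows, missing = check_ljspeech()
    print(f"{rows} metadata rows, {len(missing)} clips without a wav file")
```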

Forced alignment

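# Download the Montreal Forced Aligner (MFA) first: https://montreal-forced-aligner.readthedocs.io/en/stable/aligning.html
# and unzip it to ./montreal-forced-aligner
# -t sets MFA's temporary directory; -j sets the number of parallel jobs (lower it on machines with fewer cores)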
./montreal-forced-aligner/bin/mfa_train_and_align data/raw/LJSpeech-1.1/mfa_input data/raw/LJSpeech-1.1/dict_mfa.txt data/raw/LJSpeech-1.1/mfa_outputs -t ./montreal-forced-aligner/tmp -j 24
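
Alignment can silently skip utterances, so a quick comparison of inputs and outputs may be worthwhile before building the binary data. The following is a minimal sketch, not part of this repository. It assumes that `mfa_input` holds one `.wav` per utterance and that MFA writes one `.TextGrid` per aligned utterance (possibly inside speaker subfolders); `unaligned_utterances` is a hypothetical name.

```python
from pathlib import Path


def unaligned_utterances(input_dir, output_dir):
    """Return the sorted utterance names that have a wav but no TextGrid.

    Hypothetical helper; assumes one .wav per utterance in input_dir and
    one .TextGrid per aligned utterance somewhere under output_dir.
    """
    wavs = {p.stem for p in Path(input_dir).rglob("*.wav")}
    grids = {
        p.stem
        for p in Path(output_dir).rglob("*")
        if p.suffix.lower() == ".textgrid"
    }
    return sorted(wavs - grids)


if __name__ == "__main__":
    base = Path("data/raw/LJSpeech-1.1")
    skipped = unaligned_utterances(base / "mfa_input", base / "mfa_outputs")
    print(f"{len(skipped)} utterances without an alignment")
    for name in skipped[:10]:
        print("  ", name)
```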

Build binary data

# fs2
PYTHONPATH=. python datasets/tts/lj/gen_fs2.py --config configs/tts/lj/fs2.yaml
# fs2s
PYTHONPATH=. python datasets/tts/lj/gen_fs2s.py --config configs/tts/lj/fs2s.yaml

Train FastSpeech 2 and FastSpeech 2s

CUDA_VISIBLE_DEVICES=0 python tasks/fs2.py --config configs/tts/lj/fs2.yaml --exp_name fs2_exp1 --reset

CUDA_VISIBLE_DEVICES=0 python tasks/fs2s.py --config configs/tts/lj/fs2s.yaml --exp_name fs2s_exp1 --reset

Download pre-trained vocoder
```
mkdir -p wavegan_pretrained
```
Download checkpoint-1000000steps.pkl, config.yml, and stats.h5 from https://drive.google.com/open?id=1XRn3s_wzPF2fdfGshLwuvNHrbgD0hqVS and place them in wavegan_pretrained/.
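
Because the vocoder files are downloaded by hand, a missing file is easy to overlook until inference fails. As a small sketch (not part of this repository; `missing_vocoder_files` is a hypothetical helper), the three file names listed above can be checked like this:

```python
from pathlib import Path

# File names taken from the download instructions above.
REQUIRED_FILES = ("checkpoint-1000000steps.pkl", "config.yml", "stats.h5")


def missing_vocoder_files(directory="wavegan_pretrained"):
    """Return the required vocoder files that are not present in directory."""
    directory = Path(directory)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    missing = missing_vocoder_files()
    if missing:
        print("Missing from wavegan_pretrained/:", ", ".join(missing))
    else:
        print("All vocoder files are in place.")
```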

Inference

CUDA_VISIBLE_DEVICES=0 python tasks/fs2.py --config configs/tts/lj/fs2.yaml --exp_name fs2_exp1 --infer
CUDA_VISIBLE_DEVICES=0 python tasks/fs2s.py --config configs/tts/lj/fs2s.yaml --exp_name fs2s_exp1 --infer
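
Inference reuses the same --config and --exp_name as the corresponding training run, so the two commands need to stay in sync. The sketch below is one way to build them from a single place. `build_command` is a hypothetical helper, not part of this repository; it only reproduces the command lines shown in this README, including the --reset flag that the training commands pass.

```python
import os

# Flags exactly as they appear in the README commands.
MODE_FLAGS = {"train": "--reset", "infer": "--infer"}


def build_command(model, exp_name, mode):
    """Build the README command for model 'fs2' or 'fs2s' as an argument list."""
    if model not in ("fs2", "fs2s"):
        raise ValueError(f"unknown model: {model!r}")
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        "python", f"tasks/{model}.py",
        "--config", f"configs/tts/lj/{model}.yaml",
        "--exp_name", exp_name,
        MODE_FLAGS[mode],
    ]


if __name__ == "__main__":
    # Same GPU selection as the README commands.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")
    for mode in ("train", "infer"):
        print(" ".join(build_command("fs2", "fs2_exp1", mode)))
```

To actually launch a run, the list can be passed to `subprocess.run(cmd, env=env, check=True)` from the repository root.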
