This repo is a fork of https://github.com/jik876/hifi-gan, with very few modifications:
- It keeps the original repo's format.
- It is adjusted to read precomputed mel spectrograms provided by the user (a minimal loading sketch is shown below).
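As an illustration of the kind of input this fork consumes, here is a minimal sketch of loading one precomputed mel spectrogram. The file name and the `[n_mels, frames]` layout are assumptions, not verified against this fork; check the data-loading code (e.g. `meldataset.py`) for the exact convention.

```python
# Minimal sketch: inspect one precomputed mel spectrogram before training.
# Assumptions (not verified against this fork): mels are stored as .npy
# float arrays shaped [n_mels, frames]; the file name below is hypothetical.
import numpy as np
import torch

mel = np.load("dataset/ptBR/mels/utt_0001.npy")
mel = torch.from_numpy(mel).float().unsqueeze(0)  # -> [1, n_mels, frames]
print(mel.shape)
```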
To understand HiFi-GAN, read the original paper (HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis).
- Python >= 3.6
- Clone this repository.
- Install the Python requirements listed in requirements.txt, e.g. `pip install -r requirements.txt`.
To train the model, run:

```
python train.py \
    --config configs/config_v1.json \
    --input_wavs_dir dataset/ptBR/audio \
    --input_mels_dir dataset/ptBR/mels \
    --input_training_file dataset/train_files.txt \
    --input_validation_file dataset/test_files.txt
```
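The exact contents of `dataset/train_files.txt` and `dataset/test_files.txt` depend on this fork's data loader; as a loose illustration only (the one-name-per-line format is an assumption, check the data-loading code for the real format), such lists could be generated like this:

```python
# Hypothetical helper that writes train/validation file lists.
# Assumption (not verified): each line holds one utterance name, matching a
# .wav in --input_wavs_dir and a mel file in --input_mels_dir.
import os

wav_dir = "dataset/ptBR/audio"
names = sorted(os.path.splitext(f)[0] for f in os.listdir(wav_dir) if f.endswith(".wav"))
split = int(0.95 * len(names))  # simple 95/5 train/validation split

with open("dataset/train_files.txt", "w") as f:
    f.write("\n".join(names[:split]) + "\n")
with open("dataset/test_files.txt", "w") as f:
    f.write("\n".join(names[split:]) + "\n")
```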
To train the V2 or V3 generator, replace `config_v1.json` with `config_v2.json` or `config_v3.json`.
Checkpoints and a copy of the configuration file are saved in the `cp_hifigan` directory by default. You can change the path with the `--checkpoint_path` option.
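If you need to inspect a saved checkpoint outside the provided scripts, something along these lines should work. The checkpoint layout (a plain `torch.save` dict with the generator weights under a `"generator"` key, as in the upstream hifi-gan repo) and the file name are assumptions.

```python
# Hypothetical inspection of a generator checkpoint saved under cp_hifigan.
# Assumptions: checkpoints are plain torch.save files, and generator
# checkpoints store the weights under a "generator" key, as upstream does.
import torch

ckpt = torch.load("cp_hifigan/g_00100000", map_location="cpu")  # hypothetical file name
print(list(ckpt.keys()))
```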
Here is a pretrained model trained on 20 hours of a multi-speaker Brazilian Portuguese (ptBR) dataset.
- Make a `test_mel_dir` directory and copy the generated mel-spectrogram files into it. The mel spectrograms must be compatible with the pretrained checkpoint; they can be generated with, for example, Tacotron2. A sketch of exporting such files to `.npy` is shown at the end of this section.
- Run the following command.
```
python inference_e2e_from_folder.py \
    --input_mels [test_mel_dir] \
    --output_dir [output_wav] \
    --checkpoint_file [generator checkpoint file path] \
    --npyin True
```
Generated wav files are saved in the `output_wav` directory.
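As mentioned above, the files placed in `test_mel_dir` are read as `.npy` arrays when `--npyin True` is passed. Here is a hedged sketch of exporting one such file; the `[n_mels, frames]` layout, the 80 mel bins, and the file naming are assumptions rather than requirements confirmed by this repo.

```python
# Hypothetical export of a synthesized mel spectrogram to .npy so that
# inference_e2e_from_folder.py can read it with --npyin True.
# Assumption: one float array of shape [n_mels, frames] per utterance.
import os
import numpy as np
import torch

os.makedirs("test_mel_dir", exist_ok=True)
mel = torch.randn(80, 200)  # stand-in for a Tacotron2 mel output (80 mel bins assumed)
np.save("test_mel_dir/utt_0001.npy", mel.cpu().numpy())
```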
Many thanks to Jungil Kong, Jaehyeon Kim, and Jaekyoung Bae for making the original repo available.