VITS implementation for Japanese, Chinese, Korean, Sanskrit and Thai, to be adapted to Sinhala

ov1n/vits_voiceCloning

 
 

How to use

Python 3.7 is recommended

Clone this repository

git clone https://github.com/CjangCjengh/vits.git

Choose cleaners

  • Fill "text_cleaners" in config.json
  • Edit text/symbols.py
  • Remove unnecessary imports from text/cleaners.py
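The first bullet can also be scripted. The sketch below is only an illustration: it assumes that config.json keeps "text_cleaners" inside a top-level "data" section, which this README does not state, so check your own file first. The helper name `set_data_option` is hypothetical.

```python
import json
from pathlib import Path


def set_data_option(config_path, key, value):
    """Set config["data"][key] = value and write the file back.

    Assumption: the option lives in a top-level "data" section.
    Adjust the section name if your config.json is laid out differently.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Create the section if it is missing, then overwrite only the one key.
    config.setdefault("data", {})[key] = value
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return config
```

For example, `set_data_option("config.json", "text_cleaners", ["japanese_cleaners"])`, where `japanese_cleaners` is a placeholder for whichever cleaner you pick from text/cleaners.py. The same helper would work for "n_speakers" and "cleaned_text" if they sit in the same section.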

Install requirements

pip install -r requirements.txt

Create datasets

Single speaker

Set "n_speakers" to 0 in config.json

path/to/XXX.wav|transcript
  • Example
dataset/001.wav|こんにちは。

Multiple speakers

Speaker IDs should start from 0

path/to/XXX.wav|speaker id|transcript
  • Example
dataset/001.wav|0|こんにちは。
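Before training, it can help to check that every filelist line matches one of the two formats above. The following is a minimal sketch, not part of this repository; the function name `parse_filelist_line` is hypothetical, and it assumes the transcript itself contains no "|" character.

```python
def parse_filelist_line(line, multi_speaker):
    """Split one filelist line into (wav_path, speaker_id, transcript).

    speaker_id is None for single-speaker filelists.
    Raises ValueError if the line does not match the expected format.
    """
    expected = 3 if multi_speaker else 2
    fields = line.rstrip("\n").split("|")
    if len(fields) != expected:
        raise ValueError(
            f"expected {expected} fields, got {len(fields)}: {line!r}"
        )
    if multi_speaker:
        wav_path, speaker, transcript = fields
        speaker_id = int(speaker)  # raises ValueError if not an integer
        if speaker_id < 0:
            raise ValueError(f"speaker id must be 0 or greater: {line!r}")
        return wav_path, speaker_id, transcript
    wav_path, transcript = fields
    return wav_path, None, transcript


# The two example lines from the sections above.
print(parse_filelist_line("dataset/001.wav|こんにちは。", multi_speaker=False))
print(parse_filelist_line("dataset/001.wav|0|こんにちは。", multi_speaker=True))
```

Running the parser over a whole filelist also makes it easy to collect the set of speaker IDs and confirm that the smallest one is 0.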

Preprocess

Once preprocessing has finished, set "cleaned_text" to true in config.json

# Single speaker
python preprocess.py --text_index 1 --filelists path/to/filelist_train.txt path/to/filelist_val.txt

# Multiple speakers
python preprocess.py --text_index 2 --filelists path/to/filelist_train.txt path/to/filelist_val.txt
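The two --text_index values follow from the filelist formats: judging by those formats, the value appears to be the zero-based position of the transcript among the "|"-separated fields (1 after the path, 2 after the path and speaker ID). The sketch below illustrates that idea only. It is not the actual preprocess.py; `clean_filelist_lines` and the `clean` callable are hypothetical stand-ins for the configured text cleaner.

```python
def clean_filelist_lines(lines, text_index, clean):
    """Apply `clean` to the transcript field of each filelist line.

    text_index is the zero-based position of the transcript among the
    "|"-separated fields; all other fields are left untouched.
    """
    cleaned = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        fields[text_index] = clean(fields[text_index])
        cleaned.append("|".join(fields))
    return cleaned


# str.upper stands in for a real cleaner purely for demonstration.
print(clean_filelist_lines(["dataset/001.wav|hello"], 1, str.upper))
print(clean_filelist_lines(["dataset/001.wav|0|hello"], 2, str.upper))
```

Passing the wrong index for a multi-speaker list (1 instead of 2) would clean the speaker ID rather than the transcript, so it is worth checking the value against the filelist format.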

Build monotonic alignment search

cd monotonic_align
python setup.py build_ext --inplace
cd ..

Train

# Single speaker
python train.py -c <config> -m <folder>

# Multiple speakers
python train_ms.py -c <config> -m <folder>

Inference

Online

See inference.ipynb

Offline

See MoeGoe

Running in Docker

docker run -itd --gpus all --name "Container name" -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all "Image name"
