A fork of so-vits-svc with realtime support and a greatly improved interface. Based on branch 4.0 (v1); the models are compatible.
- Realtime voice conversion (enhanced in v1.1.0)
- More accurate pitch estimation using CREPE
- GUI available
- Unified command-line interface (no need to run Python scripts)
- Ready to use just by installing with `pip`
- Automatically downloads the pretrained base model and HuBERT model
- Code completely formatted with black, isort, autoflake, etc.
- Other minor differences
Install this via pip (or your favourite package manager that uses pip):

```shell
python -m pip install -U pip setuptools wheel
pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install -U so-vits-svc-fork
```
- If no GPU is available, simply omit `pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu117`.
- If you are using an AMD GPU on Linux, replace `--index-url https://download.pytorch.org/whl/cu117` with `--index-url https://download.pytorch.org/whl/rocm5.4.2` (see the example after this list). AMD GPUs are not supported on Windows (#120).
- If `fairseq` raises an error:
  - If it prompts that `Microsoft C++ Build Tools` is not installed, please install it.
  - If it prompts that some DLL is missing, reinstalling `Microsoft Visual C++ 2022` and `Windows SDK` may help.
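For example, a complete installation on Linux with an AMD GPU uses the same commands as above, with only the index URL swapped:

```shell
python -m pip install -U pip setuptools wheel
pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
pip install -U so-vits-svc-fork
```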
Please update this package regularly to get the latest features and bug fixes.

```shell
pip install -U so-vits-svc-fork
```
The GUI launches with the following command:

```shell
svcg
```
- Realtime (from microphone)

```shell
svc vc --model-path <model-path>
```

- File

```shell
svc infer --model-path <model-path> source.wav
```
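As the CLI help below notes, if you keep the recommended folder structure (`configs/44k/config.json`, `logs/44k/G_XXXX.pth`), the model and config paths can be omitted and the latest checkpoint is loaded automatically:

```shell
# The latest model under logs/44k/ is picked up automatically
svc infer source.wav
```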
Pretrained models are available on HuggingFace.
- If using WSL, please note that WSL requires additional setup to handle audio, and the GUI will not work without finding an audio device.
- In real-time inference, if there is noise on the inputs, the HuBERT model will react to it as well. Consider using a realtime noise-reduction application such as RTX Voice in this case.
- If your dataset has BGM, please remove the BGM using software such as Ultimate Vocal Remover. `3_HP-Vocal-UVR.pth` or `UVR-MDX-NET Main` is recommended.
- If your dataset is a long audio file with multiple speakers, use `svc pre-sd` to split it into multiple files (using `pyannote.audio`; see the sketch after this list). Further manual classification may be necessary due to accuracy issues. If speakers speak with a variety of speech styles, set `--min-speakers` larger than the actual number of speakers. Due to unresolved dependencies, please install `pyannote.audio` manually: `pip install pyannote-audio`.
- If your dataset is a long audio file with a single speaker, use `svc pre-split` to split it into multiple files (using `librosa`).
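A minimal sketch of the diarization step. Only `--min-speakers` is documented in the note above; the `-i`/`-o` directory options here are assumptions modeled on the other preprocessing commands, so check `svc pre-sd -h` for the exact flags:

```shell
pip install pyannote-audio

# Assumed flags: -i (input dir) and -o (output dir) are hypothetical;
# --min-speakers is set above the true speaker count, as advised above
svc pre-sd -i raw_recordings -o dataset_raw --min-speakers 4
```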
Place your dataset like `dataset_raw/{speaker_id}/**/{wav_file}.{any_format}` (subfolders and non-ASCII filenames are acceptable; an example layout is shown after the commands) and run:

```shell
svc pre-resample
svc pre-config
svc pre-hubert
svc train -t
```
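For illustration, a valid layout for a single hypothetical speaker named `alice` (all names here are placeholders) could be prepared like this:

```shell
# Subfolders and any audio format matching the glob above are fine
mkdir -p dataset_raw/alice/session1
cp ~/recordings/alice_take01.wav dataset_raw/alice/session1/
cp ~/recordings/alice_take02.flac dataset_raw/alice/
```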
- Dataset audio duration per file should be roughly 10 seconds or less, or VRAM will run out.
- To change the f0 inference method to CREPE, replace `svc pre-hubert` with `svc pre-hubert -fm crepe` (see the example after this list). You may need to reduce `--n-jobs` due to performance issues.
- It is recommended to change the `batch_size` in `config.json` before the `train` command to match your VRAM capacity. The default value is optimized for a Tesla T4 (16 GB VRAM), but training is possible with less VRAM.
- Silence removal and volume normalization are performed automatically (as in the upstream repo) and are not required.
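Putting the notes above together, a preprocessing and training run that uses CREPE for f0 estimation could look like this (the `--n-jobs` value is just an example; tune it to your machine):

```shell
svc pre-resample
svc pre-config
# -fm crepe switches the f0 method; lower --n-jobs if preprocessing struggles
svc pre-hubert -fm crepe --n-jobs 2
svc train -t
```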
For more details, run `svc -h` or `svc <subcommand> -h`.
```shell
> svc -h
Usage: svc [OPTIONS] COMMAND [ARGS]...

  so-vits-svc allows any folder structure for training data.
  However, the following folder structure is recommended.
      When training: dataset_raw/{speaker_name}/**/{wav_name}.{any_format}
      When inference: configs/44k/config.json, logs/44k/G_XXXX.pth
  If the folder structure is followed, you DO NOT NEED TO SPECIFY model path, config path, etc.
  (The latest model will be automatically loaded.)
  To train a model, run pre-resample, pre-config, pre-hubert, train.
  To infer a model, run infer.

Options:
  -h, --help  Show this message and exit.

Commands:
  clean          Clean up files, only useful if you are using the default file structure
  infer          Inference
  onnx           Export model to onnx
  pre-config     Preprocessing part 2: config
  pre-hubert     Preprocessing part 3: hubert  If the HuBERT model is not found, it will be...
  pre-resample   Preprocessing part 1: resample
  pre-sd         Speech diarization using pyannote.audio
  pre-split      Split audio files into multiple files
  train          Train model  If D_0.pth or G_0.pth not found, automatically download from hub.
  train-cluster  Train k-means clustering
  vc             Realtime inference from microphone
```
Thanks goes to these wonderful people (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!