TensorVox

TensorVox is an application designed to enable user-friendly and lightweight neural speech synthesis on the desktop, aimed at making such technology more accessible.

Powered mainly by TensorFlowTTS and also by Coqui-TTS, it is written in pure C++/Qt and uses the TensorFlow C API to interact with the models. This way, we can perform inference without having to install gigabytes' worth of Python libraries, just a 100 MB DLL.

Try it out

Detailed guide in Google Docs

Grab a copy from the releases, extract the .zip, and check the Google Drive folder for models and installation instructions.

If you're interested in using your own model, you first need to train it and then export it.

Supported architectures

TensorVox supports models from two main repos:

TensorFlowTTS: FastSpeech2 and Tacotron2 (both character- and phoneme-based), and Multi-Band MelGAN. Here's a Colab notebook demonstrating how to export the LJSpeech pretrained, char-based Tacotron2 model:
Coqui-TTS: Tacotron2 (phoneme-based IPA) and Multi-Band MelGAN, after converting from PyTorch to TensorFlow. Here's a notebook showing how to export the LJSpeech DDC model:

Those two examples should give you enough guidance to understand what is needed. If you're looking to train a model specifically for this purpose, I recommend TensorFlowTTS, as it is the one with the best support. As for languages, out-of-the-box support is provided for English (both Coqui-TTS and TensorFlowTTS), German and Spanish (TensorFlowTTS only); that is, you won't have to modify any code.

Build instructions

Currently, only Windows 10 x64 is supported (although I've heard reports of it running on 8.1).

Requirements:

Qt Creator
MSVC 2017 (v141) compiler

Primed build (with all provided libraries):

Download precompiled binary dependencies and includes
Unzip it so that the deps folder is in the same place as the .pro and main source files.
Open the project with Qt Creator, add your compiler and compile

Note that to try your shiny new executable you'll need to download the program as described above and insert the models folder where your new build is output.
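
If the build or the first run fails, it can help to confirm that the folders ended up where the steps above expect them. The following is only a minimal sketch of such a check, not part of TensorVox itself: the function name `checkLayout` is hypothetical, and it assumes the dependency folder is literally named `deps`, the project file is `TensorVox.pro`, and the models folder is named `models`. It needs a C++17 compiler for `std::filesystem`.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of TensorVox): returns a list of
// human-readable problems; an empty list means the layout looks right.
// Assumptions: "TensorVox.pro" and a "deps" folder sit directly inside
// sourceDir, and a "models" folder sits inside buildDir, next to the
// freshly built executable.
std::vector<std::string> checkLayout(const fs::path& sourceDir,
                                     const fs::path& buildDir)
{
    std::vector<std::string> problems;

    // The project file marks the folder the deps archive must be unzipped into.
    if (!fs::is_regular_file(sourceDir / "TensorVox.pro"))
        problems.push_back("TensorVox.pro not found in " + sourceDir.string());

    // The precompiled dependencies must sit next to the .pro file.
    if (!fs::is_directory(sourceDir / "deps"))
        problems.push_back("deps folder not found in " + sourceDir.string());

    // The models folder must be copied to where the build is output.
    if (!fs::is_directory(buildDir / "models"))
        problems.push_back("models folder not found in " + buildDir.string());

    return problems;
}
```

Calling `checkLayout` with the source folder and the build output folder returns one message per missing item, so an empty result suggests the primed build layout is in place.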

TODO: Add instructions for compiling from scratch.

Externals (and thanks)

TensorFlow C API: https://www.tensorflow.org/install/lang_c
CppFlow (TF C API -> C++ wrapper): https://github.com/serizba/cppflow
AudioFile (for WAV export): https://github.com/adamstark/AudioFile
Frameless Dark Style Window: https://github.com/Jorgen-VikingGod/Qt-Frameless-Window-DarkStyle
JSON for modern C++: https://github.com/nlohmann/json
r8brain-free-src (Resampling): https://github.com/avaneev/r8brain-free-src
rnnoise (CMake version, denoising output): https://github.com/almogh52/rnnoise-cmake
Logitech LED Illumination SDK (Mouse RGB integration): https://www.logitechg.com/en-us/innovation/developer-lab.html
QCustomPlot: https://www.qcustomplot.com/index.php/introduction
libnumbertext: https://github.com/Numbertext/libnumbertext

Contact

You can open an issue here or join the Discord server to discuss or ask anything.

For media, licensing, or any other formal inquiries, send an email to: [email protected]

Note about licensing

This program itself is MIT licensed, but the models you use are subject to their own license terms. For example, if you're in Vietnam and using TensorFlowTTS models, you'll have to check here for some details.
