This is a demonstration of how to produce speech in a particular emotion from text. This is achieved by fine-tuning a TTS model on emotion-labelled speech data, formulating it as a multi-modal prob…

Jupyter Notebook 4 2 Updated May 25, 2023

aris-ai / Audio-and-text-based-emotion-recognition

A multimodal approach to emotion recognition using audio and text.

Jupyter Notebook 155 29 Updated Jun 15, 2020

Vaibhavs10 / open-tts-tracker

1,083 69 Updated Jun 21, 2024

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/

Python 7,600 757 Updated Feb 11, 2024

ml-explore / mlx-examples

Examples in the MLX framework

Python 6,020 852 Updated Oct 12, 2024

nbertagnolli / counsel-chat

This repository holds the code for working with data from counselchat.com

Jupyter Notebook 144 55 Updated Jun 17, 2023

wangermeng2021 / llm-webui

A Gradio web UI for Large Language Models. Supports LoRA/QLoRA fine-tuning, RAG (Retrieval-Augmented Generation), and Chat

Python 32 5 Updated Nov 26, 2023

promptslab / LLMtuner

Fine-tune LLMs in a few lines of code (Text2Text, Text2Speech, Speech2Text)

Python 225 13 Updated Jan 13, 2024

mozilla / TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Jupyter Notebook 9,289 1,243 Updated Nov 9, 2023

NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Jupyter Notebook 5,063 1,379 Updated Jun 12, 2024

kakaobrain / g2pm

A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset

Python 336 72 Updated Dec 24, 2021

getalp / BibleNet

BibleNet

Python 1 Updated Feb 17, 2020

NVIDIA / nv-wavenet

Reference implementation of real-time autoregressive wavenet inference

Cuda 735 126 Updated Jan 19, 2021

getalp / mass-dataset

MaSS - Multilingual corpus of Sentence-aligned Spoken utterances

Python 48 4 Updated Sep 16, 2024

racai-ai / TEPROLIN

This is the TEPROLIN Romanian text processing platform, developed in the ReTeRom project.

Perl 4 Updated Mar 31, 2022

AI4Bharat / Indic-TTS

Text-to-Speech for languages of India

Jupyter Notebook 140 32 Updated Aug 7, 2023

numediart / EmoV-DB

The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems

Python 251 19 Updated Oct 10, 2023

midas-research / audino

Open source audio annotation tool for humans

JavaScript 1,054 128 Updated Sep 9, 2024

Rongjiehuang / GenerSpeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Python 316 45 Updated Feb 9, 2024

yl4579 / StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

Python 480 107 Updated May 16, 2023

NVIDIA / mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data

Jupyter Notebook 854 184 Updated Jul 22, 2023

louisfb01 / start-machine-learning

A complete guide to start and improve in machine learning (ML), artificial intelligence (AI) in 2024 without ANY background in the field and stay up-to-date with the latest news and state-of-the-ar…

4,426 577 Updated Jul 25, 2024