Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,645 672 Updated Mar 3, 2025

yangcaoai / CoDA_NeurIPS2023

Official code for NeurIPS2023 paper: CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection

Jupyter Notebook 196 17 Updated Jan 24, 2025

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 38,267 4,791 Updated Aug 16, 2024

hahahumble / speechgpt

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

TypeScript 2,761 393 Updated Oct 16, 2023

lucidrains / naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Python 1,314 104 Updated Sep 24, 2023

lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,100 321 Updated Nov 14, 2023

enhuiz / vall-e

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,987 417 Updated May 10, 2023

miguelvalente / whisperer

Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.

Jupyter Notebook 135 12 Updated Aug 14, 2023

PlayVoice / VI-SVS

Singing Voice Synthesis based on VITS, different from VISinger

Python 188 31 Updated Nov 13, 2023

hhguo / MSMC-TTS

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Python 162 16 Updated Apr 10, 2024

choiHkk / VAEJETS

Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech

Jupyter Notebook 22 6 Updated Aug 11, 2022

choiHkk / VITSinger

Singing Voice Speech modeling test

Python 35 10 Updated Aug 16, 2022

jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,203 1,315 Updated Dec 6, 2023

rishikksh20 / Avocodo-pytorch

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

Python 117 15 Updated Jul 14, 2022

facebookresearch / vocoder-benchmark

A repository for benchmarking neural vocoders by their quality and speed.

Python 208 28 Updated Feb 26, 2025

wenet-e2e / wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

Python 381 59 Updated May 30, 2024

jhuang448 / LyricsAlignment-MTL

Python 58 13 Updated Apr 18, 2023

salu133445 / deepperformer

Deep Performer: Score-to-audio music performance synthesis

SCSS 43 4 Updated Jun 26, 2023

MoonInTheRiver / DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,410 729 Updated May 2, 2023

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 4,362 1,113 Updated Feb 25, 2025

nladuo / AI_beatmap_generator

An attempt to use neural networks to generate beatmaps for the rhythm game Malody.

Jupyter Notebook 47 12 Updated Feb 19, 2020

MTG / essentia

C++ library for audio and music analysis, description and synthesis, including Python bindings

C++ 2,993 550 Updated Jan 29, 2025

cbfinn / maml

Code for "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks"

Python 2,610 614 Updated Jan 19, 2020

markyouyuren

Stars

FunAudioLLM / InspireMusic

SWivid / F5-TTS

Plachtaa / FAcodec

jishengpeng / WavTokenizer

feizc / FluxMusic

haoheliu / AudioLDM-training-finetuning

RVC-Boss / GPT-SoVITS

open-mmlab / Amphion