Dons Tang donstang

8 followers · 139 following

Wuhan AI research

Stars

78 / xiaozhi-esp32

Build your own AI friend

C++ 11,425 2,174 Updated Apr 14, 2025

danielwei0214 / Chinese-TTS-Dataset

A linguistically grounded and comprehensive Chinese TTS dataset, efficiently covering Chinese polyphonic characters, phonological changes, and more.

23 2 Updated Aug 13, 2024

descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,360 133 Updated Jul 11, 2024

pengzhendong / streaming-ChatTTS

Jupyter Notebook 18 3 Updated Oct 30, 2024

pengzhendong / streaming-dvae

Python 8 2 Updated Dec 23, 2024

pengzhendong / streaming-vocos

Streaming Vocos

Python 24 4 Updated Jan 9, 2025

bytedance / MegaTTS3

Python 4,566 310 Updated Apr 12, 2025

canopyai / Orpheus-TTS

Towards Human-Sounding Speech

Python 4,051 326 Updated Apr 15, 2025

airockchip / rknn_model_zoo

C 1,427 251 Updated Apr 9, 2025

Plachtaa / seed-vc

zero-shot voice conversion & singing voice conversion, with real-time support

Python 2,206 246 Updated Mar 24, 2025

camel-ai / owl

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 15,566 1,836 Updated Apr 15, 2025

geekan / MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 54,589 6,480 Updated Mar 31, 2025

qi-hua / async_cosyvoice

Accelerates CosyVoice2 inference using vLLM

Jupyter Notebook 197 22 Updated Apr 13, 2025

index-tts / index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 909 76 Updated Apr 15, 2025

gemelo-ai / vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Python 915 108 Updated Aug 7, 2024

modelscope / FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

Python 396 33 Updated Jan 25, 2024

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,107 86 Updated Mar 2, 2025

shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 970 125 Updated Mar 31, 2025

jik876 / hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,108 526 Updated Jul 27, 2024

coqui-ai / xtts-streaming-server

Python 325 98 Updated Jun 26, 2024

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 39,335 4,974 Updated Aug 16, 2024

NVIDIA / BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 998 127 Updated Sep 5, 2024

HumanMLLM / R1-Omni

Python 820 48 Updated Mar 24, 2025

SesameAILabs / csm

A Conversational Speech Generation Model

Python 12,514 1,127 Updated Mar 27, 2025

mannaandpoem / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 43,306 7,425 Updated Apr 15, 2025

PlayVoice / vits_chinese

Best-practice TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech; supports ONNX streaming output.

Python 1,183 172 Updated Feb 5, 2024

GuijiAI / duix.ai

C++ 5,416 793 Updated Apr 8, 2025

hexgrad / kokoro

https://hf.co/hexgrad/Kokoro-82M

JavaScript 2,322 248 Updated Apr 10, 2025

OpenMOSS / SpeechGPT-2.0-preview

GPT-4o-level, real-time spoken dialogue system.

Python 313 22 Updated Jan 27, 2025

X-LANCE / VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 341 22 Updated Sep 3, 2024