
Build your own AI friend

C++ 11,425 2,174 Updated Apr 14, 2025

A linguistically grounded and comprehensive Chinese TTS dataset, efficiently covering Chinese polyphonic characters, phonological changes, and more.

23 2 Updated Aug 13, 2024

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,360 133 Updated Jul 11, 2024
Jupyter Notebook 18 3 Updated Oct 30, 2024
Python 8 2 Updated Dec 23, 2024

Streaming Vocos

Python 24 4 Updated Jan 9, 2025
Python 4,566 310 Updated Apr 12, 2025

Towards Human-Sounding Speech

Python 4,051 326 Updated Apr 15, 2025

zero-shot voice conversion & singing voice conversion, with real-time support

Python 2,206 246 Updated Mar 24, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 15,566 1,836 Updated Apr 15, 2025

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 54,589 6,480 Updated Mar 31, 2025

Accelerate CosyVoice2 inference with vLLM

Jupyter Notebook 197 22 Updated Apr 13, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 909 76 Updated Apr 15, 2025

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Python 915 108 Updated Aug 7, 2024

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Python 396 33 Updated Jan 25, 2024

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,107 86 Updated Mar 2, 2025

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 970 125 Updated Mar 31, 2025

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,108 526 Updated Jul 27, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 39,335 4,974 Updated Aug 16, 2024

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 998 127 Updated Sep 5, 2024
Python 820 48 Updated Mar 24, 2025

A Conversational Speech Generation Model

Python 12,514 1,127 Updated Mar 27, 2025

No fortress, purely open ground. OpenManus is Coming.

Python 43,306 7,425 Updated Apr 15, 2025

Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out!

Python 1,183 172 Updated Feb 5, 2024
C++ 5,416 793 Updated Apr 8, 2025

https://hf.co/hexgrad/Kokoro-82M

JavaScript 2,322 248 Updated Apr 10, 2025

GPT-4o-level, real-time spoken dialogue system.

Python 313 22 Updated Jan 27, 2025

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 341 22 Updated Sep 3, 2024