Stars
🇨🇳 Chinese sticker pack, More joy / A museum of memes, GitHub's most toxic repository, a big collection of Chinese memes, gathering joy~
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
MindSpore online courses: Step into LLM
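Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.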
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
A timeline of the latest AI models for audio generation, starting in 2023!
High-Resolution Image Synthesis with Latent Diffusion Models
🍀 PyTorch implementations of various attention mechanisms, MLPs, re-parameterization, and convolution modules, helpful for understanding the papers in more depth.⭐⭐⭐
An unofficial PyTorch implementation of the audio LM VALL-E
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS
A Collection of Variational Autoencoders (VAE) in PyTorch.