Stars
scikit-learn: machine learning in Python
The world's simplest facial recognition API for Python and the command line
One minute of voice data is enough to train a good TTS model (few-shot voice cloning)
TensorFlow code and pre-trained models for BERT
SoftVC VITS Singing Voice Conversion
Graph Neural Network Library for PyTorch
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
vits2 backbone with multilingual-bert
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models
Production First and Production Ready End-to-End Speech Recognition Toolkit
Ikaros-521 / AI-Vtuber
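Forked from sandboxdream/AI-Vtuber
AI Vtuber is a virtual streamer [Live2D/UE/xuniren] driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/Wenda/Qwen/kimi/ollama] that can interact with viewers in real time during live streams on [Bilibili/Douyin/Kuaishou/WeChat Channels/Pinduoduo/Douyu/YouTube/twitch/TikTok], or chat directly on the local machine…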
An unofficial PyTorch implementation of the audio LM VALL-E
Core Engine of Singing Voice Conversion & Singing Voice Clone
Fast and accurate automatic speech recognition (ASR) for edge devices
Pre-trained word vectors of 30+ languages
Detect and recognize faces from a camera, with support for recognizing multiple faces simultaneously