Stars
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A generative speech model for daily dialogue.
Instant voice cloning by MIT and MyShell. Audio foundation model.
Easily train a good VC model with voice data <= 10 mins!
🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.
Industry leading face manipulation platform
High-Resolution 3D Human Digitization from A Single Image.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Foundational model for human-like, expressive TTS
A simple, high-quality voice conversion tool focused on ease of use and performance.
Using neural networks to build an automatic number plate recognition system
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
[CVPR 2024 Highlight] The official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"
[CVPR2023] The implementation for "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"
Open Source and Free License Plate Recognition Software
Real time background replacement using DeepLabv3 MobileNetv2 model for person segmentation and OpenCV for image processing.