Stars
The official Open-Asset-Importer-Library Repository. Loads 40+ 3D-file-formats into one unified and clean data structure.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A library for detecting and resolving intersections between two surface meshes.
A procedural geometry generation library for C++11
Easily train a good VC model with voice data <= 10 mins!
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
An Open Source text-to-speech system built by inverting Whisper.
A generative speech model for daily dialogue.
A multi-voice TTS system trained with an emphasis on quality
Foundational model for human-like, expressive TTS
🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.
Live, low-latency 2D and 3D tracking from single or multiple high-speed cameras
Learning Locker - The Open Source Learning Record Store. Started in 2014.
[CVPR 2024 Highlight] The official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
Industry leading face manipulation platform
Real time background replacement using DeepLabv3 MobileNetv2 model for person segmentation and OpenCV for image processing.
🦜🔗 Build context-aware reasoning applications
[CVPR2023] The implementation for "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
A simple, high-quality voice conversion tool focused on ease of use and performance.
serp-ai / bark-with-voice-clone
Forked from suno-ai/bark
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Servo, the embeddable, independent, memory-safe, modular, parallel web rendering engine