Stars
The open source implementation of the cross attention mechanism from the paper: "JOINTLY TRAINING LARGE AUTOREGRESSIVE MULTIMODAL MODELS"
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Memory-Guided Diffusion for Expressive Talking Video Generation
GaussianSpeech: Audio-Driven Gaussian Avatars
Unofficial implementation of the paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing" (CVPR 2021 Oral)
Official code release of "DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation" [AAAI2025]
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
FLAME head tracker for single image or multi-view-image reconstruction and video-based tracking.
Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko,…
DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)
[CVPR 2024] Official repository for "Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians"
This codebase demonstrates how to synthesize realistic 3D character animations given an arbitrary speech signal and a static character mesh.
EVFHQ handles the downloading of video files from YouTube.
Official implementation of Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
A PyTorch implementation of the paper "Motion Flow Matching for Human Motion Synthesis and Editing"
A neural-network-based generative model for video-game character animations
Clean baseline implementation of PPO using an episodic TransformerXL memory
A complete head tracking pipeline from videos to NeRF/3DGS-ready datasets.
🔥🔥🔥 Set the world of 3D faces on fire with INFERNO 🔥🔥🔥
Official implementation for the SIGGRAPH Asia 2024 paper SPARK: Self-supervised Personalized Real-time Monocular Face Capture
[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar
😎 Awesome lists about Speech Emotion Recognition
MaTe3D: Mask-guided Text-based 3D-aware Portrait Editing
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting