Awesome Talking Face

This is a repository for organizing papers, code, and other resources related to talking face/head. Most papers link to the PDF hosted on "arXiv" or "OpenAccess". However, some papers require an academic license to access, for example those published in IEEE, Springer, and Elsevier journals.

🔆 This project is still ongoing; pull requests are welcome!!

If you have any suggestions (missing papers, new papers, key researchers, or typos), please feel free to edit the list and open a pull request. Even just letting me know the title of a paper is a big contribution. You can do this by opening an issue or contacting me directly via email.

⭐ If you find this repo useful, please star it!!!

2022.09 Update!

Thanks to everybody for the PRs! From now on, I'll occasionally include some papers about video-driven talking face generation, because I found that the community is starting to include video-driven methods in the scope of talking face generation, although this task was originally termed Face Reenactment.

So, if you are looking for video-driven talking face generation, I suggest you star this repo and also search for Face Reenactment; you'll find more there :)

One more thing: please correct me if you find that any paper listed as an arXiv paper has since been accepted to a conference or journal.

2021.11 Update!

I added a batch of papers that appeared in the past few months. In this repo, I originally intended to cover only audio-driven talking face generation work. However, I found several text-based research works that are also very interesting, so I included them here. Enjoy!

TO DO LIST

Papers

2D Video - Person independent

2024

VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization [arXiv 2024] Paper
PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation [arXiv 2024] Paper
IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation [arXiv 2024] Paper Project
INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations [arXiv 2024] Paper Project
MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation [arXiv 2024] Paper Code Project
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait [arXiv 2024] Paper Project
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks [arXiv 2024] Paper
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation [arXiv 2024] Paper Code
Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis [arXiv 2024] Paper
LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis [arXiv 2024] Paper Code Project
EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion [arXiv 2024] Paper
LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space [arXiv 2024] Paper Project
JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation [arXiv 2024] Paper Code
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models [arXiv 2024] Paper Code
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation [arXiv 2024] Paper Code ProjectPage
Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization [arXiv 2024] Paper
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting [arXiv 2024] Paper Code
3D-Aware Text-driven Talking Avatar Generation [ECCV 2024] Paper
LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details [arXiv 2024] Paper
TalkinNeRF: Animatable Neural Fields for Full-Body Talking Humans [ECCVW 2024] Paper
JoyHallo: Digital human model for Mandarin [arXiv 2024] Paper Code
JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation [BMVC 2024] Paper
StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads [TPAMI 2024] Paper
DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures [CVPRW 2024] Paper
EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion [arXiv 2024] Paper
SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model [arXiv 2024] Paper
SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing [arXiv 2024] Paper
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency [arXiv 2024] Paper ProjectPage
PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation [arXiv 2024] Paper
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention [arXiv 2024] Paper ProjectPage
TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation [arXiv 2024] Paper
S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis [arXiv 2024] Paper
FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model [arXiv 2024] Paper
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control [arXiv 2024] Paper ProjectPage Code
High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model [arXiv 2024] Paper
Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation [arXiv 2024] Paper
LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement [arXiv 2024] Paper ProjectPage Code
Learning Online Scale Transformation for Talking Head Video Generation [arXiv 2024] Paper
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions [arXiv 2024] Paper ProjectPage GitHub
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation [arXiv 2024] Paper ProjectPage GitHub
RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network [arXiv 2024] Paper
Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation [arXiv 2024] Paper
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation [arXiv 2024] Paper ProjectPage
Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement [arXiv 2024] Paper ProjectPage
Controllable Talking Face Generation by Implicit Facial Keypoints Editing [arXiv 2024] Paper
InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation [arXiv 2024] Paper ProjectPage
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text [arXiv 2024] Paper ProjectPage
Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation [arXiv 2024] Paper
SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space [arXiv 2024] Paper ProjectPage
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding [arXiv 2024] Paper Code ProjectPage
NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior [CVPR 2024 Workshop] Paper Code ProjectPage
Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation [CVPR 2024 Workshop] Paper
EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars [arXiv 2024] Paper ProjectPage
GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting [arXiv 2024] Paper
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time [arXiv 2024] Paper ProjectPage
THQA: A Perceptual Quality Assessment Database for Talking Heads [arXiv 2024] Paper Code
Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior [arXiv 2024] Paper Code ProjectPage
EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis [arXiv 2024] Paper Code ProjectPage
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations [arXiv 2024] Paper Code
MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation [arXiv 2024] Paper ProjectPage
Superior and Pragmatic Talking Face Generation with Teacher-Student Framework [arXiv 2024] Paper ProjectPage
X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention [arXiv 2024] Paper
Adaptive Super Resolution For One-Shot Talking-Head Generation [arXiv 2024] Paper
Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style [arXiv 2024] Paper
FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization [arXiv 2024] Paper
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio [arXiv 2024] Paper Code
Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis [CVPR 2024] Paper Code
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions [arXiv 2024] Paper ProjectPage Code
G4G: A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment [arXiv 2024] Paper
Context-aware Talking Face Video Generation [arXiv 2024] Paper
EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation [arXiv 2024] Paper ProjectPage Code
GPAvatar: Generalizable and Precise Head Avatar from Image(s) [ICLR 2024] Paper Code
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis [ICLR 2024] Paper
EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model [ICASSP 2024] Paper
CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer [WACV 2024] Paper Code

2023

VectorTalker: SVG Talking Face Generation with Progressive Vectorisation [arXiv 2023] Paper
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models [arXiv 2023] Paper ProjectPage
GMTalker: Gaussian Mixture based Emotional talking video Portraits [arXiv 2023] Paper ProjectPage
DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers [arXiv 2023] Paper
R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning [arXiv 2023] Paper
FT2TF: First-Person Statement Text-To-Talking Face Generation [arXiv 2023] Paper
VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior [arXiv 2023] Paper Code ProjectPage
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis [arXiv 2023] Paper Code ProjectPage
GAIA: Zero-shot Talking Avatar Generation [arXiv 2023] Paper
Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis [ICCV 2023] Paper ProjectPage Code
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation [ICCV 2023] Paper ProjectPage Code
MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions [ICCV 2023] Paper ProjectPage
ToonTalker: Cross-Domain Face Reenactment [ICCV 2023] Paper
Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation [ICCV 2023] Paper ProjectPage Code
EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation [ICCV 2023] Paper
Emotional Listener Portrait: Realistic Listener Motion Simulation in Conversation [ICCV 2023] Paper
Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions [arXiv 2023] Paper
Plug the Leaks: Advancing Audio-driven Talking Face Generation by Preventing Unintended Information Flow [arXiv 2023] Paper
Reprogramming Audio-driven Talking Face Synthesis into Text-driven [arXiv 2023] Paper
Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis [TCSVT 2023] Paper
Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks [arXiv 2023] Paper
Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis [arXiv 2023] Paper
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation [CVPR 2023] Paper Code
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation [CVPR 2023] Paper ProjectPage Code
Implicit Neural Head Synthesis via Controllable Local Deformation Fields [CVPR 2023] Paper
LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook [CVPR 2023] Paper
GANHead: Towards Generative Animatable Neural Head Avatars [CVPR 2023] Paper ProjectPage Code
Parametric Implicit Face Representation for Audio-Driven Facial Reenactment [CVPR 2023] Paper
Identity-Preserving Talking Face Generation with Landmark and Appearance Priors [CVPR 2023] Paper Code
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator [CVPR 2023] Paper ProjectPage Code
Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos [arXiv 2023] Paper ProjectPage
Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model [arXiv 2023] Paper
High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning [CVPR 2023] Paper
StyleLipSync: Style-based Personalized Lip-sync Video Generation [arXiv 2023] Paper ProjectPage Code
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation [arXiv 2023] Paper ProjectPage
High-Fidelity and Freely Controllable Talking Head Video Generation [CVPR 2023] Paper Project Page
One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field [CVPR 2023] Paper ProjectPage
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert [CVPR 2023] Paper Code
Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations [arXiv 2023] Paper
That's What I Said: Fully-Controllable Talking Face Generation [arXiv 2023] Paper ProjectPage
Emotionally Enhanced Talking Face Generation [arXiv 2023] Paper Code ProjectPage
A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation [MLSys Workshop 2023] Paper
TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles [arXiv 2023] Paper
FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions [ICME 2023] Paper
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder [arXiv 2023] Paper ProjectPage
OPT: One-shot Pose-Controllable Talking Head Generation [ICASSP 2023] Paper
DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions [ICASSP 2023] Paper Code ProjectPage
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis [ICLR 2023] Paper Code ProjectPage
OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [CVPR 2023] Paper Code
Style Transfer for 2D Talking Head Animation [arXiv 2023] Paper
READ Avatars: Realistic Emotion-controllable Audio Driven Avatars [arXiv 2023] Paper
On the Audio-visual Synchronization for Lip-to-Speech Synthesis [arXiv 2023] Paper
DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis [CVPR 2023] Paper
Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation [arXiv 2023] Paper ProjectPage
StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles [AAAI 2023] Paper Code
Audio-Visual Face Reenactment [WACV 2023] Paper ProjectPage Code

2022

Memories are One-to-Many Mapping Alleviators in Talking Face Generation [arXiv 2022] Paper ProjectPage
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers [SIGGRAPH Asia 2022] Paper
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors [arXiv 2022] Paper ProjectPage
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis [CVPR 2022] Paper ProjectPage
SPACE: Speech-driven Portrait Animation with Controllable Expression [arXiv 2022] Paper ProjectPage
Compressing Video Calls using Synthetic Talking Heads [BMVC 2022] Paper Project Page
Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement [arXiv 2022] Paper
StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation [arXiv 2022] Paper
Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [arXiv 2022] Paper
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model [SIGGRAPH 2022] Paper
Talking Head from Speech Audio using a Pre-trained Image Generator [ACM MM 2022] Paper
Latent Image Animator: Learning to Animate Images via Latent Space Navigation [ICLR 2022] Paper ProjectPage (note: this page has auto-play music) Code
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition [arXiv 2022] Paper ProjectPage Code
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis [ECCV 2022] Paper ProjectPage Code
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation [ECCV 2022] Paper ProjectPage Code
Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary [ICASSP 2022] Paper ProjectPage Code
StableFace: Analyzing and Improving Motion Stability for Talking Face Generation [arXiv 2022] Paper ProjectPage
Emotion-Controllable Generalized Talking Face Generation [IJCAI 2022] Paper
StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN [arXiv 2022] Paper Code ProjectPage
DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering [arXiv 2022] Paper
Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions [arXiv 2022] Paper
Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels [TMM 2022] Paper
Depth-Aware Generative Adversarial Network for Talking Head Video Generation [CVPR 2022] Paper ProjectPage Code
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning [CVPR 2022] Paper Code ProjectPage
Expressive Talking Head Generation with Granular Audio-Visual Control [CVPR 2022] Paper
Talking Face Generation with Multilingual TTS [CVPR 2022 Demo] Paper DemoPage
SyncTalkFace: Talking Face Generation with Precise Lip-syncing via Audio-Lip Memory [AAAI 2022] Paper

2021

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation [SIGGRAPH Asia 2021] Paper Code
Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis [ACMMM 2021] Paper Code
AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis [ICCV 2021] Paper Code
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning [ICCV 2021] Paper Code
Learned Spatial Representations for Few-shot Talking-Head Synthesis [ICCV 2021] Paper
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation [CVPR 2021] Paper Code ProjectPage
One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing [CVPR 2021] Paper
Audio-Driven Emotional Video Portraits [CVPR 2021] Paper Code
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person [arXiv 2021] Paper
Talking Head Generation with Audio and Speech Related Facial Action Units [BMVC 2021] Paper
Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion [IJCAI 2021] Paper
Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation [AAAI 2021] Paper
Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary [arXiv 2021] Paper Code

2020

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose [arXiv 2020] Paper Code
A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild [ACMMM 2020] Paper Code
Talking Face Generation with Expression-Tailored Generative Adversarial Network [ACMMM 2020] Paper
Speech Driven Talking Face Generation from a Single Image and an Emotion Condition [arXiv 2020] Paper Code
A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors [ICPR 2020] Paper
Everybody's Talkin': Let Me Talk as You Want [arXiv 2020] Paper
HeadGAN: Video-and-Audio-Driven Talking Head Synthesis [arXiv 2020] Paper
Talking-head Generation with Rhythmic Head Motion [ECCV 2020] Paper
Neural Voice Puppetry: Audio-driven Facial Reenactment [ECCV 2020] Paper Project Code
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis [CVPR 2020] Paper
Robust One Shot Audio to Video Generation [CVPRW 2020] Paper
MakeItTalk: Speaker-Aware Talking Head Animation [SIGGRAPH Asia 2020] Paper Code
FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis [AAAI 2020] Paper
Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose [AAAI 2020] Paper
Photorealistic Lip Sync with Adversarial Temporal Convolutional [arXiv 2020] Paper
Speech-Driven Facial Animation Using Polynomial Fusion of Features [arXiv 2020] Paper
Animating Face using Disentangled Audio Representations [WACV 2020] Paper

Before 2020

Realistic Speech-Driven Facial Animation with GANs [IJCV 2019] Paper ProjectPage
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models [ICCV 2019] Paper Code
Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss [CVPR 2019] Paper Code
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation [AAAI 2019] Paper Code ProjectPage
Lip Movements Generation at a Glance [ECCV 2018] Paper
X2Face: A network for controlling face generation using images, audio, and pose codes [ECCV 2018] Paper Code ProjectPage
Talking Face Generation by Conditional Recurrent Adversarial Network [IJCAI 2019] Paper Code
Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks [arXiv 2018] Paper
High-Resolution Talking Face Generation via Mutual Information Approximation [arXiv 2018] Paper
Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network [arXiv 2018] Paper
You said that? [BMVC 2017] Paper

2D Video - Person dependent

Continuously Controllable Facial Expression Editing in Talking Face Videos [TAFFC 2023] Paper Project Page
Synthesizing Obama: Learning Lip Sync from Audio [SIGGRAPH 2017] Paper Project Page
Photorealistic Adaptation and Interpolation of Facial Expressions Using HMMs and AAMs for Audio-Visual Speech Synthesis [ICIP 2017] Paper
HMM-Based Photo-Realistic Talking Face Synthesis Using Facial Expression Parameter Mapping with Deep Neural Networks [Journal of Computer and Communications 2017] Paper
ObamaNet: Photo-realistic lip-sync from text [arXiv 2017] Paper
A deep bidirectional LSTM approach for video-realistic talking head [Multimedia Tools Appl 2015] Paper
Photo-Realistic Expressive Text to Talking Head Synthesis [Interspeech 2013] Paper
Photo-Real Talking Head with Deep Bidirectional LSTM [ICASSP 2015] Paper
Expressive Speech-Driven Facial Animation [TOG 2005] Paper

3D Animation

Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters [arXiv 2024] Paper
One Shot, One Talk: Whole-body Talking Avatar from a Single Image [arXiv 2024] Paper Project
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts [arXiv 2024] Paper
Pose-Aware 3D Talking Face Synthesis using Geometry-guided Audio-Vertices Attention [TVCG 2024] Paper Code
MimicTalk: Mimicking a personalized and expressive 3D talking face in few minutes [NeurIPS 2024] Paper Code ProjectPage
ScanTalk: 3D Talking Heads from Unregistered Scans [ECCV 2024] Paper Code
Audio-Driven Emotional 3D Talking-Head Generation [arXiv 2024] Paper
Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads [arXiv 2024] Paper
3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy [arXiv 2024] Paper
ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE [arXiv 2024] Paper Code
KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding [ECCV 2024] Paper Code
EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention [arXiv 2024] Paper
DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation [arXiv 2024] Paper
JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model [arXiv 2024] Paper
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer [arXiv 2024] Paper
UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model [arXiv 2024] Paper
EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head [arXiv 2024] Paper
EmoFace: Audio-driven Emotional 3D Face Animation [arXiv 2024] Paper Code
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset [Interspeech 2024] Paper ProjectPage
3D Gaussian Blendshapes for Head Avatar Animation [SIGGRAPH 2024] Paper
CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation [arXiv 2024] Paper
GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting [arXiv 2024] Paper
Learn2Talk: 3D Talking Face Learns from 2D Talking Face [arXiv 2024] Paper ProjectPage
Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication [arXiv 2024] Paper
AnimateMe: 4D Facial Expressions via Diffusion Models [arXiv 2024] Paper
EmoVOCA: Speech-Driven Emotional 3D Talking Heads [arXiv 2024] Paper
FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models [CVPR 2024] Paper Code ProjectPage
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation [arXiv 2024] Paper
DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer [arXiv 2024] Paper Code
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance [arXiv 2024] Paper ProjectPage
EMOTE: Emotional Speech-Driven Animation with Content-Emotion Disentanglement [SIGGRAPH Asia 2023] Paper ProjectPage
PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features [arXiv] Paper
3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing [arXiv 2023] Paper Code ProjectPage
Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications [arXiv 2023] Paper
DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser [arXiv 2023] Paper
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models [arXiv 2023] Paper ProjectPage Code
Imitator: Personalized Speech-driven 3D Facial Animation [ICCV 2023] Paper ProjectPage Code
Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation [ICCV 2023] Paper
Semi-supervised Speech-driven 3D Facial Animation via Cross-modal Encoding [ICCV 2023] Paper
Audio-Driven 3D Facial Animation from In-the-Wild Videos [arXiv 2023] Paper ProjectPage
EmoTalk: Speech-driven emotional disentanglement for 3D face animation [ICCV 2023] Paper ProjectPage
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning [arXiv 2023] Paper Code ProjectPage
Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertices Attention [arXiv 2023] Paper
Learning Audio-Driven Viseme Dynamics for 3D Face Animation [arXiv 2023] Paper ProjectPage
CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior [CVPR 2023] Paper ProjectPage
Expressive Speech-driven Facial Animation with controllable emotions [arXiv 2023] Paper
Imitator: Personalized Speech-driven 3D Facial Animation [arXiv 2022] Paper ProjectPage
PV3D: A 3D Generative Model for Portrait Video Generation [arXiv 2022] Paper ProjectPage
Neural Emotion Director: Speech-preserving semantic control of facial expressions in “in-the-wild” videos [CVPR 2022] Paper Code
FaceFormer: Speech-Driven 3D Facial Animation with Transformers [CVPR 2022] Paper Code ProjectPage
LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization [CVPR 2021] Paper
MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement [ICCV 2021] Paper
AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis [ICCV 2021] Paper Code
3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head [arXiv 2021] Paper
Modality Dropout for Improved Performance-driven Talking Faces [ICMI 2020] Paper
Audio- and Gaze-driven Facial Animation of Codec Avatars [arXiv 2020] Paper
Capture, Learning, and Synthesis of 3D Speaking Styles [CVPR 2019] Paper
VisemeNet: Audio-Driven Animator-Centric Speech Animation [TOG 2018] Paper
Speech-Driven Expressive Talking Lips with Conditional Sequential Generative Adversarial Networks [TAC 2018] Paper
End-to-end Learning for 3D Facial Animation from Speech [ICMI 2018] Paper
Visual Speech Emotion Conversion using Deep Learning for 3D Talking Head [MMAC 2018]
A Deep Learning Approach for Generalized Speech Animation [SIGGRAPH 2017] Paper
Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion [TOG 2017] Paper
Speech-driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach [CVPR 2017]
Expressive Speech Driven Talking Avatar Synthesis with DBLSTM using Limited Amount of Emotional Bimodal Data [Interspeech 2016] Paper
Real-Time Speech-Driven Face Animation With Expressions Using Neural Networks [TONN 2012] Paper
Facial Expression Synthesis Based on Emotion Dimensions for Affective Talking Avatar [SIST 2010] Paper

Datasets & Benchmark

Responsive Listening Head Generation: A Benchmark Dataset and Baseline [ECCV 2022] Paper ProjectPage
TalkingHead-1KH Link
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV 2020] ProjectPage
VoxCeleb Link
LRW Link
LRS2 Link
GRID Link
CREMA-D Link
MMFace4D Link
DPCD Link Paper

Survey

A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing [arXiv 2024] Paper
From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications [arXiv 2023] Paper
Deep Learning for Visual Speech Analysis: A Survey [arXiv 2022] Paper
What comprises a good talking-head video generation?: A Survey and Benchmark [arXiv 2020] Paper

Colabs

Avatars4All: https://github.com/eyaler/avatars4all

JosephPai/Awesome-Talking-Face
