Stars

Avatar

53 repositories

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…

Python 2,317 383 Updated Jan 8, 2025

MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code

Python 552 67 Updated Oct 16, 2024

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 10,554 1,027 Updated Feb 11, 2025

Multilingual Voice Understanding Model

Python 4,414 390 Updated Jan 8, 2025

Interface for OuteTTS models.

Python 920 79 Updated Feb 14, 2025

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 2,557 130 Updated Feb 4, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 9,613 1,290 Updated Feb 14, 2025

Open source inference code for Rev's model

Python 372 25 Updated Jan 17, 2025

SenseVoice-python: An enterprise-grade open-source multi-language ASR system from FunASR, running on onnxruntime

Python 83 12 Updated Sep 24, 2024

Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms

Python 2,233 219 Updated Feb 14, 2025

Real-time voice interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Appearance and voice are customizable with no training required, voice cloning is supported, and first-packet latency is as low as 3s.

Python 673 96 Updated Nov 15, 2024

Diffusion-based Portrait and Animal Animation

Python 653 57 Updated Jan 13, 2025

EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation

Python 2,703 317 Updated Jan 27, 2025

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs

Python 11,334 2,381 Updated Feb 10, 2025

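This project implements Wav2lip video lip synthesis based on SadTalker. It generates lip shapes driven by speech from a video file, and applies a configurable enhancement to the face region to sharpen the synthesized lip (face) area. It uses DAIN, a deep-learning frame-interpolation algorithm, to fill in frames of the generated video, smoothing the lip-motion transitions between frames so that the synthesized lips look more fluid, realistic, and natural.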

Python 1,928 333 Updated Jun 4, 2023

Real time interactive streaming digital human

Python 4,530 662 Updated Feb 7, 2025

An ultra-lightweight digital human model that can run in real time on mobile devices

Python 1,510 222 Updated Nov 13, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 40,449 4,520 Updated Feb 14, 2025

[ICLR 2025 Oral] TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation

Python 895 112 Updated Oct 29, 2024

Voice activity detector (VAD) for the browser with a simple API

TypeScript 1,080 167 Updated Jan 19, 2025

Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guid…

JavaScript 9,898 1,872 Updated Feb 13, 2025

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Python 3,439 435 Updated Nov 27, 2024

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Python 12,314 2,291 Updated Jun 26, 2024

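📣 A commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese and English speech. A cross-platform implementation of ASR inference, based on ONNXRuntime and FunASR. We provide a set of easier APIs to call ASR models.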

C++ 525 62 Updated May 15, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 8,093 838 Updated Feb 13, 2025

A digital human that everyone can use

Python 984 207 Updated Feb 9, 2025

Awesome Digital Human

TypeScript 1,166 120 Updated Jan 8, 2025

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 4,830 600 Updated Jul 2, 2024

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

Python 1,642 240 Updated Oct 18, 2024

PersonaTalk Hack

Python 13 Updated Jan 10, 2025