open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,245 293 Updated Nov 5, 2024

continue-revolution / sd-webui-animatediff

AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI

Python 3,151 264 Updated Sep 22, 2024

Kosinkadink / ComfyUI-AnimateDiff-Evolved

Improved AnimateDiff for ComfyUI and Advanced Sampling Support

Python 2,877 214 Updated Jan 5, 2025

Stability-AI / stable-audio-tools

Generative models for conditional audio generation

Python 2,814 271 Updated Dec 27, 2024

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,713 185 Updated Nov 14, 2024

MrForExample / ComfyUI-3D-Pack

An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)

Python 2,549 262 Updated Dec 18, 2024

prs-eth / Marigold

[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Python 2,487 144 Updated Dec 14, 2024

elevenlabs / elevenlabs-python

The official Python API for ElevenLabs Text to Speech.

Python 2,317 273 Updated Dec 18, 2024

autonomousvision / sdfstudio

A Unified Framework for Surface Reconstruction

Python 1,999 190 Updated Jul 11, 2024

Blizaine

Lists (1)

🔮 Future ideas

Stars

xai-org / grok-1

geekan / MetaGPT

hacksider / Deep-Live-Cam

2noise / ChatTTS

hpcaitech / Open-Sora

w-okada / voice-changer

KwaiVGI / LivePortrait

bmaltais / kohya_ss

nerfstudio-project / nerfstudio

OthersideAI / self-operating-computer

voicepaw / so-vits-svc-fork

apple / ml-ferret

brycedrennan / imaginAIry

netease-youdao / EmotiVoice

LiheYoung / Depth-Anything

Tencent / HunyuanVideo

OpenTalker / video-retalking

fudan-generative-vision / champ

livekit / agents

fudan-generative-vision / hallo2

dreamgaussian / dreamgaussian

gpt-omni / mini-omni