66 stars written in Python

Grok open release

Python 49,754 8,343 Updated Aug 30, 2024

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 46,407 5,518 Updated Dec 18, 2024

real time face swap and one-click video deepfake with only a single image

Python 42,086 6,178 Updated Jan 5, 2025

A generative speech model for daily dialogue.

Python 33,407 3,631 Updated Dec 3, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,978 2,258 Updated Dec 27, 2024

Realtime Voice Changer

Python 16,866 1,836 Updated Nov 14, 2024

Bring portraits to life!

Python 13,520 1,449 Updated Jan 1, 2025
Python 9,893 1,271 Updated Jan 3, 2025

A collaboration friendly studio for NeRFs

Python 9,725 1,334 Updated Jan 3, 2025

A framework to enable multimodal models to operate a computer.

Python 9,045 1,218 Updated Dec 19, 2024

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,841 1,183 Updated Dec 23, 2024
Python 8,532 506 Updated Oct 9, 2024

Pythonic AI generation of images and videos

Python 8,024 450 Updated Sep 22, 2024

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,574 645 Updated Aug 13, 2024

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 7,176 550 Updated Jul 17, 2024

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 7,059 532 Updated Jan 2, 2025

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 6,795 995 Updated Aug 5, 2024

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Python 4,797 601 Updated Jul 10, 2024

Build real-time multimodal AI applications 🤖🎙️📹

Python 4,513 523 Updated Jan 4, 2025

Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation

Python 4,481 642 Updated Dec 13, 2024

[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation

Python 4,019 360 Updated Jan 2, 2024

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,245 293 Updated Nov 5, 2024

AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI

Python 3,151 264 Updated Sep 22, 2024

Improved AnimateDiff for ComfyUI and Advanced Sampling Support

Python 2,877 214 Updated Jan 5, 2025

Generative models for conditional audio generation

Python 2,814 271 Updated Dec 27, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,713 185 Updated Nov 14, 2024

An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)

Python 2,549 262 Updated Dec 18, 2024

[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Python 2,487 144 Updated Dec 14, 2024

The official Python API for ElevenLabs Text to Speech.

Python 2,317 273 Updated Dec 18, 2024

A Unified Framework for Surface Reconstruction

Python 1,999 190 Updated Jul 11, 2024