HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 6,877 520 Updated Dec 31, 2024

Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation

Python 4,468 639 Updated Dec 13, 2024

Realtime Voice Changer

Python 48 7 Updated Dec 7, 2024

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,236 291 Updated Nov 5, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,695 185 Updated Nov 14, 2024

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Python 1,083 56 Updated Dec 10, 2024

MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

Python 294 14 Updated Dec 25, 2024

Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"

1,384 55 Updated Nov 26, 2024

Build real-time multimodal AI applications 🤖🎙️📹

Python 4,454 514 Updated Dec 31, 2024

ChatTTS is a generative speech model for daily dialogue.

Python 20 Updated Nov 5, 2024

Add voice to your Ollama model. Supports real-time speech generation and streaming output from your LLM.

Go 7 1 Updated Mar 13, 2024

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 4,061 223 Updated Dec 12, 2024

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,565 645 Updated Aug 13, 2024

A framework to enable multimodal models to operate a computer.

Python 9,017 1,217 Updated Dec 19, 2024

[ECCV'24] Kalman-Inspired Feature Propagation for Video Face Super-Resolution

Python 327 15 Updated Aug 29, 2024

An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.

Python 570 82 Updated Aug 22, 2024

A bot that likes comments on TikTok videos.

Python 28 4 Updated Sep 27, 2021

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 13,389 1,288 Updated Dec 25, 2024

Video2Video Framework for ComfyUI

Python 62 3 Updated Aug 12, 2024

Real-time face swap and one-click video deepfake using only a single image

Python 41,980 6,163 Updated Dec 30, 2024

A generative speech model for daily dialogue.

Python 33,300 3,617 Updated Dec 3, 2024

Bring portraits to life via webcam!

Python 118 10 Updated Jul 17, 2024

Bring portraits to life via Monitor!

Python 261 37 Updated Aug 12, 2024

Bring portraits to life!

Python 48 6 Updated Oct 31, 2024

Bring portraits to life!

Python 13,472 1,443 Updated Nov 12, 2024

Bring portraits to life!

Python 22 1 Updated Jul 13, 2024

Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation

Python 419 31 Updated Oct 16, 2024

🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…

Go 27,415 2,055 Updated Dec 31, 2024