Stars
This repository is the official implementation of "DisPose: Disentangling Pose Guidance for Controllable Human Image Animation"
Official repo for "IDArb: Intrinsic Decomposition for arbitrary number of input views and illuminations"
Code release for https://kovenyu.com/WonderWorld/
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
Python APIs for web automation, testing, and bypassing bot-detection.
A framework to enable autonomous android and computer use using any LLM (local or remote)
deepbeepmeep / HunyuanVideoGP
Forked from Tencent/HunyuanVideo
HunyuanVideo GP: Large Video Generation Model - GPU Poor version
Open source Claude Artifacts – built with Llama 3.1 405B
GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion
An image viewer and AI-assisted editing tool that helps with curating datasets for generative AI models, finetunes and LoRA.
A minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel AI SDK! Search with models like GPT-4o mini, GPT-4o and Claude 3.5 Sonnet (New)!
Learning Flow Fields in Attention for Controllable Person Image Generation
RooVetGit / Roo-Cline
Forked from cline/cline
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
A pipeline parallel training script for diffusion models.
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
[NeurIPS 2024] "Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis" official implementation.
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training
InspireMusic: A Unified Framework for Music, Song, Audio Generation.
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a se…
Official Implementation for paper: Negative Token Merging: Image-based Adversarial Feature Guidance
Custom Conditioning Delta (ConDelta) nodes for ComfyUI
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
ComfyUI wrapper of catvton-flux
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters totally), 2) Parameter-Efficient Training (49.57M parameters trainable) and 3) Simpl…
🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web without worrying about infrastructure.
A minimal and universal controller for FLUX.1.