Stars
Interact with your documents using the power of GPT, 100% privately, no data leaks
Official Code for DragGAN (SIGGRAPH 2023)
Open-Sora: Democratizing Efficient Video Production for All
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
🐍 Geometric Computer Vision Library for Spatial AI
Simple, unified interface to multiple Generative AI providers
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation
Count the MACs / FLOPs of your PyTorch model.
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
[WIP] Layer Diffusion for WebUI (via Forge)
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
FLOPs counter for convolutional networks in the PyTorch framework
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
A curated list of image inpainting and video inpainting papers and resources
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
[ECCV 2024] Video Foundation Models & Data for Multimodal Understanding
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
Paint by Example: Exemplar-based Image Editing with Diffusion Models
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
A minimal and universal controller for FLUX.1.
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention