Stars
A collection of design patterns/idioms in Python
LAVIS - A One-stop Library for Language-Vision Intelligence
Automate docker image building without learning Dockerfiles
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
PyTorch Deformable Convolutional Networks v2 (no compile required)
Use Deformable Convolution without the pain
A Windows/macOS/Linux GUI based on Clash
DCNv2 with support for recent PyTorch versions, such as torch 1.5+ (now 1.8+)
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training
[CVPR2023] MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
This is an official implementation of our CVPR 2023 paper "Human Pose as Compositional Tokens" (https://arxiv.org/pdf/2303.11638.pdf)
UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
A collection of papers and codes for human pose transfer
(ICCV'21) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing" by Aiyu Cui, Daniel McKee and Svetlana Lazebnik
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Official Code for DragGAN (SIGGRAPH 2023)
[CVPR2023] A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images.
Official repository accompanying a CVPR 2022 paper EMOCA: Emotion Driven Monocular Face Capture And Animation. EMOCA takes a single image of a face as input and produces a 3D reconstruction. EMOCA …
DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)
Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
High-Resolution Image Synthesis with Latent Diffusion Models
Stable Diffusion web UI
A latent text-to-image diffusion model