Peking University
Stars
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
WebUI extension for ControlNet
Nightly release of ControlNet 1.1
Select a portrait, click to move the head around (please use your own space / GPU!)
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Visualize PyTorch tensors with a single line of code.
Fundamentals of Digital Media Technology (04713901) | Peking University ECE Course Materials
Code for "GVHMR: World-Grounded Human Motion Recovery via Gravity-View Coordinates", Siggraph Asia 2024
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
🔖 Curated list of video object segmentation (VOS) papers, datasets, and projects.
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
Official Implementation for "Cross-Image Attention for Zero-Shot Appearance Transfer"
Diffusion Model-Based Image Editing: A Survey (arXiv)
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
[ICLR 2024] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
[ICLR 2024] Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
[CVPR 2024] MotionEditor is the first diffusion-based model capable of video motion editing.
Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models (ICLR 2024)
[ECCV 2024] DragAnything: Motion Control for Anything using Entity Representation
Video-P2P: Video Editing with Cross-attention Control
[CVPR 2024] Official Implementation of "A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing"