PKU-YuanGroup
- Shenzhen
- https://shyuanbest.github.io/
- @shenghai_y55451
- https://huggingface.co/BestWishYsh
Stars
Blending Custom Photos with Video Diffusion Transformers
Memory-optimized training scripts for video models based on Diffusers
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
FastVideo is a lightweight framework for accelerating large video diffusion models.
A pipeline parallel training script for diffusion models.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?".
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Fundamentals of Digital Media Technology (04713901) | Peking University ECE Course Materials
Face analysis tools for modern research, equipped with state-of-the-art Face Parsing and Face Alignment
📹 A more flexible CogVideoX that can generate videos at any resolution and create videos from images.
Lightning-fast (~1s) and accurate drag-based image editing
Code and Data for "GenAI Arena: An Open Evaluation Platform for Generative Models" [NeurIPS 2024]
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
CLIP+MLP Aesthetic Score Predictor