Jiahao Cui cuijh26

21 followers · 85 following

Fusion Lab
China
https://cuijh26.github.io/

Lists (1)

awesome

17 repositories

Stars

SkyworkAI / Vitron

NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Python 479 28 Updated Oct 20, 2024

andypinxinliu / GestureLSM

Python 9 1 Updated Feb 5, 2025

TencentARC / ColorFlow

The official implementation of the paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization"

Python 355 29 Updated Dec 23, 2024

hkust-nlp / simpleRL-reason

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 2,385 180 Updated Feb 7, 2025

Physical-Intelligence / openpi

Python 1,803 126 Updated Feb 9, 2025

Haochen-Wang409 / ross

[ICLR 2025] Reconstructive Visual Instruction Tuning

Python 52 2 Updated Jan 23, 2025

NExT-GPT / NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Python 3,410 343 Updated Nov 3, 2024

bytedance / CascadeV

DiT for VAE (and Video Generation)

Python 32 3 Updated Sep 2, 2024

Huage001 / LinFusion

Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"

Python 288 18 Updated Dec 23, 2024

VARGPT-family / VARGPT

VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model

Python 93 5 Updated Feb 9, 2025

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 502 26 Updated Feb 8, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 15,424 2,014 Updated Feb 1, 2025

thu-nics / ViDiT-Q

[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

Python 48 6 Updated Feb 10, 2025

xg-chu / GAGAvatar

[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar

Python 416 37 Updated Nov 29, 2024

bytedance / X-Dyna

[ArXiv 2024] X-Dyna: Expressive Dynamic Human Image Animation

Python 150 12 Updated Jan 30, 2025

lucidrains / transfusion-pytorch

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 936 41 Updated Feb 1, 2025

hao-ai-lab / FastVideo

FastVideo is a lightweight framework for accelerating large video diffusion models.

Python 980 59 Updated Feb 7, 2025

thuhcsi / S2G-MDDiffusion

Python 83 4 Updated Jul 8, 2024

wangzhiyaoo / SVFR

Official implementation of SVFR.

Python 699 65 Updated Jan 19, 2025

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,266 68 Updated Sep 27, 2024

showlab / videollm-online

VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)

Python 302 36 Updated Aug 15, 2024

dvlab-research / MagicMirror

Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers

102 3 Updated Jan 13, 2025

feizc / Ingredients

Blending Custom Photos with Video Diffusion Transformers

Python 42 1 Updated Jan 21, 2025

HolmesShuan / FireFlow-Fast-Inversion-of-Rectified-Flow-for-Image-Semantic-Editing

An 8-step inversion and 8-step editing process works effectively with the FLUX-dev model. (3x speedup with results that are comparable or even superior to baseline methods)

Python 222 12 Updated Jan 25, 2025

NVIDIA / Cosmos

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,413 464 Updated Jan 28, 2025

LMM101 / Awesome-Multimodal-Next-Token-Prediction

[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

338 9 Updated Jan 17, 2025

Lightning-AI / litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 11,490 1,148 Updated Feb 3, 2025

cumulo-autumn / StreamDiffusion

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 9,982 736 Updated Dec 4, 2024

fudan-generative-vision / hallo3

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

Python 1,022 135 Updated Jan 29, 2025

fudan-generative-vision / hallo2

Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation

Python 3,466 502 Updated Jan 24, 2025