Lists (27)
App
AR
Awesome
Course
current
Datasets
Detection & Segmentation
Draw
Face
Foundation Model
Framework (deep learning framework)
Generative Models
HOI
html
Language Models
LLM
Low-level
Neural Render
Paper List
Retrieval
Roadmaps
self-supervise
SR
TeX
Tools
Video Deblurring
Webpage
Stars
Universal Monocular Metric Depth Estimation
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
Janus-Series: Unified Multimodal Understanding and Generation Models
[ECCV'24] Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint
verl: Volcano Engine Reinforcement Learning for LLMs
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
[CVPR 2025] Official repo for ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
[CVPR 2025] Official inference code for the paper "BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation". Project page: https://bizgen-msra.github.io/
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
Vector (and Scalar) Quantization, in PyTorch
Official code for VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control
A framework streamlining Training, Finetuning, Evaluation and Deployment of Multi Modal Language models
Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
Taming Transformers for High-Resolution Image Synthesis
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
Rectified Flow Inversion (RF-Inversion) - ICLR 2025
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Official testing code for the baseline method presented at the CVPR 2023 NTIRE Real-Time 4K Super-Resolution Challenge. The model and pre-trained checkpoints are provided.
A linear estimator on top of CLIP to predict the aesthetic quality of pictures