Stars
[ICLR25] High-performance Image Tokenizers for VAR and AR
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models
PyTorch code and models for V-JEPA self-supervised learning from video.
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
End-to-End Object Detection with Transformers
Code release for "Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild"
Nightly release of ControlNet 1.1
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Project for the optimization courses
An attempt to build a quadruped robot in MATLAB Simulink.
Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases