Starred repositories
A latent text-to-image diffusion model
LAVIS - A One-stop Library for Language-Vision Intelligence
PyTorch code and models for the DINOv2 self-supervised learning method.
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Codebase for Aria - an Open Multimodal Native MoE
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
[ECCV 2024] Tokenize Anything via Prompting
This repo is designed for the General Robotic Operation System
The official repo for the paper "In-Context Imitation Learning via Next-Token Prediction"
PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models