Stars
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inte…
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
The world's largest GitHub Repository for LLMs + Robotics
[CVPR 2024] Official implementation of "VRP-SAM: SAM with Visual Reference Prompt"
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
[CVPR 2024] Code for "Improving the Generalization of Segmentation Foundation Model under Distribution Shift via Weakly Supervised Adaptation"
[ICCV 2023 R6D] PyTorch implementation of CNOS: A Strong Baseline for CAD-based Novel Object Segmentation based on Segmenting Anything and DINOv2
[CVPR2024] Code for "SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation".
[NeurIPS 2022] Official code repository for "TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition"
(ICCV2023) official repository for "Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation"