
Starred repositories
Enable HiDPI on macOS with a native settings option.
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
[CVPR 25 (Highlight)] RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
[CVPR 2025] Mr. DETR: Instructive Multi-Route Training for Detection Transformers
NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
A large-scale benchmark and learning environment.
Extended LaTeX template for CVPR/ICCV papers
YOLO 3D Object Detection for Autonomous Driving Vehicle
OpenEMMA, a permissively licensed open source "reproduction" of Waymo’s EMMA model.
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
Solve Visual Understanding with Reinforced VLMs
[IV2024] MultiCorrupt: A benchmark for robust multi-modal 3D object detection, evaluating LiDAR-Camera fusion models in autonomous driving. Includes diverse corruption types (e.g., misalignment, mi…
This project is an experiment record manager for Python based on SQLite DMS. It helps you efficiently save your experiment settings and results for later analysis.
Official code of "Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble"
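PagePlug is a localization of Appsmith for China. Built on Appsmith, it optimizes overall performance and adds a Chinese translation, and it also integrates Formily components (a form solution), Echarts components (a charting solution), low-code mini-program development, and more. It is an open-source, declarative, full-stack low-code platform aimed at developers; the project logic lives mainly in the front-end interpreter and designer.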
HE-Drive: Human-Like End-to-End Driving with Vision Language Models
[NeurIPS 2024] SMART: Scalable Multi-agent Real-time Motion Generation via Next-token Prediction
[ECCV 2024] SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras
Flash Attention in ~100 lines of CUDA (forward pass only)
A unified interface to many trajectory forecasting datasets.