Stars
CoTracker is a model for tracking any point (pixel) on a video.
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
A generative world for general-purpose robotics & embodied AI learning.
An updated version of Nishanth's pbrspot library that adds body and hand frames aligned with the real Spot services.
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
LogiCity@NeurIPS'24, D&B track. A multi-agent inductive learning environment for "abstractions".
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Summary of key papers and blogs about diffusion models to learn about the topic. Detailed list of all published diffusion robotics papers.
Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"
Official repository of "Learning to Act from Actionless Videos through Dense Correspondences".
This code corresponds to simulation environments used as part of the MimicGen project.
Code for Pyramidal Flow Matching for Efficient Video Generative Modeling
The official implementation of the video generation part of This&That: Language-Gesture Controlled Video Generation for Robot Planning (ICRA 2025)
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Stable Video Diffusion Training Code and Extensions.
Heterogeneous Pre-trained Transformer (HPT) as Scalable Policy Learner.
A list of awesome and popular robot learning environments
Official implementation of RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation
Pandora: Towards General World Model with Natural Language Actions and Video States
NeuroNCAP benchmark for end-to-end autonomous driving
[CVPR 2024 - Oral] Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences