
Starred repositories
Code for Reinforcement Learning from Vision Language Foundation Model Feedback
Solve Visual Understanding with Reinforced VLMs
Benchmarking Knowledge Transfer in Lifelong Robot Learning
A very simple GRPO implementation for reproducing r1-like LLM thinking.
MichalZawalski / embodied-CoT
Forked from openvla/openvla
Embodied Chain of Thought: A robotic policy that reasons to solve the task.
Witness the aha moment of VLM with less than $3.
This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
The reinforcement learning training code for AgiBot X1.
Train LLaMA on a single A100 80G node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism.
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
Official code for the paper: Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
[IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
A simulation framework based on ROS2 and LLMs (like GPT) for robot interaction tasks in the era of large models
SAPIEN Manipulation Skill Framework, an open source GPU parallelized robotics simulator and benchmark, led by Hillbot, Inc.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.
Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
Ravens code, partially ported from Keras/TensorFlow to PyTorch.
Generating Robotic Simulation Tasks via Large Language Models
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.
TCC-IRoNL is a novel framework that leverages large language models (LLMs) and multi-modal vision-language models (VLMs) to enable ROS-based autonomous robots to interact with humans or other entit…
Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)