Project Websites:
This repo contains the implementation of ACT, together with 2 simulated environments:
- Transfer Cube
- Bimanual Insertion.
You can train and evaluate ACT in simulation or on real hardware. For real hardware, you would also need to install ALOHA.
- act
- act
detr
Model definitions of ACT, modified from DETRpolicy.py
An adaptor for ACT policysim_env.py
Mujoco + DM_Control environments with joint space controlee_sim_env.py
Mujoco + DM_Control environments with EE space controlscripted_policy.py
Scripted policies for sim environmentsconstants.py
Constants shared across filesutils.py
Utils such as data loading and helper functions
- scripts
imitate_episodes.py
Train and Evaluate ACTrecord_sim_episodes.py
Record episodes using the simulatorvisualize_episodes.py
Save videos from a .hdf5 dataset
- act
There are two recommended ways to install ACT: using conda
or venv
.
Using venv
is preferred due its ease of use against frameworks like ROS.
sudo apt-get install python3-venv
python3 -m venv ~/aloha # creates a venv "aloha" in the home directory, can be created anywhere
source ~/aloha/bin/venv
pip install dm_control==1.0.14
pip install einops
pip install h5py
pip install ipython
pip install matplotlib
pip install mujoco==2.3.7
pip install opencv-python
pip install packaging
pip install pexpect
pip install pyquaternion
pip install pyyaml
pip install rospkg
pip install torch
pip install torchvision
cd /path/to/act/detr && pip install -e .
conda create -n aloha python=3.8.10
conda activate aloha
pip install dm_control==1.0.14
pip install einops
pip install h5py
pip install ipython
pip install matplotlib
pip install mujoco==2.3.7
pip install opencv-python
pip install packaging
pip install pexpect
pip install pyquaternion
pip install pyyaml
pip install rospkg
pip install torch
pip install torchvision
cd /path/to/act/detr && pip install -e .
conda env create --file=conda_env.yaml
To set up a new terminal, run:
source ~/aloha/bin/activate # or conda activate aloha
cd /path/to/act
We use sim_transfer_cube_scripted
task in the examples below.
Another option is sim_insertion_scripted
.
To generate 50 episodes of scripted data, run:
python3 record_sim_episodes.py \
--task_name sim_transfer_cube_scripted \
--dataset_dir <data save dir> \
--num_episodes 50
To can add the flag --onscreen_render
to see real-time rendering.
To visualize the episode after it is collected, run
python3 visualize_episodes.py \
--dataset_dir <data save dir> \
--episode_idx 0
To train ACT:
# Transfer Cube task
python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir <ckpt dir> \
--policy_class ACT \
--kl_weight 10 \
--chunk_size 100 \
--hidden_dim 512 \
--batch_size 8 \
--dim_feedforward 3200 \
--num_epochs 2000 \
--lr 1e-5 \
--seed 0
To evaluate the policy, run the same command but add --eval
.
This loads the best validation checkpoint.
The success rate should be around 90% for transfer cube, and around 50% for insertion.
To enable temporal ensembling, add flag --temporal_agg
.
Videos will be saved to <ckpt_dir>
for each rollout.
You can also add --onscreen_render
to see real-time rendering during evaluation.
For real-world data where things can be harder to model, train for at least 5000 epochs or 3-4 times the length after the loss has plateaued. Please refer to tuning tips for more info.
You can find all scripted/human demo for simulated environments here.
New: ACT tuning tips
TL;DR: if your ACT policy is jerky or pauses in the middle of an episode, just train for longer! Success rate and smoothness can improve way after loss plateaus.