Streamlining the Transfer of Simulated Robot Learning to the Real-World


What is EAGERx

You can use EAGERx (Engine Agnostic Graph Environments for Robotics) to easily define new (Gymnasium compatible) environments with modular robot definitions.

It enables users to:

  • Define environments as graphs of nodes
  • Visualize these graph environments interactively in a GUI
  • Use a single graph environment both in reality and with various simulators

EAGERx explicitly addresses the differences in learning between simulation and reality, with native support for essential features such as:

  • Safety layers and various other state, action and time-scale abstractions
  • Delay simulation & domain randomization
  • Real-world reset routines
  • Synchronized parallel computation within a single environment
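
To make two of the features above concrete, here is a minimal, framework-independent sketch of delay simulation and domain randomization. It does not use the EAGERx API: `DelayedChannel`, `sample_pendulum_params`, and the parameter ranges are hypothetical names and values chosen only for illustration.

```python
import random
from collections import deque


class DelayedChannel:
    """Hypothetical helper: delivers each message `delay_steps` ticks after it was sent."""

    def __init__(self, delay_steps, initial):
        # Pre-fill the buffer so the first `delay_steps` reads return `initial`.
        self._buffer = deque([initial] * delay_steps)

    def step(self, message):
        # Enqueue the newest message and hand out the oldest one.
        self._buffer.append(message)
        return self._buffer.popleft()


def sample_pendulum_params(rng):
    """Domain randomization: draw new physical parameters for every episode."""
    return {
        "mass": rng.uniform(0.8, 1.2),      # illustrative range, not from EAGERx
        "length": rng.uniform(0.45, 0.55),  # illustrative range, not from EAGERx
    }


if __name__ == "__main__":
    channel = DelayedChannel(delay_steps=2, initial=0.0)
    # Commands arrive two ticks late: prints [0.0, 0.0, 1.0, 2.0]
    print([channel.step(u) for u in [1.0, 2.0, 3.0, 4.0]])
    print(sample_pendulum_params(random.Random(0)))
```

A policy trained against delayed, randomized dynamics of this kind is less likely to overfit to one idealized simulator, which is the motivation for supporting such features natively.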

Full documentation and tutorials are available here.

Sim2Real: Policies trained in simulation and evaluated zero-shot on real systems using EAGERx. The top left shows the successful transfer of a policy for the classic pendulum swing-up problem, and the top right shows the transfer of a policy for a box-pushing task. Below them, a policy lands a quadrotor on a moving, inclined platform.

Modular: The modular design of EAGERx allows users to create complex environments easily through composition.

GUI: Users can visualize their graph environment. Here we visualize the graph environment that we built in this tutorial. See the documentation for more information.

Live plotting: In robotics, it is crucial to monitor the robot's behavior during the learning process. Inter-node communication within EAGERx can be listened to externally, so any relevant information stream can be monitored on demand. See the documentation for more information.


Applications beyond RL: The modular design and unified software pipeline of the framework have utility beyond reinforcement learning. We explored two such instances: interactive language-conditioned imitation learning (left) and classical control with deep learning based perception in a swimming pool environment (right).

Installation

You can do a minimal installation of EAGERx with:

pip3 install eagerx

We provide other options (Docker, Conda) for installing EAGERx in the documentation.

Tutorials

The following tutorials are currently available in the form of Google Colabs:

Introduction to EAGERx

The solutions are available here.

Developer tutorials

The solutions are available here.

For more information see the docs or the eagerx_tutorials package.

Code Example

Below you can find a code example of environment creation and training using Stable-Baselines3. To run this code, you should install eagerx_tutorials, which can be done by running:

pip3 install eagerx_tutorials

Detailed explanation of the code can be found in this Colab tutorial.

import eagerx
from eagerx.backends.single_process import SingleProcess
from eagerx.wrappers import Flatten
from eagerx_tutorials.pendulum.objects import Pendulum
from eagerx_ode.engine import OdeEngine

import stable_baselines3 as sb3
import numpy as np
from typing import Dict


class PendulumEnv(eagerx.BaseEnv):
    def __init__(self, name: str, rate: float, graph: eagerx.Graph, engine: eagerx.specs.EngineSpec,
                 backend: eagerx.specs.BackendSpec):
        self.max_steps = 100
        self.steps = None
        super().__init__(name, rate, graph, engine, backend, force_start=True)

    def step(self, action: Dict):
        observation = self._step(action)
        self.steps += 1

        # Calculate reward and check if the episode is terminated
        th = observation["angle"][0]
        thdot = observation["angular_velocity"][0]
        u = float(action["voltage"])
        th -= 2 * np.pi * np.floor((th + np.pi) / (2 * np.pi))
        cost = th ** 2 + 0.1 * thdot ** 2 + 0.01 * u ** 2
        truncated = self.steps > self.max_steps
        terminated = False

        # Render
        if self.render_mode == "human":
            self.render()
        return observation, -cost, terminated, truncated, {}

    def reset(self, seed=None, options=None) -> tuple:
        # Returns (observation, info), matching the Gymnasium reset API
        states = self.state_space.sample()
        observation = self._reset(states)
        self.steps = 0
        # Render
        if self.render_mode == "human":
            self.render()
        return observation, {}

if __name__ == "__main__":
    rate = 30.0

    pendulum = Pendulum.make("pendulum", actuators=["u"], sensors=["theta", "theta_dot"], states=["model_state"])

    graph = eagerx.Graph.create()
    graph.add(pendulum)
    graph.connect(action="voltage", target=pendulum.actuators.u)
    graph.connect(source=pendulum.sensors.theta, observation="angle")
    graph.connect(source=pendulum.sensors.theta_dot, observation="angular_velocity")

    engine = OdeEngine.make(rate=rate)
    backend = SingleProcess.make()

    env = PendulumEnv(name="PendulumEnv", rate=rate, graph=graph, engine=engine, backend=backend)
    env = Flatten(env)

    model = sb3.SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=int(150 * rate))

    env.shutdown()
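
The reward computed in `step` can be checked in isolation. The sketch below restates the same arithmetic as standalone functions, using `math` instead of `numpy` so that it runs without extra dependencies. The names `wrap_angle` and `pendulum_reward` are introduced here for illustration and are not part of EAGERx.

```python
import math


def wrap_angle(th):
    """Map an angle in radians to the interval [-pi, pi)."""
    return th - 2 * math.pi * math.floor((th + math.pi) / (2 * math.pi))


def pendulum_reward(th, thdot, u):
    """Negative quadratic cost on the wrapped angle, angular velocity and input."""
    th = wrap_angle(th)
    cost = th ** 2 + 0.1 * thdot ** 2 + 0.01 * u ** 2
    return -cost


if __name__ == "__main__":
    # A full revolution is equivalent to zero angle, so it incurs no angle cost.
    print(pendulum_reward(2 * math.pi, 0.0, 0.0))
    # Half a revolution away from zero is the worst case for the angle term.
    print(pendulum_reward(math.pi, 0.0, 0.0))
```

Wrapping the angle before squaring it means the policy is penalized for its shortest angular distance to zero, not for the number of revolutions it has accumulated.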

Engines

EAGERx allows users to create engine-agnostic environments, so that a single environment can be used both in simulation and in reality. The following engines are available for training and evaluation:

Users can also create their own (custom) engines.
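
As a rough illustration of what engine agnosticism means, the sketch below runs one control loop against two interchangeable backends. It is a conceptual toy and not the EAGERx engine API: `Engine`, `ToySimEngine`, `RecordingEngine`, and `run_episode` are hypothetical names, and writing an actual custom engine should follow the EAGERx documentation.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical backend interface (not the EAGERx API)."""

    @abstractmethod
    def read_angle(self) -> float:
        ...

    @abstractmethod
    def apply_voltage(self, u: float) -> None:
        ...


class ToySimEngine(Engine):
    """Stand-in simulator: integrates a trivial first-order model."""

    def __init__(self, dt: float):
        self.dt = dt
        self.angle = 0.0

    def read_angle(self) -> float:
        return self.angle

    def apply_voltage(self, u: float) -> None:
        self.angle += self.dt * u


class RecordingEngine(Engine):
    """Stand-in for hardware: logs commands instead of driving a motor."""

    def __init__(self):
        self.commands = []

    def read_angle(self) -> float:
        return 0.0

    def apply_voltage(self, u: float) -> None:
        self.commands.append(u)


def run_episode(engine: Engine, steps: int) -> float:
    """Control loop that depends only on the Engine interface."""
    for _ in range(steps):
        error = 1.0 - engine.read_angle()
        engine.apply_voltage(error)  # proportional controller toward angle 1.0
    return engine.read_angle()


if __name__ == "__main__":
    print(run_episode(ToySimEngine(dt=0.5), steps=2))  # prints 0.75
    print(run_episode(RecordingEngine(), steps=3))     # prints 0.0
```

Because `run_episode` never refers to a concrete backend, swapping the simulator for the real system requires no change to the loop itself, which is the property the engine abstraction is meant to provide.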

Cite EAGERx

If you are using EAGERx for your scientific publications, please cite:

@article{vanderheijden2024eagerx,
  author={van der Heijden, Bas and Luijkx, Jelle and Ferranti, Laura and Kober, Jens and Babuska, Robert},
  journal={IEEE Robotics \& Automation Magazine}, 
  title={Engine Agnostic Graph Environments for Robotics (EAGERx): A Graph-Based Framework for Sim2real Robot Learning}, 
  year={2024},
  volume={},
  number={},
  pages={2-15},
  keywords={Robots;Engines;Robot sensing systems;Delays;Robot learning;Physics;Codes},
  doi={10.1109/MRA.2024.3433172}
}

Maintainers

EAGERx is currently maintained by Bas van der Heijden (@bheijden) and Jelle Luijkx (@jelledouwe).

How to contact us

For any questions, send an e-mail to [email protected].

Acknowledgements

EAGERx is funded by the OpenDR Horizon 2020 project.
