Copyright © 2022 Intelligent Driving Laboratory (iDLab). All rights reserved.
Optimal control is an important theoretical framework for sequential decision-making and control of industrial objects, especially for complex, high-dimensional problems with strong nonlinearity, high randomness, and multiple constraints. Computing the optimal control input is the key to applying this framework to practical industrial problems. Taking Model Predictive Control (MPC) as an example, the control input is computed online by receding-horizon optimization, and the resulting computation time limits real-time performance, which greatly restricts the application and adoption of the method. To solve this problem, iDLab has developed a series of algorithms that solve for the optimal policy over the full state space, together with an application toolchain for industrial control, based on Reinforcement Learning and Approximate Dynamic Programming theory. The basic principle is to use an approximation function (such as a neural network) as the policy carrier and to improve online real-time performance by solving for the policy offline and applying it online. The GOPS toolchain will cover the main stages of the industrial control workflow, including control problem modeling, policy network training, offline simulation verification, and controller code deployment.

GOPS currently supports the following algorithms:
- Deep Q Network (DQN)
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed DDPG (TD3)
- Asynchronous Advantage Actor-Critic (A3C)
- Soft Actor-Critic (SAC)
- Distributional Soft Actor-Critic (DSAC)
- Trust Region Policy Optimization (TRPO)
- Proximal Policy Optimization (PPO)
- Infinite-Horizon Approximate Dynamic Programming (INFADP)
- Finite-Horizon Approximate Dynamic Programming (FHADP)
- Mixed Actor-Critic (MAC)
- Mixed Policy Gradient (MPG)
- Separated Proportional-Integral Lagrangian (SPIL)
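The "solve offline, apply online" principle described in the overview can be illustrated with a toy problem. The sketch below is not GOPS code and uses none of the algorithms listed above: it assumes a hypothetical scalar linear system `x' = a*x + b*u` with quadratic cost `q*x^2 + r*u^2`, and a linear gain stands in for the neural network as the policy carrier. All names and numbers are illustrative assumptions.

```python
def solve_gain_offline(a, b, q, r, iterations=200):
    """Offline phase: iterate the scalar Riccati recursion and return the gain K."""
    p = q  # initial guess for the quadratic value function V(x) = p * x^2
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def make_policy(gain):
    """Online phase: the policy is a cheap function evaluation, u = -K * x."""
    def policy(x):
        return -gain * x
    return policy


def rollout(a, b, policy, x0, steps):
    """Simulate the closed loop and return the final state."""
    x = x0
    for _ in range(steps):
        x = a * x + b * policy(x)
    return x


# Hypothetical, open-loop unstable system (a > 1) used only for illustration.
A, B, Q, R = 1.2, 1.0, 1.0, 1.0
K = solve_gain_offline(A, B, Q, R)   # done once, offline
policy = make_policy(K)              # evaluated at every step, online
x_final = rollout(A, B, policy, x0=1.0, steps=20)
```

All of the optimization cost is paid once in `solve_gain_offline`. At run time the controller only evaluates `policy(x)`, whereas a receding-horizon controller solves an optimization problem at every step.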
GOPS requires:
- Windows 7 or later, or Linux.
- Python 3.6 or later (GOPS V1.0 precompiled Simulink models use Python 3.8). We recommend using Python 3.8.
- (Optional) MATLAB/Simulink 2018a or later.
- The installation path must contain only English characters.
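Some of the requirements above can be checked before installing. The following is a minimal sketch, not part of GOPS: the function names are hypothetical, and it assumes that "English characters" means ASCII-only, which may be stricter than what GOPS actually needs.

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the requirements


def python_ok(version=None):
    """Return True if the given (major, minor) version meets the stated minimum."""
    if version is None:
        version = tuple(sys.version_info[:2])
    return tuple(version[:2]) >= MIN_PYTHON


def path_is_english(path):
    """Assumption: an 'English' path is one that contains only ASCII characters."""
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def check_requirements(install_path, version=None):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if not python_ok(version):
        problems.append("Python 3.6 or later is required")
    if not path_is_english(install_path):
        problems.append("installation path contains non-ASCII characters")
    return problems
```

For example, `check_requirements("/home/user/GOPS")` checks the running interpreter and the given path. The operating system and MATLAB/Simulink requirements are not covered by this sketch.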
You can install GOPS through the following steps:
- Clone the GOPS repository:
git clone https://github.com/Intelligent-Driving-Laboratory/GOPS.git
cd GOPS
- Create and activate the conda environment for your OS:
conda env create -f gops_environment.nix.yml # for Linux
conda activate gops
or
conda env create -f gops_environment.win.yml # for Windows
conda activate gops
- Install GOPS:
pip install -e .
- (Optional) If you plan to use the MPC-based optimal controller implemented in GOPS, install cyipopt:
conda install -c conda-forge cyipopt
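The OS-dependent step above can be expressed as a small helper that picks the environment file and lists the commands in order. This is a sketch rather than an official GOPS script: the function name is hypothetical, the file names are copied from the steps above, and the helper only builds the command list without running anything, because `conda activate` has to be executed in your own shell.

```python
import platform

# Environment file names as given in the installation steps.
ENV_FILES = {
    "Linux": "gops_environment.nix.yml",
    "Windows": "gops_environment.win.yml",
}


def conda_setup_commands(system=None):
    """Return the install commands for the given OS (default: the current OS)."""
    if system is None:
        system = platform.system()
    if system not in ENV_FILES:
        raise ValueError("no GOPS environment file listed for " + system)
    return [
        ["conda", "env", "create", "-f", ENV_FILES[system]],
        ["conda", "activate", "gops"],
        ["pip", "install", "-e", "."],
    ]
```

Calling `conda_setup_commands("Linux")` returns the three commands for Linux. Joining each inner list with spaces gives the lines to paste into a terminal from inside the cloned GOPS directory.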
The tutorials and API documentation are hosted on gops.readthedocs.io.
This is an example of running Finite-Horizon Approximate Dynamic Programming (FHADP) on the inverted double pendulum environment. Train the policy by running:
python example_train/fhadp/fhadp_mlp_idpendulum_serial.py
After training, test the policy by running:
python example_run/run_idp_fhadp.py
You can record a video by setting save_render=True in the test file. Here is a video of running a trained policy on the task:
idp.mp4
To make GOPS easier to use and to build a good community, we have set up a WeChat group for GOPS users. Interested users are invited to join by scanning the QR code below. Developers will answer users' questions about GOPS in the group and will fix problems in GOPS based on user feedback. New GOPS releases will also be announced in the group.
Thanks to all users for your support of GOPS and to all developers for your contributions to GOPS. Let's work together to make GOPS a valuable, easy-to-use, and popular software!