An implementation of deep RL agents for both discrete and continuous control, with time-series measurements as inputs.
Supported network architectures: fully-connected, 1D-convolutional, and LSTM (on-policy algorithms only).
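For illustration, here is a minimal sketch of the three input-network choices applied to time-series measurements, written with `tf.keras`; the `build_trunk` helper, layer sizes, and window shape are assumptions for the example, not the repo's code.

```python
# Sketch only: three trunk architectures over time-series input of shape
# (window, n_features). Names and sizes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

def build_trunk(arch, window=32, n_features=8, hidden=64):
    inputs = tf.keras.Input(shape=(window, n_features))
    if arch == 'fully-connected':
        x = layers.Flatten()(inputs)
        x = layers.Dense(hidden, activation='relu')(x)
    elif arch == '1d-convolutional':
        x = layers.Conv1D(hidden, kernel_size=3, activation='relu')(inputs)
        x = layers.GlobalAveragePooling1D()(x)
    elif arch == 'lstm':  # used with on-policy algorithms only
        x = layers.LSTM(hidden)(inputs)
    else:
        raise ValueError(arch)
    return tf.keras.Model(inputs, x)

print(build_trunk('1d-convolutional').output_shape)  # (None, 64)
```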
Define all parameters in `config.ini`, then run `python3 main.py --config-path [path to config.ini]`. This command is also provided in `run.sh`.
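As a rough sketch of how the entry point might consume the `--config-path` argument, using only Python's standard `argparse` and `configparser`; the section and key names inside the actual `config.ini` are not shown here and the structure below is an assumption.

```python
# Sketch: parse --config-path and load config.ini (not the repo's main.py).
import argparse
import configparser

def load_config():
    parser = argparse.ArgumentParser()
    parser.add_argument('--config-path', required=True,
                        help='path to config.ini')
    args = parser.parse_args()

    config = configparser.ConfigParser()
    config.read(args.config_path)
    return config

if __name__ == '__main__':
    config = load_config()
    # e.g. print every section and key defined in config.ini
    for section in config.sections():
        print(section, dict(config[section]))
```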
A multi-processing implementation is available on the `multiprocess` branch, in which the global weights and local batches are maintained in queues. It is not as efficient as the multi-threading implementation, due to the potential lag between the generation and consumption of each local batch.
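A minimal sketch of the queue-based pattern described above, using `multiprocessing.Queue`: a worker process pulls the latest global weights, produces a local batch, and pushes it back for the learner to consume. The `worker`/`learner` names and the dummy update rule are illustrative, not the repo's implementation.

```python
# Sketch: global weights and local batches exchanged through queues.
import multiprocessing as mp

def worker(weight_q, batch_q, n_batches):
    for _ in range(n_batches):
        weights = weight_q.get()          # latest global weights (may lag)
        batch = [w + 1 for w in weights]  # stand-in for a rollout/local batch
        batch_q.put(batch)

def learner(weight_q, batch_q, n_batches):
    weights = [0.0]
    weight_q.put(weights)                 # publish initial global weights
    for _ in range(n_batches):
        batch = batch_q.get()             # consume a local batch
        weights = [w + b for w, b in zip(weights, batch)]  # dummy update
        weight_q.put(weights)             # publish updated global weights

if __name__ == '__main__':
    weight_q, batch_q = mp.Queue(), mp.Queue()
    n = 5
    p = mp.Process(target=worker, args=(weight_q, batch_q, n))
    p.start()
    learner(weight_q, batch_q, n)
    p.join()
```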
In the config file, the variable BASE DIR specifies the directory where results are written; on your machine this directory will contain the subfolders `log/` and `model/`.
To monitor progress with TensorBoard, run `python -m tensorflow.tensorboard --logdir=.`, which launches TensorBoard so training can be followed in a browser window. Some example plots are shown below.
Detailed config files are located under `./docs`.
continuous control | discrete control |
---|---|
Pendulum | Acrobot |
![]() | ![]() |
MountainCarContinuous | MountainCar |
![]() | ![]() |
- The default maximum episode length of 500 is too short for DDPG to explore a successful trace, so it is relaxed to 2000 in this comparison.
- The convergence comparison may not be meaningful, since it mostly depends on how fast the agent can explore a successful trace and obtain the sparse reward under a random behavior policy.
- DDPG training takes much longer, as the in-memory replay buffer grows during training (see the sketch below).
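For context on the last point, a minimal sketch of an in-memory replay buffer of the kind DDPG relies on; the capacity and tuple layout are assumptions for the example, not the repo's settings.

```python
# Sketch: a bounded in-memory replay buffer whose memory use grows with training.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)  # memory grows until capacity

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # uniform random minibatch for the off-policy update
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```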