forked from smarthi/rl4j

Deep Reinforcement Learning for the JVM (Deep-Q, A3C)

UdayKumarUK/rl4j

This branch is 163 commits ahead of smarthi/rl4j:master.

RL4J

RL4J is a reinforcement learning framework integrated with deeplearning4j and released under an Apache 2.0 open-source license. By contributing code to this repository, you agree to make your contribution available under an Apache 2.0 license.

  • DQN (Deep Q Learning with double DQN)
  • Async RL (A3C, Async NStepQlearning)

Both support low-dimensional (array of info) and high-dimensional (pixels) input.
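To make the "double DQN" variant listed above more concrete, here is a minimal plain-Java sketch of how a double DQN target is usually computed: the online network chooses the next action and the target network evaluates it. This is not RL4J's code; the class name, method names, and arrays of Q-values are hypothetical and only illustrate the general technique.

```java
// Illustrative sketch only; names are hypothetical and not part of RL4J.
final class DoubleDqnTargetSketch {

    /** Returns the index of the largest value in a non-empty array. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Double DQN target for one transition:
     * the online network selects the action, the target network scores it.
     */
    static double target(double reward, double gamma, boolean terminal,
                         double[] onlineQNext, double[] targetQNext) {
        if (terminal) {
            return reward; // no bootstrapping past the end of an episode
        }
        int selected = argMax(onlineQNext);
        return reward + gamma * targetQNext[selected];
    }

    public static void main(String[] args) {
        double[] onlineQNext = {1.0, 3.0, 2.0}; // online net prefers action 1
        double[] targetQNext = {0.5, 1.5, 4.0}; // target net scores action 1 as 1.5
        System.out.println(target(1.0, 0.5, false, onlineQNext, targetQNext)); // prints 1.75
    }
}
```

Plain DQN would instead take the maximum of `targetQNext` directly (4.0 here); decoupling selection from evaluation is what tends to reduce overestimated Q-values.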

DOOM

Cartpole

Here is a useful blog post I wrote to introduce you to reinforcement learning, DQN and Async RL:

Blog post

Examples

Cartpole example

Disclaimer

This is a tech preview, distributed as is. Comments are welcome on our Gitter channel: gitter

Quickstart

**INSTALL rl4j-api before installing everything else (see below)!**

  • mvn install -pl rl4j-api
  • [if you want rl4j-gym too] Download and mvn install: gym-java-client
  • mvn install

Visualisation

webapp-rl4j

Quicktry cartpole:

Doom

Doom support is not ready yet, but if you feel adventurous you can make it work with a few additional steps:

  • You will need vizdoom: compile the native lib and move it into a folder at the root of your project
  • export MAVEN_OPTS=-Djava.library.path=THEFOLDEROFTHELIB
  • mvn compile exec:java -Dexec.mainClass="YOURMAINCLASS"
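If the native library fails to load, one thing worth checking is whether the folder really ended up on `java.library.path`. The sketch below is a hypothetical helper (not part of RL4J or vizdoom) that inspects the property from inside the JVM; the folder name is the same placeholder used in the steps above.

```java
// Illustrative sketch only; this helper is hypothetical and not part of RL4J.
final class LibraryPathCheckSketch {

    /** True if the given folder is one of the entries of a library path string. */
    static boolean containsFolder(String libraryPath, String folder) {
        // Entries are separated by ':' on Unix-like systems and ';' on Windows.
        String[] entries = libraryPath.split(java.io.File.pathSeparator);
        return java.util.Arrays.asList(entries).contains(folder);
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path", "");
        // Placeholder from the steps above; pass the real folder as the first argument.
        String folder = args.length > 0 ? args[0] : "THEFOLDEROFTHELIB";
        System.out.println("java.library.path = " + libraryPath);
        System.out.println("contains " + folder + ": " + containsFolder(libraryPath, folder));
    }
}
```

The comparison is a plain string match, so a relative path and its absolute form are treated as different entries.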

Malmo (Minecraft)

Malmo

  • Download and unzip Malmo from here
  • export MALMO_HOME=YOURMALMO_FOLDER
  • export MALMO_XSD_PATH=$MALMO_HOME/Schemas
  • launch Malmo per its instructions
  • run with this main

WIP

  • Documentation
  • Serialization/Deserialization (load save)
  • Compression of pixels in order to store 1M states in a reasonable amount of memory
  • Async learning: A3C and nstep learning (requires some features missing from dl4j: calculating and applying gradients).
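As an illustration of what the pixel compression item could look like, here is a minimal sketch that stores each pixel as one byte instead of an 8-byte `double`. This is an assumption about one possible approach, not the planned RL4J implementation; the class and method names are hypothetical, and pixel intensities are assumed to be normalized to [0, 1].

```java
// Illustrative sketch only; not the planned RL4J implementation.
final class FrameCompressionSketch {

    /** Quantizes pixel intensities in [0, 1] to one unsigned byte each. */
    static byte[] compress(double[] pixels) {
        byte[] stored = new byte[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            double clamped = Math.max(0.0, Math.min(1.0, pixels[i]));
            stored[i] = (byte) Math.round(clamped * 255.0);
        }
        return stored;
    }

    /** Restores approximate intensities in [0, 1] from the stored bytes. */
    static double[] decompress(byte[] stored) {
        double[] pixels = new double[stored.length];
        for (int i = 0; i < stored.length; i++) {
            // Mask with 0xFF because Java bytes are signed.
            pixels[i] = (stored[i] & 0xFF) / 255.0;
        }
        return pixels;
    }

    public static void main(String[] args) {
        double[] frame = {0.0, 0.25, 0.5, 1.0};
        double[] restored = decompress(compress(frame));
        for (int i = 0; i < frame.length; i++) {
            System.out.println(frame[i] + " -> " + restored[i]);
        }
    }
}
```

With this scheme a stored state takes one eighth of the memory of a `double` array, at the cost of a quantization error of at most 1/510 per pixel.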

Author

Ruben Fiszel

Proposed contribution area:

  • Continuous control
  • Policy Gradient
  • Update gym-java-client when gym-http-api becomes compatible with pixel environments, to play with Pong, Doom, etc.

Languages

  • Java 99.1%
  • Shell 0.9%