Abstractions, algorithms, and utilities for reinforcement learning in Julia


Reinforce (WIP)

Reinforce.jl is an interface for reinforcement learning in Julia. It is intended to connect modular environments, policies, and solvers through a simple, common API.


Packages which build on Reinforce:


New environments are created by subtyping AbstractEnvironment and implementing a few methods:

  • reset!(env)
  • actions(env, s) --> A
  • step!(env, s, a) --> r, s′
  • finished(env, s′)

and optional overrides:

  • state(env) --> s
  • reward(env) --> r

which fall back to returning the fields env.state and env.reward when not overridden.
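The four required methods above can be sketched with a toy environment. The following is a minimal, hypothetical example (a 1-D random walk, not part of the package); with the package installed you would `using Reinforce` and extend `Reinforce.reset!`, `Reinforce.actions`, and so on:

```julia
# Stand-in for the abstract type Reinforce exports, so the sketch runs standalone.
abstract type AbstractEnvironment end

# A hypothetical 1-D walk: start at 0, actions move left or right,
# and the episode ends when the walker reaches ±5.
mutable struct WalkEnv <: AbstractEnvironment
    state::Int
    reward::Float64
end
WalkEnv() = WalkEnv(0, 0.0)

function reset!(env::WalkEnv)
    env.state = 0
    env.reward = 0.0
    env
end

actions(env::WalkEnv, s) = (-1, 1)

function step!(env::WalkEnv, s, a)
    env.state = s + a
    env.reward = env.state == 5 ? 1.0 : 0.0   # reward only at the right edge
    env.reward, env.state                      # returns (r, s′)
end

finished(env::WalkEnv, s′) = abs(s′) >= 5
```

Because the environment stores `state` and `reward` fields, the optional `state`/`reward` accessors need no override here.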

TODO: more details and examples


Agents/policies are created by subtyping AbstractPolicy and implementing action. The built-in random policy is a short example:

struct RandomPolicy <: AbstractPolicy end
action(policy::RandomPolicy, r, s′, A′) = rand(A′)

The action method maps the last reward and current state to the next chosen action: (r, s′) --> a′.
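A stateful policy works the same way. The following is a hypothetical "sticky" policy (not part of the package) that repeats its previous action with probability p, illustrating how a policy can carry state between calls to action:

```julia
# Stand-in for the abstract type Reinforce exports, so the sketch runs standalone.
abstract type AbstractPolicy end

# Hypothetical policy: repeat the previous action with probability p,
# otherwise draw a fresh random action from the available set A′.
mutable struct StickyPolicy <: AbstractPolicy
    p::Float64
    last::Any
end
StickyPolicy(p = 0.8) = StickyPolicy(p, nothing)

function action(policy::StickyPolicy, r, s′, A′)
    if policy.last !== nothing && policy.last in A′ && rand() < policy.p
        a = policy.last          # stick with the previous action
    else
        a = rand(A′)             # explore: pick a fresh action
    end
    policy.last = a
    a
end
```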


Iterate through episodes using the Episode iterator. The convenience method episode! demonstrates this:

function episode!(env, policy = RandomPolicy(); stepfunc = on_step, kw...)
	ep = Episode(env, policy; kw...)
	for sars in ep
		stepfunc(env, ep.niter, sars)
	end
	ep.total_reward, ep.niter
end

Each step of the episode yields the 4-tuple (s, a, r, s′); whether the reward is written r or r′ is a matter of convention.
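Under the hood, the Episode iterator repeatedly asks the policy for an action, steps the environment, and stops once finished returns true. A hand-rolled sketch of that loop, using trivial stand-ins rather than Reinforce's actual Episode type, looks like:

```julia
# Trivial stand-ins (not Reinforce's types): a countdown "environment" and a
# policy function that always answers :tick, just to show the shape of the loop.
mutable struct Countdown; state::Int; end
finished(env::Countdown, s′) = s′ <= 0
step!(env::Countdown, s, a) = (env.state = s - 1; (1.0, env.state))
choose(r, s′, A′) = :tick

function run_episode(env::Countdown)
    total, niter = 0.0, 0
    s = env.state
    while !finished(env, s)
        a = choose(total, s, (:tick,))
        r, s′ = step!(env, s, a)   # each step produces the 4-tuple (s, a, r, s′)
        total += r
        niter += 1
        s = s′
    end
    total, niter                   # mirrors ep.total_reward, ep.niter
end
```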
