This package provides an easy way to apply policy gradient methods to an environment. It was designed to work with neural networks built in TensorFlow, but it can be extended to PyTorch models as well. A dedicated class parallelizes the algorithm across as many CPUs as desired.
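For example, here is a minimal sketch of the parallel rollout idea, not the package's actual API; run_episode below is a hypothetical stand-in for the real rollout code:

    import multiprocessing as mp
    import random

    def run_episode(seed):
        """Hypothetical rollout: returns a list of per-step rewards."""
        rng = random.Random(seed)
        return [rng.uniform(-1.0, 1.0) for _ in range(10)]

    if __name__ == "__main__":
        n_cpus = 4  # use as many CPUs as desired
        with mp.Pool(processes=n_cpus) as pool:
            episode_rewards = pool.map(run_episode, range(n_cpus))
        print(len(episode_rewards), "episodes collected in parallel")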
Rewards are discounted using a bit of linear algebra. The discount factor depends on the problem and on how strongly its score depends on previous actions.
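A minimal sketch of the usual discounting recursion R_t = r_t + gamma * R_{t+1}; the package's exact linear-algebra formulation lives in UtilsRewards.py and may differ in detail:

    def discount_rewards(rewards, gamma=0.95):
        """Propagate each reward backwards, scaled by the discount factor."""
        discounted = [0.0] * len(rewards)
        running = 0.0
        for t in reversed(range(len(rewards))):
            running = rewards[t] + gamma * running
            discounted[t] = running
        return discounted

    print(discount_rewards([0.0, 0.0, 1.0], gamma=0.9))  # [0.81, 0.9, 1.0]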
Workflow:
1. Let the agent explore the environment with the neural network.
2. Every N episodes:
   - Apply the reward discount policy to each episode.
   - Normalize the rewards, computing the statistics over all of the collected episodes.
   - Build a weighted average of the gradients using the normalized rewards.
   - Apply the averaged gradient to the neural network with the chosen optimizer.
3. Repeat until the target number of episodes is reached (a sketch of this cycle follows the list).
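A minimal NumPy sketch of one update cycle, using discount_rewards from above and hypothetical per-episode gradients; the real class computes TensorFlow gradients and applies them with the chosen optimizer:

    import numpy as np

    def normalize_rewards(discounted_episodes):
        """Normalize with mean/std computed over ALL episodes in the batch."""
        flat = np.concatenate(discounted_episodes)
        mean, std = flat.mean(), flat.std() + 1e-8
        return [(ep - mean) / std for ep in discounted_episodes]

    # Toy batch: per-episode discounted rewards and stand-in gradients.
    discounted = [np.array([0.81, 0.9, 1.0]), np.array([0.5, 1.0, 2.0])]
    gradients = [np.ones(3), 2 * np.ones(3)]  # hypothetical dL/dtheta per episode

    # Simplified per-episode weighting, for brevity.
    weights = [ep.sum() for ep in normalize_rewards(discounted)]
    avg_grad = sum(w * g for w, g in zip(weights, gradients)) / len(gradients)
    print(avg_grad)  # the optimizer would apply this averaged gradient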
Most of the functions in the Utils package are used as static methods of the BasePolicyGradient class (defined in the BasePolicyGradine.py file). The Tests package contains a very simple test to check that the class works correctly.
PolicyGradient/
|---- Base/
|     |--- __init__.py
|     |--- BasePolicyGradine.py
|---- Utils/
|     |--- __init__.py
|     |--- UtilsGrads.py
|     |--- UtilsRewards.py
|     |--- UtilsMetrics.py
|     |--- UtilsSaving.py
|---- Tests/
|     |--- __init__.py
|     |--- BasePolicyGradient--.py
|     |--- PolicyGradientParalel--.py
|
|---- PolicyGradient.py
|---- PolicyGradientParalel.py