OpenAI defines CartPole as solved "when the average reward is greater than or equal to 195.0 over 100 consecutive trials."
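That criterion is straightforward to check in code. Here's a minimal sketch, assuming you append each episode's total reward to a list as training runs:

```python
import numpy as np

def is_solved(episode_rewards, threshold=195.0, window=100):
    """CartPole-v0 counts as solved when the mean reward over the
    last 100 consecutive episodes reaches the threshold."""
    if len(episode_rewards) < window:
        return False
    return np.mean(episode_rewards[-window:]) >= threshold
```

The hyperparameters I used: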
gamma = 0.99
train_freq = 1 (step)
start_learning = 10
memory_size = 1000000
batch_size = 32
reset_every = 10 (terminated episodes)
epsilon = 1
epsilon_minimum = 0.05
epsilon_decay_rate = 0.9999
learning_rate = 0.001
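For context, here's a rough sketch of how these hyperparameters could be wired into a DQN-style loop. The epsilon schedule assumes a multiplicative per-step decay down to the floor, and the helper names are mine, not from any particular library:

```python
import random

gamma = 0.99              # discount factor
train_freq = 1            # train once per environment step
start_learning = 10       # collect this much experience before training starts
memory_size = 1_000_000   # replay buffer capacity
batch_size = 32
reset_every = 10          # sync target network every 10 terminated episodes
epsilon = 1.0
epsilon_minimum = 0.05
epsilon_decay_rate = 0.9999
learning_rate = 0.001

def select_action(q_values, epsilon, n_actions):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_values[a])

def decay_epsilon(epsilon):
    """Multiplicative per-step decay, clipped at the minimum."""
    return max(epsilon_minimum, epsilon * epsilon_decay_rate)
```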
- If you're training your agent with CartPole, set the total number of episodes to a small value (say, 1), then print out all the variables to check whether they're storing the correct information (see the first sketch after this list). If your environment takes image input (e.g., you're training on Breakout), switch to CartPole first, since CartPole's observations and outputs are easier to interpret.
- A nice reference for interpreting state observations and your agent's actions: OpenAI's documentation.
- After checking each variable, see if your program flow is correct. This can be done by inspecting your variables or by comparing against other working code.
- If you're confident that your variables and program flow are correct - congratulations! Adjust your hyperparameters and you'll get a rough sense of how each one changes performance.
- One key signal I use to judge whether my agent is learning effectively is the Q-value of each state. At the beginning, the Q-values may differ quite a bit (they depend on network initialization), with one action strictly dominating the other. As training goes on, the gap shrinks, and performance takes off once the Q-values of the actions get close to each other. Similarly, scrutinize the actual values of your key variables and you might gain insights from them (see the second sketch below) :).
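For the single-episode variable check, here's a minimal sketch using the classic Gym API (newer gymnasium versions return `(obs, info)` from `reset()` and a 5-tuple from `step()`); the random policy is just a stand-in for your agent:

```python
import gym

env = gym.make("CartPole-v0")
state = env.reset()
done = False
step = 0

while not done:
    action = env.action_space.sample()  # stand-in for your agent's policy
    next_state, reward, done, info = env.step(action)
    # Print everything so you can sanity-check shapes and values
    print(f"step={step} state={state} action={action} "
          f"reward={reward} done={done}")
    state = next_state
    step += 1
```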
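And for watching the Q-values, something like the helper below works. `q_network.predict` here is an assumed Keras-style interface; swap in whatever forward-pass call your network actually uses:

```python
import numpy as np

def log_q_values(q_network, state):
    """Hypothetical helper: print the Q-values a network assigns to a
    state, plus the gap between the highest and lowest action values.
    On CartPole, this gap tends to shrink as learning takes off."""
    q = q_network.predict(np.expand_dims(state, axis=0))[0]  # assumed API
    gap = float(np.max(q) - np.min(q))
    print(f"Q-values: {q}, max-min gap: {gap:.4f}")
```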