I found that current implementations of reinforcement learning algorithms are somewhat complicated, which makes it hard to get started.
Here are some classical reinforcement learning algorithms implemented in PyTorch. I tried to make them clean, robust, and unified, hoping to help you get started with RL quickly.
So far I have finished Q-learning, DQN, DDQN, PPO (discrete), PPO (continuous), TD3, SAC (continuous), and SAC (discrete). I will implement more in the future.
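To give a feel for the simplest algorithm in the list, here is a minimal sketch of tabular Q-learning in plain Python. It is not the code from this repo: the five-state chain environment, the `step` and `train` functions, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy dynamics (an assumption): reward 1.0 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run epsilon-greedy tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action,
            # with no bootstrapping from a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-terminal state.
greedy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy)
```

Roughly speaking, DQN and DDQN keep the same bootstrapped target but replace the table with a neural network and add a replay buffer and a target network.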
Pong | Enduro |
---|---|
DQN: Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with Deep Reinforcement Learning[J]. Computer Science, 2013.
Double DQN: Hasselt H V, Guez A, Silver D. Deep Reinforcement Learning with Double Q-learning[J]. Computer Science, 2015.
PPO: Proximal Policy Optimization Algorithms; Emergence of Locomotion Behaviours in Rich Environments
TD3: Fujimoto S, Hoof H V, Meger D. Addressing Function Approximation Error in Actor-Critic Methods[J]. 2018.
SAC:
Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]//International conference on machine learning. PMLR, 2018: 1861-1870.
Soft Actor-Critic Algorithms and Applications
Christodoulou P. Soft actor-critic for discrete action settings[J]. arXiv preprint arXiv:1910.07207, 2019.