Implementations of basic RL algorithms in minimal lines of code (PyTorch-based).
- Each algorithm is complete within a single file.
- Each file is short: most are about 100 to 150 lines of code.
- Every algorithm can be trained within 30 seconds, even without a GPU.
- The environment is fixed to "CartPole-v1", so you can focus on the implementations.

Algorithms:
- REINFORCE (67 lines)
- Vanilla Actor-Critic (98 lines)
- DQN (112 lines, including replay memory and target network)
- PPO (119 lines, including GAE)
- DDPG (145 lines, including OU noise and soft target update)
- A3C (129 lines)
- ACER (149 lines)
- A2C (188 lines)
- SAC (171 lines) added!!
- PPO-Continuous (161 lines) added!!
- Vtrace (137 lines) added!!
- Any suggestions?

Dependencies:
- PyTorch == 1.12.1
- OpenAI Gym == 0.25.2
- NumPy == 1.23.2
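
Several entries in the algorithm list mention supporting components: replay memory (DQN), soft target update (DDPG), and GAE (PPO). The sketch below illustrates those three ideas in plain Python, without PyTorch, to show how small each piece can be. It is not the code from the repository files; the names `ReplayBuffer`, `soft_update`, and `compute_gae`, and the default values for `gamma` and `lmbda`, are assumptions chosen for illustration.

```python
import collections
import random


class ReplayBuffer:
    """Fixed-size FIFO store of transitions (the "replay memory" idea)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = collections.deque(maxlen=capacity)

    def put(self, transition):
        self.buffer.append(transition)

    def sample(self, n):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), n)

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, source_params, tau):
    """Soft target update: target <- tau * source + (1 - tau) * target.

    Parameters are plain lists of floats here (an assumption for this sketch);
    with a neural network the same rule would be applied to each weight tensor.
    """
    return [tau * s + (1.0 - tau) * t
            for t, s in zip(target_params, source_params)]


def compute_gae(rewards, values, next_value, gamma=0.98, lmbda=0.95):
    """Generalized Advantage Estimation for one trajectory segment.

    values[t] is the critic's estimate for the state at step t, and next_value
    is the estimate for the state after the last step (pass 0.0 if the episode
    ended there). Assumes no episode boundary inside the segment.
    """
    values_ext = list(values) + [next_value]
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each advantage reuses the one computed after it.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values_ext[t + 1] - values_ext[t]
        running = delta + gamma * lmbda * running
        advantages[t] = running
    return advantages
```

For example, `compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0, gamma=0.5, lmbda=1.0)` accumulates the discounted TD errors backwards through the segment, so earlier steps receive larger advantages than later ones.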
# Works only with Python 3.
# e.g.
python3 REINFORCE.py
python3 actor_critic.py
python3 dqn.py
python3 ppo.py
python3 ddpg.py
python3 a3c.py
python3 a2c.py
python3 acer.py
python3 sac.py