Hands-on Reinforcement Learning (动手学强化学习)

gymnasium==0.29.1

https://www.zhihu.com/people/zoujiu1

https://zoujiu.blog.csdn.net/

Chapter 7 DQN Algorithm

blogs

https://zhuanlan.zhihu.com/p/656515516 Reading notes on Hands-on Reinforcement Learning, Chapter 7: DQN algorithm

NumPy implementation of Chapter 7 DQN Algorithm

Differences: epsilon is decayed with self.epsilon = self.epsilon * 0.996 if self.epsilon > 0.0001 else 0.0001, the network is implemented in NumPy, among other changes.
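The decay rule quoted above can be tried in isolation. The following is only a minimal sketch: the class name EpsilonSchedule, the helper epsilon_greedy, and the starting value epsilon=1.0 are assumptions made for illustration and are not taken from chapter7_DQN.py.

```python
import random


class EpsilonSchedule:
    """Multiplicative epsilon decay with a floor (hypothetical helper)."""

    def __init__(self, epsilon=1.0, decay=0.996, floor=0.0001):
        # epsilon=1.0 is an assumed starting value; decay and floor come from the rule above.
        self.epsilon = epsilon
        self.decay = decay
        self.floor = floor

    def step(self):
        # Same expression as the quoted rule: shrink while above the floor,
        # otherwise pin epsilon to the floor. The value can dip slightly below
        # the floor for a single step before it is pinned on the next call.
        self.epsilon = self.epsilon * self.decay if self.epsilon > self.floor else self.floor
        return self.epsilon


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy index."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    schedule = EpsilonSchedule()
    for _ in range(3000):
        schedule.step()
    print(schedule.epsilon)
    print(epsilon_greedy([0.1, 0.9, 0.3], schedule.epsilon))
```

With a factor of 0.996, epsilon needs roughly 2300 calls to fall from 1.0 to the 0.0001 floor, so exploration fades over a few thousand updates rather than a few dozen.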

run

python ./numpy_RL_reinforcement_learning/chapter7_DQN.py
python ./chapter7.py

Chapter 8 Improved DQN Algorithms

blogs

https://zhuanlan.zhihu.com/p/656614302 Reading notes on Hands-on Reinforcement Learning, Chapter 8: improved DQN algorithms

NumPy implementation of Chapter 8 Improved DQN Algorithms

run

python ./numpy_RL_reinforcement_learning/chapter8.py
python ./chapter8.py

Chapter 9 Policy Gradient Algorithm

blogs

https://zhuanlan.zhihu.com/p/656835865 Reading notes on Hands-on Reinforcement Learning, Chapter 9: policy gradient algorithm

NumPy implementation of Chapter 9 Policy Gradient Algorithm

Differences: epsilon is decayed with self.epsilon = self.epsilon * 0.996 if self.epsilon > 0.0001 else 0.0001, the network is implemented in NumPy, among other changes.

run

python ./numpy_RL_reinforcement_learning/chapter9_策略梯度算法.py
python ./chapter9.py

Chapter 10 Actor-Critic Algorithm

blogs

https://zhuanlan.zhihu.com/p/657087227 Reading notes on Hands-on Reinforcement Learning, Chapter 10: Actor-Critic algorithm

NumPy implementation of Chapter 10 Actor-Critic Algorithm

run

python ./numpy_RL_reinforcement_learning/chapter10_Actor-Critic算法.py
python ./chapter10.py

Chapter 11 TRPO Algorithm

blogs

https://zhuanlan.zhihu.com/p/657548225 Reading notes on Hands-on Reinforcement Learning, Chapter 11: TRPO algorithm

Chapter 12 PPO Algorithm

blogs

https://zhuanlan.zhihu.com/p/658299943 Reading notes on Hands-on Reinforcement Learning, Chapter 12: PPO algorithm

NumPy implementation of Chapter 12 PPO Algorithm

run

python ./numpy_RL_reinforcement_learning/chapter12.py
python ./chapter12.py

Chapter 13 DDPG Algorithm

blogs

https://zhuanlan.zhihu.com/p/658460643 Reading notes on Hands-on Reinforcement Learning, Chapter 13: DDPG algorithm

NumPy implementation of Chapter 13 DDPG Algorithm

run

python ./numpy_RL_reinforcement_learning/chapter13.py
python ./chapter13.py

Chapter 14 SAC Algorithm

blogs

https://zhuanlan.zhihu.com/p/658560149 Reading notes on Hands-on Reinforcement Learning, Chapter 14: SAC algorithm

Chapter 15 Imitation Learning

blogs

https://zhuanlan.zhihu.com/p/658591567 Hands-on Reinforcement Learning, Chapter 15: imitation learning

NumPy implementation of Chapter 15 Imitation Learning

run

python ./numpy_RL_reinforcement_learning/chapter15.py
python ./chapter15.py

Chapter 16 Model Predictive Control

blogs

https://zhuanlan.zhihu.com/p/658777952? Hands-on Reinforcement Learning, Chapter 16: model predictive control

Chapter 17 Model-Based Policy Optimization

blogs

https://zhuanlan.zhihu.com/p/658972982? Hands-on Reinforcement Learning, Chapter 17: model-based policy optimization

Chapter 18 Offline Reinforcement Learning

blogs

https://zhuanlan.zhihu.com/p/659029814 Hands-on Reinforcement Learning, Chapter 18: offline reinforcement learning

Chapter 19 Goal-Oriented Reinforcement Learning

blogs

https://zhuanlan.zhihu.com/p/659106430 Hands-on Reinforcement Learning, Chapter 19: goal-oriented reinforcement learning

NumPy implementation of Chapter 19 Goal-Oriented Reinforcement Learning

run

python ./numpy_RL_reinforcement_learning/chapter19.py
python ./chapter19.py

Chapter 20 Introduction to Multi-Agent Reinforcement Learning

blogs

https://zhuanlan.zhihu.com/p/659106430 Hands-on Reinforcement Learning, Chapter 19: goal-oriented reinforcement learning

NumPy implementation of Chapter 20 Introduction to Multi-Agent Reinforcement Learning

run

python ./numpy_RL_reinforcement_learning/chapter20.py
python ./chapter20.py

Chapter 21 Advanced Multi-Agent Reinforcement Learning

blogs

https://zhuanlan.zhihu.com/p/659244178 Hands-on Reinforcement Learning, Chapter 21: advanced multi-agent reinforcement learning

-------------------------------------original readme---------------------------------

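Tips: If you hit an error when running code that uses a gym environment, try installing this version of the gym library with pip install gym==0.18.3. If the problem persists, feel free to submit an issue!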
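The tip above pins gym==0.18.3, while the top of this README pins gymnasium==0.29.1. The two libraries return different shapes from reset() and step(), which is a common source of unpacking errors when a script written for one is run against the other. The sketch below shows one way to normalize both shapes. The helper names unpack_step and unpack_reset are hypothetical and do not come from this repository or from either library.

```python
def unpack_step(result):
    """Normalize an env.step() result to (obs, reward, done, info).

    Older gym releases return (obs, reward, done, info).
    gymnasium returns (obs, reward, terminated, truncated, info).
    """
    if len(result) == 5:
        obs, reward, terminated, truncated, info = result
        # Treat either end condition as the end of the episode.
        return obs, reward, terminated or truncated, info
    obs, reward, done, info = result
    return obs, reward, done, info


def unpack_reset(result):
    """Return only the observation from an env.reset() result.

    gymnasium returns (obs, info) with info as a dict; older gym returns obs alone.
    """
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    return result


if __name__ == "__main__":
    # Hand-written tuples stand in for real environment output.
    print(unpack_step(([0.0, 0.1], 1.0, False, True, {})))
    print(unpack_step(([0.0, 0.1], 1.0, False, {})))
    print(unpack_reset(([0.0, 0.1], {})))
```

Merging terminated and truncated into one flag is a simplification. Algorithms that bootstrap from the value of the next state may want to keep the two flags separate.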

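Welcome to Hands-on Reinforcement Learning (《动手学强化学习》). Starting from the basics, such as the definition of reinforcement learning, this series moves step by step from simple to advanced topics and introduces a number of current mainstream reinforcement learning algorithms. Each chapter is a Jupyter Notebook with detailed illustrated explanations and code walkthroughs.

Because GitHub renders notebooks poorly, we recommend reading on the Hands-on RL home page. The code-only notebooks are provided here for download and running.
The book is available for purchase on JD.com and Dangdang.
If you find any problems with the book or have suggestions for improvement, please submit an issue!
The reinforcement learning course that accompanies the book is now available on the Boyu learning platform, where anyone can study and join discussions for free.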

