This project explicitly shows the relationships between the various techniques of deep reinforcement learning (DRL) and is dedicated to learning and research on DRL. The area is so active that new ideas appear every day. I would like to give an explicit landscape of deep RL for two reasons: to acquire a better understanding of existing methods and theoretical results, and to seek potential developments based on these findings.
Suggestions and improvements are welcome.
- TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning 8 Mar 2018
- Distributed Prioritized Experience Replay 2 Mar 2018
- Rainbow: Combining Improvements in Deep Reinforcement Learning 6 Oct 2017
- Learning from Demonstrations for Real World Reinforcement Learning 12 Apr 2017
- Dueling Network Architecture
- Double DQN
- Prioritized Experience Replay
- Deep Q-Networks
- Expected Policy Gradients for Reinforcement Learning 10 Jan 2018
- Proximal Policy Optimization Algorithms 20 July 2017
- Emergence of Locomotion Behaviours in Rich Environments 7 July 2017
- Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning 1 Jun 2017
- Equivalence Between Policy Gradients and Soft Q-Learning
- Trust Region Policy Optimization
- Reinforcement Learning with Deep Energy-Based Policies
- Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic
- The Uncertainty Bellman Equation and Exploration 15 Sep 2017
- Noisy Networks for Exploration 30 Jun 2017 implementation
- Count-Based Exploration in Feature Space for Reinforcement Learning 25 Jun 2017
- Count-Based Exploration with Neural Density Models 14 Jun 2017
- UCB and InfoGain Exploration via Q-Ensembles 11 Jun 2017
- Minimax Regret Bounds for Reinforcement Learning 16 Mar 2017
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
- EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
- The Reactor: A Sample-Efficient Actor-Critic Architecture 15 Apr 2017
- Sample Efficient Actor-Critic with Experience Replay
- Reinforcement Learning with Unsupervised Auxiliary Tasks
- Continuous control with deep reinforcement learning
- Model-Based Stabilisation of Deep Reinforcement Learning 6 Sep 2018
- Learning model-based planning from scratch 19 July 2017
- Variational Option Discovery Algorithms 26 July 2018
- A Laplacian Framework for Option Discovery in Reinforcement Learning 16 Jun 2017
- Robust Imitation of Diverse Behaviors
- Learning human behaviors from motion capture by adversarial imitation
- Connecting Generative Adversarial Networks and Actor-Critic Methods
- Bridging the Gap Between Value and Policy Based Reinforcement Learning
- Policy gradient and Q-learning
- Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments 10 Oct 2017
- Learning with Opponent-Learning Awareness 13 Sep 2017
- Counterfactual Multi-Agent Policy Gradients
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments 7 Jun 2017
- Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games 29 Mar 2017
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures 9 Feb 2018
- Reverse Curriculum Generation for Reinforcement Learning
- Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
- Learning to Design Games: Strategic Environments in Deep Reinforcement Learning 5 July 2017
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning 7 Nov 2017
- Distral: Robust Multitask Reinforcement Learning 13 July 2017
- Observational Learning by Reinforcement Learning 20 Jun 2017
- Implicit Quantile Networks for Distributional Reinforcement Learning 14 Jun 2018
- Distributed Distributional Deterministic Policy Gradients 23 Apr 2018
- An Analysis of Categorical Distributional Reinforcement Learning 22 Feb 2018
- Distributional Reinforcement Learning with Quantile Regression 27 Oct 2017
- A Distributional Perspective on Reinforcement Learning 21 July 2017