README FILE
Author: Jianyuan (Jet) Yu
Affiliation: Wireless, ECE, Virginia Tech
Email : [email protected]
Date : April, 2018
Bibliography summary for the Deep Reinforcement Learning for Dynamic Channel Access project.
- bibliography.bib for conference or journal writing.
- Illustration Graphs for conference or journal writing.
- Equations, Algorithms & Tables for conference or journal writing.
-
- BibTex
- [luong2018applications]
    - (+) covers many other topics besides dynamic channel access, such as rate control, caching, offloading, and security; some of these could be our future work.
    - (+) covers practical details such as multi-agent settings
-
    - lists liquid state machines / echo state machines
    - (-) skips DQN
-
    - (+) achieves a 66.7% rate when coexisting with stochastic channels under the Gilbert-Elliott / 2-state Markov chain model (see the sketch below).
    - round-robin
    - (+) requires the transition probabilities p_{i,j} to be known in advance; the myopic policy is optimal when p11 >= p01 for any number of channels, and when p11 < p01 only for 2 or 3 channels.
    - (-) limited; only covers specific channel cases.
- [zhao2008myopic]
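A minimal sketch of the Gilbert-Elliott / 2-state Markov channel model and the myopic (highest-belief) access policy referenced above; the channel count and transition probabilities are assumed values, not the paper's.

```python
import numpy as np

# Gilbert-Elliott channel: each channel is a 2-state Markov chain with
# p01 = P(bad -> good) and p11 = P(good -> good). The myopic policy keeps a
# belief (probability each channel is good) and always accesses the channel
# with the highest belief. Assumed values: positively correlated (p11 >= p01).
N_CHANNELS = 4
p01 = np.full(N_CHANNELS, 0.2)
p11 = np.full(N_CHANNELS, 0.8)

rng = np.random.default_rng(0)
state = rng.random(N_CHANNELS) < 0.5      # true channel states (good = True)
belief = np.full(N_CHANNELS, 0.5)         # belief that each channel is good

successes, T = 0, 10_000
for _ in range(T):
    a = int(np.argmax(belief))            # myopic choice: highest belief
    observed_good = state[a]
    successes += observed_good
    # Belief update: observed channel becomes certain, then all beliefs
    # propagate one step through the Markov chain.
    belief[a] = 1.0 if observed_good else 0.0
    belief = belief * p11 + (1 - belief) * p01
    # Channels evolve according to their own chains.
    stay_good = np.where(state, p11, p01)
    state = rng.random(N_CHANNELS) < stay_good

print("myopic success rate:", successes / T)
```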
-
- [ahmad2009optimality]
- Liu, Keqin, and Qing Zhao. "Indexability of restless bandit problems and optimality of whittle index for dynamic multichannel access." IEEE Transactions on Information Theory 56.11 (2010): 5547-5567.
- [liu2010indexability]
- 45-page version
- Zhang, Yalin, et al. "Model free dynamic sensing order selection for imperfect sensing multichannel cognitive radio networks: A Q-learning approach." Communication Systems (ICCS), 2014 IEEE International Conference on. IEEE, 2014.
    - (+) imperfect-sensing analysis
    - (+) Q-learning (see the sketch below)
    - (-) the sense-then-transmit procedure is not true learning
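A minimal tabular Q-learning sketch of the kind this paper builds on; the state/action encoding for the sensing-order problem is paper-specific and is left abstract here.

```python
import numpy as np

# Tabular Q-learning update:
# Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
n_states, n_actions = 16, 4
alpha, gamma, epsilon = 0.1, 0.95, 0.1
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def choose_action(s):
    # epsilon-greedy exploration over the current Q estimates
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def q_update(s, a, r, s_next):
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
```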
-
-
First journal paper: IEEE Transactions on Cognitive Communications and Networking, vol. 4, no. 2, June 2018; the first paper to apply DQN to channel access.
- stochastic channel model -> stochastic-hopping channels
- dynamic environment -> automatically detect changes and re-learn
- stacked-memory DQN (see the sketch below)
- synchronization between the pair of DQNs -> emergency channel
- (-) sense first
-
USC
-
[wang2018deep]
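A minimal sketch of the stacked-memory idea as I read it: the DQN input is the last M (action, ACK) pairs stacked into one vector so a feed-forward Q-network can pick up the channel dynamics. M and the channel count below are assumptions, not the paper's values.

```python
from collections import deque
import numpy as np

# The DQN "state" is the last M (action, observation) pairs stacked into one
# vector; a feed-forward Q-network then infers the temporal channel dynamics.
N_CHANNELS, M = 4, 8   # assumed sizes

class StackedState:
    def __init__(self):
        # each entry: one-hot action (N_CHANNELS) + ACK/observation bit (1)
        self.buf = deque((np.zeros(N_CHANNELS + 1) for _ in range(M)), maxlen=M)

    def push(self, action, ack):
        entry = np.zeros(N_CHANNELS + 1)
        entry[action] = 1.0
        entry[-1] = float(ack)
        self.buf.append(entry)

    def vector(self):
        # flattened input of size M * (N_CHANNELS + 1) fed to the Q-network
        return np.concatenate(list(self.buf))

state = StackedState()
state.push(action=2, ack=True)
q_input = state.vector()
print(q_input.shape)   # (M * (N_CHANNELS + 1),)
```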
-
-
    - Lingjia Liu's work; applies the echo state network (reservoir computing), a type of RNN
    - DQN + reservoir computing vs. DQN + MLP: same performance, faster convergence (see the sketch below)
- (-) sense first
- [chang2018distributive]
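A minimal echo state network (reservoir computing) sketch of the component that replaces the MLP Q-network in this line of work: a fixed random recurrent reservoir with only a linear readout trained. Sizes, leak rate, and spectral-radius scaling are assumptions.

```python
import numpy as np

# Echo state network: input and recurrent weights are random and fixed;
# only the linear readout (here: the Q-value head) is trained.
rng = np.random.default_rng(0)
n_in, n_res, n_actions = 5, 100, 4

W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # spectral radius < 1 (echo state property)
W_out = np.zeros((n_actions, n_res))        # trainable readout

def reservoir_step(x, u, leak=0.3):
    # leaky-integrator reservoir update
    return (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)

x = np.zeros(n_res)
u = rng.random(n_in)          # current observation
x = reservoir_step(x, u)
q_values = W_out @ x          # Q-value estimate for each action
```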
-
    - Time-slot access
- [yu2017deep]
-
- MIT
    - (+) reduces the state space analytically (in a mathematical way), rather than with a neural network
- [tsiligkaridis2017accelerated]
-
    - (+) first to handle multi-agent learning; first to implement DRQN + Double DQN with an LSTM, where the authors treat the distributed observations as partial observations (see the DRQN sketch below).
    - (-) poor verification
- 30 page version
- [naparstek2017deep]
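A minimal DRQN-style sketch, assuming each agent only sees its own observation: a hand-written LSTM step carries a hidden state across time so the Q-values condition on the observation history. This is an illustration, not the authors' implementation; all sizes are assumptions.

```python
import numpy as np

# DRQN idea: replace the feed-forward Q-network with a recurrent one, so the
# hidden state summarizes each agent's (partial) observation history.
rng = np.random.default_rng(0)
n_obs, n_hidden, n_actions = 5, 32, 4

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One stacked weight matrix for the four LSTM gates (input, forget, cell, output).
W_gates = rng.normal(0, 0.1, size=(4 * n_hidden, n_obs + n_hidden))
b_gates = np.zeros(4 * n_hidden)
W_q = rng.normal(0, 0.1, size=(n_actions, n_hidden))

def drqn_step(obs, h, c):
    z = W_gates @ np.concatenate([obs, h]) + b_gates
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c + i * np.tanh(g)     # new cell state
    h = o * np.tanh(c)             # new hidden state
    return W_q @ h, h, c           # Q-values + recurrent state to carry forward

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for t in range(3):                 # unroll over a few observations
    obs = rng.random(n_obs)
    q_values, h, c = drqn_step(obs, h, c)
```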
After July 2018
-
- BibTex
- [lu2018uav]
    - (+) applies transfer learning to quickly initialize the CNN
    - (+) claims to converge within 200 steps
    - (-) lacks technical details
-
- [wang2018cell]
- BibTex
    - (+) applies DRQN to handle partial observation
- (+) transfer learning
- Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
- BibTex
Method | Author | Affiliation | Comment | Bibtex | Paper | Abbreviation | Open source |
---|---|---|---|---|---|---|---|
DQN | Mnih | Google DeepMind | - | BibTex | paper | [mnih2015human] | DQN |
Double DQN | Van Hasselt | Google DeepMind | see the target sketch below | BibTex | paper | [van2016deep] | Double DQN |
Prioritized DQN | Tom Schaul | Google DeepMind | - | BibTex | paper | [schaul2015prioritized] | Pri DQN |
Dueling DQN | Wang, Ziyu | Google DeepMind | - | BibTex | paper | [wang2015dueling] | Duel DQN |
Asynchronous DQN | Mnih | Google DeepMind | Asynchronous Advantage Actor-Critic (A3C) + RNN with continuous action space | BibTex | paper | [mnih2016asynchronous] | Asyn DQN |
Distributional DQN | Marc G. Bellemare | Google DeepMind | - | BibTex | paper | [wang2015dueling] | |
Noisy Nets DQN | Meire Fortunato | Google DeepMind | - | BibTex | paper | [wang2015dueling] | |
Rainbow DQN | Matteo Hessel | Google DeepMind | - | BibTex | paper | [hessel2017rainbow] | |
Deep Deterministic Policy Gradient (DDPG) | David Silver | Google DeepMind | - | BibTex | paper | [silver2014deterministic] | DDPG |
Distributed Proximal Policy Optimization (DPPO) | John Schulman | OpenAI | - | BibTex | paper | [schulman2017proximal] | DPPO |
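A minimal sketch of the target computation that distinguishes DQN from Double DQN in the table above; the Q-value arrays stand in for the online and target networks' outputs at the next state.

```python
import numpy as np

def dqn_target(rewards, q_target_next, gamma=0.99, done=None):
    # DQN: the target network both selects and evaluates the next action.
    done = np.zeros_like(rewards) if done is None else done
    return rewards + gamma * (1 - done) * q_target_next.max(axis=1)

def double_dqn_target(rewards, q_online_next, q_target_next, gamma=0.99, done=None):
    # Double DQN: the online network selects the action, the target network evaluates it.
    done = np.zeros_like(rewards) if done is None else done
    best_a = q_online_next.argmax(axis=1)
    evaluated = q_target_next[np.arange(len(best_a)), best_a]
    return rewards + gamma * (1 - done) * evaluated

# Tiny example batch of two transitions.
rewards = np.array([1.0, 0.0])
q_online_next = np.array([[0.2, 0.9], [0.5, 0.4]])
q_target_next = np.array([[0.3, 0.7], [0.6, 0.1]])
print(dqn_target(rewards, q_target_next))
print(double_dqn_target(rewards, q_online_next, q_target_next))
```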
- Wolpertinger architecture (similar to actor-critic); see the sketch below
- deals with large action spaces, up to ~1M actions
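A minimal Wolpertinger-style sketch under my assumptions: the actor's proto-action is matched to its k nearest discrete-action embeddings, and a critic re-ranks those candidates, keeping the per-step cost sublinear in the action count. The embeddings and critic below are random placeholders, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, emb_dim, k = 10_000, 8, 10   # scaled down from ~1M actions for the sketch
action_embeddings = rng.normal(size=(n_actions, emb_dim))

def wolpertinger_select(proto_action, critic):
    # k-nearest-neighbour lookup in the action embedding space
    d = np.linalg.norm(action_embeddings - proto_action, axis=1)
    candidates = np.argpartition(d, k)[:k]
    # critic(a) -> estimated Q-value; pick the best of the k candidates
    return max(candidates, key=critic)

proto = rng.normal(size=emb_dim)   # would come from the actor network
best = wolpertinger_select(proto, critic=lambda a: float(action_embeddings[a] @ proto))
```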
-
- DRQN
- BibTex
- [hausknecht2015deep]
-
Silver, David, and Joel Veness. "Monte-Carlo planning in large POMDPs." Advances in Neural Information Processing Systems, 2010.
- POMDP
- BibTex
- [silver2010monte]
- Morvan Zhou's GitHub repository for the DQN family
@misc{Mofan2013,
author = {Mofan Zhou},
title = {Reinforcement-learning-with-tensorflow},
year = {2016},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow}},
commit = {81fea33905c7f81719ec031eab51c68225eb7cce}
}