
Deep Reinforcement Learning Algorithms for Blockchain Scaling

This is a graduation project implementation in which we provide two optimization models, Double DQN and PPO, based on previous work. We also define the DRL environment in "./eth_optimize.py".

DRL Env

We focus on optimizing the TPS and latency of an Ethereum (Clique) network with DRL. The basic information of this RL environment is defined in eth_optimize.py, and more details are given in ethOptimize.ipynb. To use our environment, you can run the following code.

from eth_optimize import EthOptimize
env = EthOptimize()
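
A minimal interaction sketch, assuming the environment follows the standard Gym reset/step interface (the exact state and action definitions are given in eth_optimize.py and ethOptimize.ipynb):

from eth_optimize import EthOptimize

env = EthOptimize()
state = env.reset()
for _ in range(100):
    # placeholder action; a trained agent would pick this from its policy
    action = env.action_space.sample()
    state, reward, done, info = env.step(action)
    if done:
        state = env.reset()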

DRL Model Implementation

1. DQN

In myDQN.py we build a simple DQN model and validate it with env = gym.make("LunarLander-v2"), which gives a good result. Then we move on to DDQN.
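
For reference, the core of a vanilla DQN update is the TD target computed entirely by the target network. A minimal PyTorch sketch (names are illustrative, not the exact code in myDQN.py):

import torch

def dqn_target(target_net, rewards, next_states, dones, gamma=0.99):
    # rewards, dones: shape (batch,); next_states: shape (batch, state_dim)
    with torch.no_grad():
        # vanilla DQN: the target network both selects and evaluates the greedy action
        next_q = target_net(next_states).max(dim=1).values
        return rewards + gamma * (1.0 - dones) * next_q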

2. Double DQN

In DDQN.py we give the DDQN model, apply it to our environment, and observe reward convergence. Our environment provides two reward functions, one simple and one complex; both give good results. For the simple one, the EP10000 and EP20000 results are shown below.

For the complex one, the EP20000 result is shown.
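
The key change from vanilla DQN is that the online network selects the greedy action while the target network evaluates it, which reduces Q-value overestimation. A minimal PyTorch sketch of the Double DQN target (names are illustrative, not the exact code in DDQN.py):

import torch

def double_dqn_target(q_net, target_net, rewards, next_states, dones, gamma=0.99):
    with torch.no_grad():
        # online network picks the greedy action ...
        best_actions = q_net(next_states).argmax(dim=1, keepdim=True)
        # ... and the target network evaluates that action
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q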

3. PPO (Proximal Policy Optimization)

PPO.py gives a PPO model on a discrete action space. To train PPO we use trainPPO.py, which also gives a good result.
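
The heart of PPO is the clipped surrogate objective, which keeps the updated policy close to the one that collected the data. A minimal PyTorch sketch for a discrete action space (names are illustrative, not the exact code in PPO.py):

import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    # probability ratio pi_new(a|s) / pi_old(a|s)
    ratio = torch.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # take the pessimistic (clipped) objective and negate it for gradient descent
    return -torch.min(unclipped, clipped).mean()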

Compare

To compare the models, we use trainBoth.py to plot the reward curves of both in one figure.

The green curve is DDQN and the red curve is PPO. PPO clearly converges faster than DDQN, showing its better performance in our environment.
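
The comparison figure can be produced roughly like this, assuming each trainer returns a list of per-episode rewards (the actual logic lives in trainBoth.py):

import matplotlib.pyplot as plt

def plot_comparison(ddqn_rewards, ppo_rewards):
    # one figure, one curve per algorithm, matching the colors described above
    plt.plot(ddqn_rewards, color="green", label="DDQN")
    plt.plot(ppo_rewards, color="red", label="PPO")
    plt.xlabel("Episode")
    plt.ylabel("Reward")
    plt.legend()
    plt.savefig("reward_comparison.png")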

Validate

1. reward figure

We introduce a system baseline (without any DRL algorithm) and compare the rewards of the three plots. For the complex reward function, the figure is shown below.

This simply shows that our models have a positive influence on the system.

2. evaluation

To validate that our models do show an optimization trend, we use the simple reward function as the evaluation criterion. Here are the results.

loose constraint
tight constraint

Further evaluation-criterion analysis is done in eval_criteria_analyse.py. In this part, we assume that all runs have finished and produced a CSV file storing the data; we then use these data to plot and analyze.
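
A sketch of that analysis step, assuming a CSV with one reward column per model (the file name and column names below are placeholders, not the actual schema used by eval_criteria_analyse.py):

import pandas as pd
import matplotlib.pyplot as plt

# hypothetical schema: per-episode rewards in columns "baseline", "ddqn", "ppo"
df = pd.read_csv("eval_rewards.csv")
for name in ["baseline", "ddqn", "ppo"]:
    plt.plot(df[name], label=name)
plt.xlabel("Episode")
plt.ylabel("Simple reward")
plt.legend()
plt.show()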

What can be done?

  • Smooth the curve with a convolution (moving-average) function. For example, we first take the window size as 500, and the result is below (see the sketch after this list).
  • Test the convergence value of each model and calculate the optimization percentage. We show the convergence level in the picture below: PPO gives a 27.76% improvement and DDQN a 16.84% improvement.
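
A sketch of the smoothing and improvement calculation above, using a moving average with window size 500; estimating the convergence level as the mean over the last window is an assumption about how the percentages were computed:

import numpy as np

def smooth(rewards, window=500):
    # moving-average smoothing via convolution
    kernel = np.ones(window) / window
    return np.convolve(rewards, kernel, mode="valid")

def improvement_over_baseline(model_rewards, baseline_rewards, window=500):
    # estimate each curve's convergence level as the mean of its last `window` episodes
    model_level = np.mean(model_rewards[-window:])
    baseline_level = np.mean(baseline_rewards[-window:])
    return 100.0 * (model_level - baseline_level) / abs(baseline_level)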

The project is still in progress. To be continued.

@copyright: Ryan Bq zhao
@author: ryanhigh
contact me: [email protected]
