Speed up tutorials
lupoglaz committed Nov 24, 2020
1 parent 88d8ff0 commit 570ab4f
Showing 11 changed files with 179 additions and 38 deletions.
6 changes: 0 additions & 6 deletions GodotModule/cSharedMemory.h
@@ -32,12 +32,6 @@ typedef allocator<int, managed_shared_memory::segment_manager> ShmemAllocator;
typedef std::vector<int, ShmemAllocator> IntVector;
typedef std::vector<float, ShmemAllocator> FloatVector;


struct TensorDescription{
std::string type; //Tensor scalar type
TensorDescription(const std::string &t_type):type(t_type){}
};

class cSharedMemory : public Reference {
GDCLASS(cSharedMemory, Reference);

5 changes: 3 additions & 2 deletions Tutorials/InvPendulumTut/InvPendulum/Environment.gd
@@ -7,7 +7,6 @@ var sem_action
var sem_observation
var sem_reset
var mem
#onready var policy_data = preload("res://ddpg_policy.jit")
onready var policy_data = load("res://ddpg_policy.tres")
var policy
var policy_action
@@ -34,6 +33,7 @@ func _ready():
sem_observation.init("sem_observation")
print("Running as OpenAIGym environment")
else:
#pass
policy = cTorchModel.new()
policy.set_data(policy_data)

@@ -88,7 +88,7 @@ func _physics_process(delta):
env_action[1] = 1
if policy_action != null:
agent_action = policy_action
agent_action[0]*=8.0
agent_action[0]*=8.0

$ActionLabel.text = "Action: "+str(agent_action)

@@ -116,6 +116,7 @@ func _on_Timer_timeout():
mem.sendIntArray("done", [is_done()])
sem_observation.post()
else:
#pass
policy_action = policy.run(observation)

time_elapsed += deltat
Binary file modified docs/Fig/ExportTemplates.png
Binary file added docs/Fig/PhysicsSettings.png
Binary file added docs/Fig/ServerTemplates.png
Binary file added docs/Fig/SpeedUpPic.png
Binary file added docs/Fig/VsyncSettings.png
12 changes: 9 additions & 3 deletions docs/index.html
@@ -36,7 +36,7 @@
<a class="nav-link" href="API.html">API<span class="sr-only"></span></a>
</li>
<li class="nav-item active">
<a class="nav-link" href="tutorial.html">Tutorial<span class="sr-only"></span></a>
<a class="nav-link" href="tutorial_basic.html">Tutorial<span class="sr-only"></span></a>
</li>
</ul>
</div>
@@ -61,12 +61,18 @@ <h2>API</h2>
<img class="rounded-circle" src="Fig/Process.png" alt="Generic placeholder image" width="140" height="140">
<h2>Basic Tutorial</h2>
<p>This tutorial guides through the basics of setting up an environment. It also discusses the way godot and python processes communicate.</p>
<p><a class="btn btn-secondary" href="tutorial.html" role="button">View &raquo;</a></p>
<p><a class="btn btn-secondary" href="tutorial_basic.html" role="button">View &raquo;</a></p>
</div><!-- /.col-lg-4 -->
<div class="col-lg-4">
<img class="rounded-circle" src="Fig/SpeedUpPic.png" alt="Generic placeholder image" width="140" height="140">
<h2>Speedup Tutorial</h2>
<p>Guides through the process of running the godot process faster than real time.</p>
<p><a class="btn btn-secondary" href="tutorial_speedup.html" role="button">View &raquo;</a></p>
</div><!-- /.col-lg-4 -->
<div class="col-lg-4">
<img class="rounded-circle" src="" alt="Generic placeholder image" width="140" height="140">
<h2>Training Tutorial</h2>
<p>Guides through the process of training your model in pytorch. <br> Work in progress...</p>
<p>Training your model using DDPG. <br> Work in progress...</p>
<p><a class="btn btn-secondary disabled" href="#" role="button">View &raquo;</a></p>
</div><!-- /.col-lg-4 -->
<div class="col-lg-4">
41 changes: 17 additions & 24 deletions docs/tutorial.html → docs/tutorial_basic.html
@@ -53,21 +53,21 @@ <h1>Tutorial</h1>
<div class="col">
<h2>Introduction</h2>
When training reinforcement learning agents, the agent interacts with the environment by sending actions and receiving observations.
The agents are trained in the python script and the environment is implemented using Godot. <br>
The agents are trained in a python script and the environment is implemented using Godot. <br>
In python the environment is wrapped into a class, that is usually similar to OpenAI Gym environment class (Code 1). We need to implement
the functions: <b>init</b>, <b>step</b>, <b>reset</b> and <b>close</b> to get fully functional environment. <br><br>
The <b>init</b> function launches subprocesses associated with your environment. In our case it launches Godot
project as a subprocess.<br>
The <b>step</b> function takes an action tensor that was sampled from your model and passes it to the environment. It then computes
the next state and returns the observation of this state along with the reward. The variable <b>done</b> is <b>true</b> if the new state is the
final one, <b>true</b> otherwise.<br>
The <b>reset</b> function returns the environment to the initial state and return the observation of this state.<br>
The <b>close</b> function closes any subprocesses associated with your environment.<br>
final one, <b>false</b> otherwise.<br>
The <b>reset</b> function returns the environment to the initial state and returns the observation of this state.<br>
The <b>close</b> function closes all subprocesses associated with your environment.<br>
We will ignore <b>seed</b> and <b>render</b> functions for now, because we use random number generator in Godot
and rendering is done by default.<br>
</div>
<div class="col">
<h5>Code 1: Dummy implementation of OpenAI Gym class.</h5>
<h5>Code 1: Dummy implementation of an OpenAI Gym class.</h5>
<pre class="pre"><code class="python">
class DummyEnv(gym.Env):
def __init__(self):
@@ -135,8 +135,8 @@ <h2>Synchronization in Python</h2>
When the <b>step</b> function is called the action tensor is sent into the shared memory and <b>sem_act</b> semaphore
turns green. It signals to godot to start processing the action of an agent. Meanwhile the semaphore <b>sem_obs</b> turns
red, blocking python process while waiting for godot to send the observation.<br>
In Godot we have to pay additional attention to server process to make sure physics server is synchronized with the
semaphores.
In Godot we have to put the synchronization procedures in <b>_physics_process</b> if we have a physics-based environment. In other cases
we can use the standard <b>_process</b> function.
</div>
<div class="col">
<h5>Code 3: Introducing semaphores.</h5>
@@ -192,12 +192,7 @@ <h2>Synchronization in Godot</h2>
The engine waits for the actions in the function <b>_physics_process</b> and starts <b>Timer</b> count down. <br>
When the <b>Timer</b> returns the signal <b>_on_Timer_timeout</b> we send the observation back to python and release the
semaphore <b>sem_obs</b>. Also, we signal to the engine, that we wait for the next action by setting <b>timeout</b> variable to true.
<br><br>
We generally want the training to be as fast as possible, so we need to make the engine process faster than real time. It is done
by setting the <b>iterations_per_second</b> property of the engine to the FPS and increasing the timescale to some fraction of
the <b>iterations_per_second</b>. We use only fraction of the FPS, because we want the engine to
perform with reasonably low delta, such that
there are no obvious bugs in the physics. I can not guarantee that this way of increasing the processing speed is valid though.
<br>
</div>
<div class="col">
<h5>Code 4: Environment node script</h5>
@@ -214,10 +209,7 @@ <h5>Code 4: Environment node script</h5>
set_physics_process(true)

func _physics_process(delta):
if timeout:
Engine.iterations_per_second = max(60, Engine.get_frames_per_second())
Engine.time_scale = max(1.0, Engine.iterations_per_second/100.0)

if timeout:
if mem.exists():
sem_action.wait()
agent_action = mem.getIntArray("agent_action")
@@ -244,11 +236,11 @@ <h2>Resetting the environment</h2>
When the episode of the simulation ends we want to reset the environment to the initial position without relaunching
the engine process. To accomplish this feat we use additional action, that is 1 if the environment should be reset and
0 otherwise. However, the tricky part is resetting the positions and velocities of the objects in Godot.<br>
In our case we have one physics object: <b>RigidBody2D</b>. The positions and velicities of the <b>RigidBody2D</b> node can be changed
In our case we have one physics object: <b>RigidBody2D</b>. The positions and velocities of the <b>RigidBody2D</b> node can be changed
only in the <b>_integrate_forces</b> function of this node. Therefore we introduce the variable <b>reset</b>, which is true when we
want to reinitialize this node and false otherwise. Additionally we create variables to store initial positions and velocities of this node.<br>
The Code 5 shows the gist of this function. First we compute absolute initial <b>Transform2D</b> of the node and change the state accordingly.<br>
Unfortunatelly, we have to compute the transforms of <b>Anchor</b> and <b>PinJoint2D</b> tranforms in the <b>Environment</b> script (Code 6).
Unfortunately, we have to compute the transforms of <b>Anchor</b> and <b>PinJoint2D</b> in the <b>Environment</b> script (Code 6).
Probably, one can avoid this cumbersome procedure by reorganizing the tree or accessing the parent nodes from the <b>RigidBody2D</b> itself.
</div>
<div class="col">
@@ -303,9 +295,9 @@ <h5>Code 7: Launching the exported environment</h5>
class DummyEnvLaunch(gym.Env):
def __init__(self, exec_path, env_path):
with open("stdout.txt","wb") as out, open("stderr.txt","wb") as err:
self.process = subprocess.Popen([exec_path,
"--path", os.path.abspath(env_path),
"--handle", "environment"], stdout=out, stderr=err)
self.process = subprocess.Popen([exec_path, "--path",
os.path.abspath(env_path),
"--handle", self.handle], stdout=out, stderr=err)

atexit.register(self.close)

@@ -319,9 +311,10 @@ <h5>Code 7: Launching the exported environment</h5>
</div>
<div class="col">
<h2>Conclusion</h2>
In this tutorial we showed the main steps to make your own environment in Godot Engine and use it from python script.
In this tutorial we showed the first steps towards making your own environment in Godot Engine and using it from a python script.
Right now the godot environment runs in real time. In the next tutorial we will show how to speed it up.
The fully functional code for this tutorial can be found here: <br><br>
<a href="https://github.com/lupoglaz/GodotGymAI/tree/master/Environments">Environments</a><br><br>
<a href="https://github.com/lupoglaz/GodotGymAI/tree/master/Tutorials/InvPendulumTut">Tutorials</a><br><br>
The Godot project is located in the directory <b>InvPendulum</b> and the python Gym class is in the file <b>InvPendulum.py</b>.

<br>
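The semaphore handshake described in this tutorial (the action semaphore released by python, the observation semaphore released by the engine) can be illustrated with a minimal, self-contained sketch. It is not the project's implementation: `threading.Semaphore` objects and a plain dictionary stand in for the named semaphores and the shared memory, and `engine_loop`, `step` and `run_episode` are hypothetical names chosen for this example. The "physics" is a toy accumulator.

```python
import threading

# Stand-ins (assumptions) for the named semaphores and the shared memory block.
sem_act = threading.Semaphore(0)  # released by python when an action is ready
sem_obs = threading.Semaphore(0)  # released by the engine when an observation is ready
shared = {"action": None, "observation": None, "done": False}


def engine_loop(num_steps):
    """Plays the role of the engine: wait for an action, answer with an observation."""
    state = 0.0
    for i in range(num_steps):
        sem_act.acquire()              # block until python has posted an action
        state += shared["action"]      # toy "physics": accumulate the action
        shared["observation"] = state
        shared["done"] = (i == num_steps - 1)
        sem_obs.release()              # allow python to read the observation


def step(action):
    """Plays the role of the python step function."""
    shared["action"] = action
    sem_act.release()                  # signal the engine to process the action
    sem_obs.acquire()                  # block until the observation is written
    return shared["observation"], shared["done"]


def run_episode(actions):
    """Run the toy engine in a thread and feed it the given actions one by one."""
    engine = threading.Thread(target=engine_loop, args=(len(actions),))
    engine.start()
    results = [step(a) for a in actions]
    engine.join()
    return results
```

Each call to `step` blocks until the engine side has written its observation, so the two sides never read and write the shared data at the same time.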
145 changes: 145 additions & 0 deletions docs/tutorial_speedup.html
@@ -0,0 +1,145 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta name="description" content="">
<meta name="author" content="">
<link rel="icon" href="../../../../favicon.ico">

<title>Godot AI Gym</title>

<!-- Bootstrap core CSS -->
<link href="https://getbootstrap.com/docs/4.1/dist/css/bootstrap.min.css" rel="stylesheet">

<!-- Custom styles for this template -->
<link href="css/tmp.css" rel="stylesheet">

<script src="js/highlight.pack.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
</head>

<body>

<nav class="navbar navbar-expand-md navbar-dark bg-dark fixed-top">
<a class="navbar-brand" href="https://github.com/lupoglaz/GodotGymAI">GitHub</a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarsExampleDefault" aria-controls="navbarsExampleDefault" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>

<div class="collapse navbar-collapse" id="navbarsExampleDefault">
<ul class="navbar-nav mr-auto">
<li class="nav-item active">
<a class="nav-link" href="index.html">Home<span class="sr-only"></span></a>
</li>
<li class="nav-item active">
<a class="nav-link" href="API.html">API<span class="sr-only"></span></a>
</li>
<li class="nav-item">
<a class="nav-link" href="tutorial_basic.html">Tutorial<span class="sr-only">(current)</span></a>
</li>
</ul>
</div>
</nav>

<main role="main" class="container-fluid">
<div class="starter-template">
<h1>Tutorial</h1>
</div>
<div class="container-fluid">
<div class="row">
<div class="col">
<h5>Figure 1: Project settings.</h5>
<img src="Fig/VsyncSettings.png" class="rounded mx-auto d-block float-center" alt="Training process" width=80%>
<img src="Fig/PhysicsSettings.png" class="rounded mx-auto d-block float-center" alt="Training process" width=80%>
</div>
<div class="col">
<h2>Internal clock</h2>
Suppose you want to simulate physics in the godot engine with the time step <b>target_delta</b>. Normally it should be
smaller than <b>deltat</b>, which we used in the previous tutorial. In this case we perform <b>deltat</b>/<b>target_delta</b>
physics iterations between getting an action and returning the observation. The engine usually assumes that <b>target_delta</b> and
<b>deltat</b> correspond to real time. However, we want the simulation to run as fast as possible, which we can control
using <b>Engine.set_time_scale</b>. A time scale of 1 corresponds to the real clock, but we want each frame to perform one physics
step with the time step <b>target_delta</b>. In this case we have to set the time scale to FPS*<b>target_delta</b>.<br><br>
Measuring FPS is the tricky part: while we are waiting for the next action, the system clock does not stop, so the engine
counts the time spent in the python script as part of the frame. <br><br>
Code 1 shows how to mitigate this problem. First we measure the time spent waiting for the <b>sem_action</b> semaphore.
We also measure FPS manually in the <b>_process</b> function and remove the semaphore time from our FPS estimate. Then
we set the engine iterations per second to our estimate and adjust the time scale correspondingly. After this we have to zero
<b>sem_delta</b>, so that it does not affect the frames where there is no communication between godot and python.<br><br>
We designed two unit tests to check that the resulting time scaling does not affect the result of the physics simulation.
One can find them here: <a href="https://github.com/lupoglaz/GodotGymAI/tree/master/UnitTests">Tests</a>. <br><br>
Finally, you need to turn Vsync off and set the physics timestep to fixed (Figure 1).

</div>
<div class="col">
<h5>Code 1: Example of time scaling.</h5>
<pre class="pre"><code class="python">
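var deltat = 0.05         # time between an action and its observation (s)
var prev_time = 0.0       # timestamp of the previous frame (usec)
var sem_delta = 0.0       # time spent waiting for python in this frame (usec)
var target_delta = 0.025  # desired physics time step (s)

func _physics_process(delta):
    if timeout:
        if mem.exists():
            # Measure how long we block while python prepares the action
            var time_start = OS.get_ticks_usec()
            sem_action.wait()
            var time_end = OS.get_ticks_usec()
            sem_delta = time_end - time_start
            ...

func _process(delta):
    if mem.exists():
        var cur_time = OS.get_ticks_usec()
        # FPS estimate that excludes the time spent waiting on the semaphore
        var fps_est = 1000000.0/(cur_time - prev_time - sem_delta)
        Engine.set_iterations_per_second(fps_est)
        Engine.set_time_scale(Engine.get_iterations_per_second()*target_delta)
        sem_delta = 0.0
        prev_time = cur_time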
</code></pre>
</div>
</div>

<div class="row">
<div class="col">
<h5>Figure 2: Server export templates.</h5>
<img src="Fig/ServerTemplates.png" class="rounded mx-auto d-block float-center" alt="Training process" width=80%>
</div>
<div class="col">
<h2>Server export</h2>
Unless one is debugging the environment, there is usually no need for rendering during training. Using the server export templates can
speed up the execution of the environment. When the installation script runs, it also compiles these templates, so you only need
to export the project using them.<br><br><br><br>
</div>
<div class="col">
</div>
</div>

<div class="row">
<div class="col">
</div>
<div class="col">
<h2>Parallelization in Python</h2>
For some algorithms, like PPO, we can use one policy to collect multiple samples from the environment. In this case
it is useful to collect them in parallel. There are two options: implementing parallelism in the engine by spawning
several copies of the environment scene, or launching several godot processes from python. However, we have not yet
tested either option: you might run into naming conflicts between semaphore or shared memory handles, in particular when
multiple processes use the same shared memory and semaphores.<br><br><br><br>
</div>
<div class="col">
</div>
</div>

</div><!-- /.container -->
</main>

<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
<script>window.jQuery || document.write('<script src="https://getbootstrap.com/docs/4.1/assets/js/vendor/jquery-slim.min.js"><\/script>')</script>
<script src="https://getbootstrap.com/docs/4.1/assets/js/vendor/popper.min.js"></script>
<script src="https://getbootstrap.com/docs/4.1/dist/js/bootstrap.min.js"></script>
</body>
</html>
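The arithmetic behind the time scaling in the speedup tutorial can be checked outside the engine with a small sketch. It is written in python rather than GDScript and only mirrors the formulas from the text: the time scale is FPS*target_delta, the FPS estimate excludes the semaphore wait, and deltat/target_delta physics iterations are run per action. The function names and the zero-duration guard are assumptions added for this example. Unlike Code 1, the sketch uses the floating-point FPS estimate directly instead of reading the value back from the engine.

```python
def estimate_time_scale(cur_time_usec, prev_time_usec, sem_delta_usec, target_delta):
    """Return (fps_estimate, time_scale) for one frame.

    The time spent waiting on the action semaphore is subtracted from the
    frame duration, so the wait in python does not lower the estimate.
    """
    frame_usec = cur_time_usec - prev_time_usec - sem_delta_usec
    if frame_usec <= 0:
        # Guard added for this sketch; Code 1 in the tutorial has no such check.
        raise ValueError("frame duration must be positive")
    fps_est = 1_000_000.0 / frame_usec
    return fps_est, fps_est * target_delta


def physics_steps_per_action(deltat, target_delta):
    """Number of physics iterations between an action and its observation."""
    return round(deltat / target_delta)


# A frame that took 12 ms of wall clock time, 2 ms of which were spent waiting:
fps, scale = estimate_time_scale(12_000, 0, 2_000, target_delta=0.025)
steps = physics_steps_per_action(deltat=0.05, target_delta=0.025)
```

With these numbers the frame itself took 10 ms, so the estimate is 100 frames per second and the time scale is 2.5; two physics iterations are run per action.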