Speed up tutorials

Alex2782 · Nov 24, 2020 · 570ab4f
1 parent 88d8ff0

Showing 11 changed files with 179 additions and 38 deletions.
diff --git a/GodotModule/cSharedMemory.h b/GodotModule/cSharedMemory.h
@@ -32,12 +32,6 @@ typedef allocator<int, managed_shared_memory::segment_manager>  ShmemAllocator;
 typedef std::vector<int, ShmemAllocator> IntVector;
 typedef std::vector<float, ShmemAllocator> FloatVector;
 
-
-struct TensorDescription{
-    std::string type;       //Tensor scalar type
-    TensorDescription(const std::string &t_type):type(t_type){}
-};
-
 class cSharedMemory : public Reference {
     GDCLASS(cSharedMemory, Reference);
 

diff --git a/Tutorials/InvPendulumTut/InvPendulum/Environment.gd b/Tutorials/InvPendulumTut/InvPendulum/Environment.gd
@@ -7,7 +7,6 @@ var sem_action
 var sem_observation
 var sem_reset
 var mem
-#onready var policy_data = preload("res://ddpg_policy.jit")
 onready var policy_data = load("res://ddpg_policy.tres")
 var policy
 var policy_action
@@ -34,6 +33,7 @@ func _ready():
 		sem_observation.init("sem_observation")
 		print("Running as OpenAIGym environment")
 	else:
+		#pass
 		policy = cTorchModel.new()
 		policy.set_data(policy_data)
 
@@ -88,7 +88,7 @@ func _physics_process(delta):
 				env_action[1] = 1
 			if policy_action != null:
 				agent_action = policy_action
-				agent_action[0]*=8.0
+			agent_action[0]*=8.0
 
 		$ActionLabel.text = "Action: "+str(agent_action)
 
@@ -116,6 +116,7 @@ func _on_Timer_timeout():
 		mem.sendIntArray("done", [is_done()])
 		sem_observation.post()
 	else:
+		#pass
 		policy_action = policy.run(observation)
 
 	time_elapsed += deltat

diff --git a/docs/Fig/ExportTemplates.png b/docs/Fig/ExportTemplates.png
diff --git a/docs/Fig/PhysicsSettings.png b/docs/Fig/PhysicsSettings.png
diff --git a/docs/Fig/ServerTemplates.png b/docs/Fig/ServerTemplates.png
diff --git a/docs/Fig/SpeedUpPic.png b/docs/Fig/SpeedUpPic.png
diff --git a/docs/Fig/VsyncSettings.png b/docs/Fig/VsyncSettings.png
diff --git a/docs/index.html b/docs/index.html
@@ -36,7 +36,7 @@
             <a class="nav-link" href="API.html">API<span class="sr-only"></span></a>
             </li>
             <li class="nav-item active">
-            <a class="nav-link" href="tutorial.html">Tutorial<span class="sr-only"></span></a>
+            <a class="nav-link" href="tutorial_basic.html">Tutorial<span class="sr-only"></span></a>
             </li>
         </ul>
         </div>
@@ -61,12 +61,18 @@ <h2>API</h2>
           <img class="rounded-circle" src="Fig/Process.png" alt="Generic placeholder image" width="140" height="140">
           <h2>Basic Tutorial</h2>
           <p>This tutorial guides through the basics of setting up an environment. It also discusses the way godot and python processes communicate.</p>
-          <p><a class="btn btn-secondary" href="tutorial.html" role="button">View &raquo;</a></p>
+          <p><a class="btn btn-secondary" href="tutorial_basic.html" role="button">View &raquo;</a></p>
+        </div><!-- /.col-lg-4 -->
+        <div class="col-lg-4">
+          <img class="rounded-circle" src="Fig/SpeedUpPic.png" alt="Generic placeholder image" width="140" height="140">
+          <h2>Speedup Tutorial</h2>
+          <p>Guides you through running the Godot process faster than real time.</p>
+          <p><a class="btn btn-secondary" href="tutorial_speedup.html" role="button">View &raquo;</a></p>
         </div><!-- /.col-lg-4 -->
         <div class="col-lg-4">
           <img class="rounded-circle" src="data:image/gif;base64,R0lGODlhAQABAIAAAHd3dwAAACH5BAAAAAAALAAAAAABAAEAAAICRAEAOw==" alt="Generic placeholder image" width="140" height="140">
           <h2>Training Tutorial</h2>
-          <p>Guides through the process of training your model in pytorch. <br> Work in progress...</p>
+          <p>Training your model using DDPG. <br> Work in progress...</p>
           <p><a class="btn btn-secondary disabled" href="#" role="button">View &raquo;</a></p>
         </div><!-- /.col-lg-4 -->
         <div class="col-lg-4">

diff --git a/docs/tutorial.html → docs/tutorial_basic.html b/docs/tutorial.html → docs/tutorial_basic.html
@@ -53,21 +53,21 @@ <h1>Tutorial</h1>
             <div class="col">
                 <h2>Introduction</h2>
                 When training reinforcement learning agents, the agent interacts with the environment by sending actions and receiving observations.
-                The agents are trained in the python script and the environment is implemented using Godot. <br>
+                The agents are trained in a python script and the environment is implemented using Godot. <br>
                 In python the environment is wrapped into a class, that is usually similar to OpenAI Gym environment class (Code 1). We need to implement 
                 the functions: <b>init</b>, <b>step</b>, <b>reset</b> and <b>close</b> to get fully functional environment. <br><br>
                 The <b>init</b> function launches subprocesses associated with your environment. In our case it launches Godot 
                 project as a subprocess.<br>
                 The <b>step</b> function takes an action tensor that was sampled from your model and passes it to the environment. It then computes 
                 the next state and returns the observation of this state along with the reward. The variable <b>done</b> is <b>true</b> if the new state is the 
-                final one, <b>true</b> otherwise.<br>
-                The <b>reset</b> function returns the environment to the initial state and return the observation of this state.<br>
-                The <b>close</b> function closes any subprocesses associated with your environment.<br>
+                final one, <b>false</b> otherwise.<br>
+                The <b>reset</b> function returns the environment to the initial state and returns the observation of this state.<br>
+                The <b>close</b> function closes all subprocesses associated with your environment.<br>
                 We will ignore <b>seed</b> and <b>render</b> functions for now, because we use random number generator in Godot 
                 and rendering is done by default.<br>
             </div>
             <div class="col">
-                <h5>Code 1: Dummy implementation of OpenAI Gym class.</h5>
+                <h5>Code 1: Dummy implementation of an OpenAI Gym class.</h5>
                 <pre class="pre"><code class="python">
 class DummyEnv(gym.Env):
     def __init__(self):
@@ -135,8 +135,8 @@ <h2>Synchronization in Python</h2>
                 When the <b>step</b> function is called the action tensor is sent into the shared memory and <b>sem_act</b> semaphore
                 turns green. It signals to godot to start processing the action of an agent. Meanwhile the semaphore <b>sem_obs</b> turns
                 red, blocking python process while waiting for godot to send the observation.<br>
-                In Godot we have to pay additional attention to server process to make sure physics server is synchronized with the 
-                semaphores.
+                In Godot we have to put the synchronization procedures in <b>_physics_process</b> if we have a physics-based environment. In other cases
+                we can use the standard <b>_process</b> function.
             </div>
             <div class="col">
                 <h5>Code 3: Introducing semaphores.</h5>
@@ -192,12 +192,7 @@ <h2>Synchronization in Godot</h2>
                     The engine waits for the actions in the function <b>_physics_process</b> and starts <b>Timer</b> count down. <br>
                     When the <b>Timer</b> returns the signal <b>_on_Timer_timeout</b> we send the observation back to python and release the 
                     semaphore <b>sem_obs</b>. Also, we signal to the engine, that we wait for the next action by setting <b>timeout</b> variable to true.
-                    <br><br>
-                    We generally want the training to be as fast as possible, so we need to make the engine process faster than real time. It is done 
-                    by setting the <b>iterations_per_second</b> property of the engine to the FPS and increasing the timescale to some fraction of 
-                    the <b>iterations_per_second</b>. We use only fraction of the FPS, because we want the engine to 
-                    perform with reasonably low delta, such that 
-                    there are no obvious bugs in the physics. I can not guarantee that this way of increasing the processing speed is valid though.
+                    <br>
                 </div>
                 <div class="col">
                     <h5>Code 4: Environment node script</h5>
@@ -214,10 +209,7 @@ <h5>Code 4: Environment node script</h5>
     set_physics_process(true)
 
 func _physics_process(delta):
-    if timeout:
-        Engine.iterations_per_second = max(60, Engine.get_frames_per_second())
-        Engine.time_scale = max(1.0, Engine.iterations_per_second/100.0)
-
+    if timeout:
         if mem.exists():
             sem_action.wait()
             agent_action = mem.getIntArray("agent_action")
@@ -244,11 +236,11 @@ <h2>Resetting the environment</h2>
                 When the episode of the simulation ends we want to reset the environment to the initial position without relaunching 
                 the engine process. To accomplish this feat we use additional action, that is 1 if the environment should be reset and 
                 0 otherwise. However, the tricky part is resetting the positions and velocities of the objects in Godot.<br>
-                In our case we have one physics object: <b>RigidBody2D</b>. The positions and velicities of the <b>RigidBody2D</b> node can be changed
+                In our case we have one physics object: <b>RigidBody2D</b>. The positions and velocities of the <b>RigidBody2D</b> node can be changed
                 only in the <b>_integrate_forces</b> function of this node. Therefore we introduce the variable <b>reset</b>, which is true when we 
                 want to reinitialize this node and false otherwise. Additionally we create variables to store initial positions and velocities of this node.<br>
                 The Code 5 shows the gist of this function. First we compute absolute initial <b>Transform2D</b> of the node and change the state accordingly.<br>
-                Unfortunatelly, we have to compute the transforms of <b>Anchor</b> and <b>PinJoint2D</b> tranforms in the <b>Environment</b> script (Code 6).
+                Unfortunately, we have to compute the transforms of <b>Anchor</b> and <b>PinJoint2D</b> in the <b>Environment</b> script (Code 6).
                 Probably, one can avoid this cumbersome procedure by reorganizing the tree or accessing the parent nodes from the <b>RigidBody2D</b> itself.
             </div>
             <div class="col">
@@ -303,9 +295,9 @@ <h5>Code 7: Launching the exported environment</h5>
 class DummyEnvLaunch(gym.Env):
     def __init__(self, exec_path, env_path):
         with open("stdout.txt","wb") as out, open("stderr.txt","wb") as err:
-            self.process = subprocess.Popen([exec_path, 
-            "--path", os.path.abspath(env_path),
-            "--handle", "environment"], stdout=out, stderr=err)
+            self.process = subprocess.Popen([exec_path, "--path", 
+            os.path.abspath(env_path), 
+            "--handle", self.handle], stdout=out, stderr=err)
 
         atexit.register(self.close)
 
@@ -319,9 +311,10 @@ <h5>Code 7: Launching the exported environment</h5>
             </div>
             <div class="col">
                 <h2>Conclusion</h2>
-                In this tutorial we showed the main steps to make your own environment in Godot Engine and use it from python script. 
+                In this tutorial we showed the first steps toward making your own environment in Godot Engine and using it from a python script.
+                Right now the Godot environment runs in real time. In the next tutorial we will show how to speed it up.
                 The fully functional code for this tutorial can be found here: <br><br>
-                <a href="https://github.com/lupoglaz/GodotGymAI/tree/master/Environments">Environments</a><br><br>
+                <a href="https://github.com/lupoglaz/GodotGymAI/tree/master/Tutorials/InvPendulumTut">Tutorials</a><br><br>
                 The Godot project is located in the directory <b>InvPendulum</b> and the python Gym class is in the file <b>InvPendulum.py</b>.
 
                 <br>
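Code 7 above now passes <b>self.handle</b> to Godot instead of the fixed string "environment", but the line that sets it lies outside the shown context. The following is a minimal sketch of how such a command line might be assembled with a per-instance handle. The helpers `make_handle` and `build_godot_command` are hypothetical, and the uuid-based naming scheme is an assumption rather than something this commit defines. Whether the Godot side derives its semaphore names from the handle is also not shown here.

```python
import os
import uuid


def make_handle(prefix="environment"):
    """Return a handle name that is unlikely to collide between instances.

    The "<prefix>_<8 hex chars>" scheme is an assumption made for this sketch.
    """
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def build_godot_command(exec_path, env_path, handle):
    """Assemble the argument list in the same order as Code 7 passes it to Popen."""
    return [exec_path, "--path", os.path.abspath(env_path), "--handle", handle]


if __name__ == "__main__":
    # Hypothetical paths; replace them with your exported binary and project directory.
    for _ in range(2):
        print(build_godot_command("./godot", "./InvPendulum", make_handle()))
```

Keeping the command construction in a plain function makes it possible to check the arguments without launching the engine.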

diff --git a/docs/tutorial_speedup.html b/docs/tutorial_speedup.html
@@ -0,0 +1,145 @@
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../../../favicon.ico">
+
+    <title>Godot AI Gym</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="https://getbootstrap.com/docs/4.1/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="css/tmp.css" rel="stylesheet">
+
+    <script src="js/highlight.pack.js"></script>
+	<script>hljs.initHighlightingOnLoad();</script>
+  </head>
+
+  <body>
+
+    <nav class="navbar navbar-expand-md navbar-dark bg-dark fixed-top">
+      <a class="navbar-brand" href="https://github.com/lupoglaz/GodotGymAI">GitHub</a>
+      <button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarsExampleDefault" aria-controls="navbarsExampleDefault" aria-expanded="false" aria-label="Toggle navigation">
+        <span class="navbar-toggler-icon"></span>
+      </button>
+
+      <div class="collapse navbar-collapse" id="navbarsExampleDefault">
+        <ul class="navbar-nav mr-auto">
+            <li class="nav-item active">
+                <a class="nav-link" href="index.html">Home<span class="sr-only"></span></a>
+            </li>
+            <li class="nav-item active">
+            <a class="nav-link" href="API.html">API<span class="sr-only"></span></a>
+          </li>
+          <li class="nav-item">
+            <a class="nav-link" href="tutorial_basic.html">Tutorial<span class="sr-only">(current)</span></a>
+          </li>
+        </ul>
+      </div>
+    </nav>
+
+<main role="main" class="container-fluid">
+    <div class="starter-template">
+        <h1>Tutorial</h1>
+    </div>
+    <div class="container-fluid">
+        <div class="row">
+            <div class="col">
+                <h5>Figure 1: Project settings.</h5>
+                <img src="Fig/VsyncSettings.png" class="rounded mx-auto d-block float-center" alt="Training process" width=80%>
+                <img src="Fig/PhysicsSettings.png" class="rounded mx-auto d-block float-center" alt="Training process" width=80%>
+            </div>
+            <div class="col">
+                <h2>Internal clock</h2>
+                Imagine that you want to simulate physics in the Godot engine with the timestep <b>target_delta</b>. Normally it should be
+                smaller than <b>deltat</b>, which we used in the previous tutorial. In this case we perform <b>deltat</b>/<b>target_delta</b>
+                physics iterations between getting an action and returning the observation. The engine usually assumes that <b>target_delta</b> and
+                <b>deltat</b> correspond to real time. However, we want to run at the fastest time scale possible, which can be set
+                using <b>Engine.set_time_scale</b>. The time scale corresponding to the real clock is 1, but we want each frame to perform a physics
+                step with the <b>target_delta</b> time step. In this case we have to set the time scale to FPS*<b>target_delta</b>.<br><br>
+                Measuring FPS is the tricky part, because while we are waiting for the next action the system clock does not stop and the engine
+                thinks that the frame took the time spent in the python script. <br><br>
+                Code 1 shows how to mitigate this problem. First we measure the time spent waiting on the <b>sem_action</b> semaphore.
+                We also measure FPS manually in the <b>_process</b> function and remove the semaphore time from our FPS estimate. Then
+                we set the engine iterations per second to our estimate and adjust the time scale correspondingly. After this we have to zero
+                <b>sem_delta</b>, so that it does not affect the frames where there is no communication between Godot and python.<br><br>
+                We designed two unit tests to check that the resulting time scaling does not affect the result of the physics simulation. 
+                One can find them here: <a href="https://github.com/lupoglaz/GodotGymAI/tree/master/UnitTests">Tests</a>. <br><br>
+                Finally, you need to turn Vsync off and set the physics timestep to fixed (Figure 1).
+
+            </div>
+            <div class="col">
+                <h5>Code 1: Example of time scaling.</h5>
+                <pre class="pre"><code class="python">
+var deltat = 0.05
+var prev_time = 0.0
+var sem_delta = 0.0
+var target_delta = 0.025
+
+func _physics_process(delta):
+    if timeout:
+        if mem.exists():
+            var time_start = OS.get_ticks_usec()
+            sem_action.wait()
+            var time_end = OS.get_ticks_usec()
+            sem_delta = time_end - time_start
+            ...
+
+func _process(delta):
+    if mem.exists():
+        var cur_time = OS.get_ticks_usec()
+        var fps_est = 1000000.0/(cur_time - prev_time - sem_delta)
+        Engine.set_iterations_per_second(fps_est)
+        Engine.set_time_scale(Engine.get_iterations_per_second()*target_delta)
+        sem_delta = 0.0
+        prev_time = cur_time
+                </code></pre>
+            </div>
+        </div>
+
+        <div class="row">
+            <div class="col">
+                <h5>Figure 2: Server export templates.</h5>
+                <img src="Fig/ServerTemplates.png" class="rounded mx-auto d-block float-center" alt="Training process" width=80%>
+            </div>
+            <div class="col">
+                <h2>Server export</h2>
+                Once the environment has been debugged, there is usually no need for rendering during training. Using the server export presets can
+                speed up the execution of the environment. The installation script also compiles these presets, so you only need
+                to export the project using them.<br><br><br><br>
+            </div>
+            <div class="col">
+            </div>
+        </div>
+
+        <div class="row">
+            <div class="col">
+            </div>
+            <div class="col">
+                <h2>Parallelization in Python</h2>
+                For some algorithms, like PPO, we can use one policy to collect multiple samples from the environment. In this case
+                it is useful to collect them in parallel. There are two options: implementing parallelism in the engine by spawning
+                several copies of the environment scene, or launching several Godot processes from python. However, we have not yet
+                tested either option: you might encounter naming problems with the semaphore or shared memory handles, in particular when
+                multiple processes use the same shared memory and semaphores.<br><br><br><br>
+            </div>
+            <div class="col">
+            </div>
+        </div>
+
+    </div><!-- /.container -->
+</main>
+
+<!-- Bootstrap core JavaScript
+================================================== -->
+<!-- Placed at the end of the document so the pages load faster -->
+<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
+<script>window.jQuery || document.write('<script src="https://getbootstrap.com/docs/4.1/assets/js/vendor/jquery-slim.min.js"><\/script>')</script>
+<script src="https://getbootstrap.com/docs/4.1/assets/js/vendor/popper.min.js"></script>
+<script src="https://getbootstrap.com/docs/4.1/dist/js/bootstrap.min.js"></script>
+</body>
+</html>
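
The time-scaling arithmetic described in the "Internal clock" section of tutorial_speedup.html can be checked outside the engine. The sketch below mirrors the formulas of Code 1 in plain Python, assuming all timestamps are in microseconds as returned by <b>OS.get_ticks_usec</b>. The function names are hypothetical, and the guard against a non-positive frame time is an addition that Code 1 does not contain. With the values from Code 1 (deltat = 0.05, target_delta = 0.025) there are 2 physics iterations per action. A 10 ms frame of which 6 ms was spent waiting on the semaphore gives an estimate of 250 FPS and a time scale of about 6.25.

```python
def physics_steps_per_action(deltat, target_delta):
    """Physics iterations between receiving an action and returning an observation."""
    return round(deltat / target_delta)


def estimate_fps(cur_time_usec, prev_time_usec, sem_delta_usec):
    """FPS estimate with the time spent blocked on the semaphore removed.

    Same formula as fps_est in Code 1; the guard below is an addition for this sketch.
    """
    frame_usec = cur_time_usec - prev_time_usec - sem_delta_usec
    if frame_usec <= 0:
        raise ValueError("frame time must be positive")
    return 1_000_000.0 / frame_usec


def time_scale(fps, target_delta):
    """Time scale at which each rendered frame advances physics by target_delta."""
    return fps * target_delta


if __name__ == "__main__":
    fps = estimate_fps(cur_time_usec=10_000, prev_time_usec=0, sem_delta_usec=6_000)
    print(physics_steps_per_action(0.05, 0.025), fps, time_scale(fps, 0.025))
```

If the semaphore time were not subtracted, the same frame would be estimated at 100 FPS, and the time scale would be correspondingly lower than the engine can actually sustain.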