[SPARK-25583][DOC] Add history-server related configuration in the documentation.

## What changes were proposed in this pull request?

Add history-server related configuration in the documentation. Some of the history server related configurations were missing from the documentation, such as 'spark.history.store.maxDiskUsage' and 'spark.ui.liveUpdate.period'.

## How was this patch tested?

![screenshot from 2018-10-01 20-58-26](https://user-images.githubusercontent.com/23054875/46298568-04833a80-c5bd-11e8-95b8-54c9d6582fd2.png)
![screenshot from 2018-10-01 20-59-31](https://user-images.githubusercontent.com/23054875/46298591-11a02980-c5bd-11e8-93d0-892afdfd4f9a.png)
![screenshot from 2018-10-01 20-59-45](https://user-images.githubusercontent.com/23054875/46298601-1533b080-c5bd-11e8-9689-e9b39882a7b5.png)

Closes apache#22601 from shahidki31/historyConf.

Authored-by: Shahid <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
mantov5 · Oct 2, 2018 · 7187663
1 parent 5114db5
commit 7187663

Showing 2 changed files with 41 additions and 0 deletions.
diff --git a/docs/configuration.md b/docs/configuration.md
@@ -793,6 +793,13 @@ Apart from these, the following properties are also available, and may be useful
     Buffer size to use when writing to output streams, in KiB unless otherwise specified.
   </td>
 </tr>
+<tr>
+  <td><code>spark.ui.dagGraph.retainedRootRDDs</code></td>
+  <td>Int.MaxValue</td>
+  <td>
+    How many DAG graph nodes the Spark UI and status APIs remember before garbage collecting.
+  </td>
+</tr>
 <tr>
   <td><code>spark.ui.enabled</code></td>
   <td>true</td>
@@ -807,6 +814,15 @@ Apart from these, the following properties are also available, and may be useful
     Allows jobs and stages to be killed from the web UI.
   </td>
 </tr>
+<tr>
+  <td><code>spark.ui.liveUpdate.period</code></td>
+  <td>100ms</td>
+  <td>
+    How often to update live entities. -1 means "never update" when replaying applications,
+    meaning only the last write will happen. For live applications, this avoids a few
+    operations that we can live without when rapidly processing incoming task events.
+  </td>
+</tr>
 <tr>
   <td><code>spark.ui.port</code></td>
   <td>4040</td>

diff --git a/docs/monitoring.md b/docs/monitoring.md
@@ -185,13 +185,38 @@ Security options for the Spark History Server are covered more detail in the
       Job history files older than this will be deleted when the filesystem history cleaner runs.
     </td>
   </tr>
+  <tr>
+    <td>spark.history.fs.endEventReparseChunkSize</td>
+    <td>1m</td>
+    <td>
+      How many bytes to parse at the end of log files looking for the end event. 
+      This is used to speed up generation of application listings by skipping unnecessary
+      parts of event log files. It can be disabled by setting this config to 0.
+    </td>
+  </tr>
+  <tr>
+    <td>spark.history.fs.inProgressOptimization.enabled</td>
+    <td>true</td>
+    <td>
+      Enable optimized handling of in-progress logs. This option may leave finished
+      applications that fail to rename their event logs listed as in-progress.
+    </td>
+  </tr>
   <tr>
     <td>spark.history.fs.numReplayThreads</td>
     <td>25% of available cores</td>
     <td>
       Number of threads that will be used by history server to process event logs.
     </td>
   </tr>
+  <tr>
+    <td>spark.history.store.maxDiskUsage</td>
+    <td>10g</td>
+    <td>
+      Maximum disk usage for the local directory where the cache application history information
+      are stored.
+    </td>
+  </tr>
   <tr>
     <td>spark.history.store.path</td>
     <td>(none)</td>
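
For orientation, the history-server properties documented in this diff could be set together in `spark-defaults.conf`. The fragment below is only a sketch: the first three values repeat the defaults shown in the diff, while the `spark.history.store.path` value is a hypothetical directory chosen for illustration (its documented default is `(none)`).

```properties
# Hypothetical spark-defaults.conf fragment for a history server.
# The first three values mirror the defaults listed in docs/monitoring.md above.
spark.history.fs.endEventReparseChunkSize          1m
spark.history.fs.inProgressOptimization.enabled    true
spark.history.store.maxDiskUsage                   10g
# Assumed path, for illustration only; not taken from the commit.
spark.history.store.path                           /var/lib/spark/history-store
```

Per the descriptions in the diff, setting `spark.history.fs.endEventReparseChunkSize` to `0` disables the end-event optimization, and `spark.history.store.maxDiskUsage` bounds the local directory used to cache application history.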