<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Chapter 12. JanusGraph Cache</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="index.html" title="JanusGraph Documentation"><link rel="up" href="basics.html" title="Part II. JanusGraph Basics"><link rel="prev" href="tx.html" title="Chapter 11. Transactions"><link rel="next" href="log.html" title="Chapter 13. Transaction Log"><script xmlns:d="http://docbook.org/ns/docbook" type="text/javascript" src="js/jquery/jquery-1.11.0.js"></script><script xmlns:d="http://docbook.org/ns/docbook" type="text/javascript" src="js/jquery/jquery-migrate-1.2.1.min.js"></script><link xmlns:d="http://docbook.org/ns/docbook" rel="stylesheet" id="inline-blob-janusgraph-docs-specific" href="css/docs.css" type="text/css" media="all"><link xmlns:d="http://docbook.org/ns/docbook" rel="apple-touch-icon" type="image/png" href="images/janusgraph-logomark.png"><script xmlns:d="http://docbook.org/ns/docbook" type="text/javascript">
WebFontConfig = {
google: {
families: [
"Lato:400,400italic,700,700italic:latin,greek-ext,cyrillic,latin-ext,greek,cyrillic-ext,vietnamese",
"Open+Sans:400,400italic,700,700italic:latin,greek-ext,cyrillic,latin-ext,greek,cyrillic-ext,vietnamese",
"Antic+Slab:400,400italic,700,700italic:latin,greek-ext,cyrillic,latin-ext,greek,cyrillic-ext,vietnamese"
]
}
};
(function() {
var wf = document.createElement('script');
wf.src = ('https:' == document.location.protocol ? 'https' : 'http') +
'://ajax.googleapis.com/ajax/libs/webfont/1/webfont.js';
wf.type = 'text/javascript';
wf.async = 'true';
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(wf, s);
})();
</script></head><body xmlns:d="http://docbook.org/ns/docbook" bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div id="wrapper"><div class="header-wrapper"><header id="header"><ul class="header-list"><li class="header-item"><a href="http://janusgraph.org"><img src="images/janusgraph-logo.png" alt="JanusGraph" class="normal_logo"></a></li><li class="header-item-right"><a href="https://github.com/JanusGraph/janusgraph/releases">Download JanusGraph</a></li><li class="header-item-right dropdown"><a href="https://docs.janusgraph.org/latest/doc-versions.html">Other Doc Versions</a><div class="dropdown-content"><a href="https://docs.janusgraph.org/latest/index.html">Latest</a><a href="https://docs.janusgraph.org/0.3.0/index.html">Version 0.3.0</a><a href="https://docs.janusgraph.org/0.2.2/index.html">Version 0.2.2</a><a href="https://docs.janusgraph.org/0.2.1/index.html">Version 0.2.1</a><a href="https://docs.janusgraph.org/0.2.0/index.html">Version 0.2.0</a><a href="https://docs.janusgraph.org/0.1.1/index.html">Version 0.1.1</a><a href="https://docs.janusgraph.org/0.1.0/index.html">Version 0.1.0</a></div></li><li class="header-item-right"><a href="index.html">Documentation (0.2.2)</a></li></ul></header></div><div id="main" class="clearfix width-100"><div class="breadcrumbs"><span class="breadcrumb-link"><a href="index.html">JanusGraph Documentation</a></span> > <span class="breadcrumb-link"><a href="basics.html">JanusGraph Basics</a></span> > <span class="breadcrumb-node">JanusGraph Cache</span></div><div class="chapter"><div class="titlepage"><div><div><h2 class="title"><a name="caching"></a>Chapter 12. JanusGraph Cache</h2></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="section"><a href="caching.html#_caching">12.1. Caching</a></span></dt><dt><span class="section"><a href="caching.html#tx-cache">12.2. 
Transaction-Level Caching</a></span></dt><dd><dl><dt><span class="section"><a href="caching.html#_vertex_cache">12.2.1. Vertex Cache</a></span></dt><dt><span class="section"><a href="caching.html#_index_cache">12.2.2. Index Cache</a></span></dt></dl></dd><dt><span class="section"><a href="caching.html#db-cache">12.3. Database Level Caching</a></span></dt><dd><dl><dt><span class="section"><a href="caching.html#_cache_expiration_time">12.3.1. Cache Expiration Time</a></span></dt><dt><span class="section"><a href="caching.html#_cache_size">12.3.2. Cache Size</a></span></dt><dt><span class="section"><a href="caching.html#_clean_up_wait_time">12.3.3. Clean Up Wait Time</a></span></dt></dl></dd><dt><span class="section"><a href="caching.html#_storage_backend_caching">12.4. Storage Backend Caching</a></span></dt></dl></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="_caching"></a>12.1. Caching</h2></div></div></div><p>JanusGraph employs multiple layers of data caching to facilitate fast graph traversals. The caching layers are listed here in the order they are accessed from within a JanusGraph transaction. The closer the cache is to the transaction, the faster the cache access and the higher the memory footprint and maintenance overhead.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="tx-cache"></a>12.2. Transaction-Level Caching</h2></div></div></div><p>Within an open transaction, JanusGraph maintains two caches:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem">Vertex Cache: Caches accessed vertices and their adjacency list (or subsets thereof) so that subsequent access is significantly faster within the same transaction. 
Hence, this cache speeds up iterative traversals.</li><li class="listitem">Index Cache: Caches the results of index queries so that subsequent index calls can be served from memory instead of calling the index backend and (usually) waiting for one or more network round trips.</li></ul></div><p>The size of both caches is determined by the <span class="emphasis"><em>transaction cache size</em></span>. The
transaction cache size can be configured via <code class="literal">cache.tx-cache-size</code> or, on a
per-transaction basis, by opening a transaction via the transaction builder
<code class="literal">graph.buildTransaction()</code> and using the <code class="literal">setVertexCacheSize(int)</code> method.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="_vertex_cache"></a>12.2.1. Vertex Cache</h3></div></div></div><p>The vertex cache contains vertices and the subset of their adjacency list that has been retrieved in a particular transaction. The maximum number of vertices maintained in this cache is equal to the transaction cache size. If the transaction workload is an iterative traversal, the vertex cache will significantly speed it up. If the same vertex is not accessed again in the transaction, the transaction-level cache makes no difference.</p><p>Note that the on-heap size of the vertex cache is determined not only by the number of vertices it may hold but also by the size of their adjacency lists. In other words, vertices with large adjacency lists (i.e., many incident edges) consume more space in this cache than those with smaller lists.</p><p>Note also that modified vertices are <span class="emphasis"><em>pinned</em></span> in the cache, which means they cannot be evicted, since that would entail losing their changes. Therefore, transactions that contain many modifications may end up with a larger-than-configured vertex cache.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="_index_cache"></a>12.2.2. Index Cache</h3></div></div></div><p>The index cache contains the results of index queries executed in the context of this transaction. Subsequent identical index calls will be served from this cache and are therefore significantly cheaper. 
If the same index call never occurs twice in the same transaction, the index cache makes no difference.</p><p>Each entry in the index cache is given a weight equal to <code class="literal">2 + result set size</code>, and the total weight of the cache will not exceed half of the transaction cache size. For example, with a transaction cache size of 20,000 (an illustrative value), the index cache can hold a total weight of at most 10,000, and a query that returns 98 results occupies a weight of 100.</p></div></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="db-cache"></a>12.3. Database Level Caching</h2></div></div></div><p>The database-level cache retains adjacency lists (or subsets thereof) across multiple transactions, beyond the lifetime of any single transaction. It is shared by all transactions across a database. It is more space efficient than the transaction-level caches but also slightly slower to access. In contrast to the transaction-level caches, entries in the database-level cache do not expire immediately when a transaction is closed. Hence, the database-level cache significantly speeds up graph traversals for read-heavy workloads across transactions.</p><p><a class="xref" href="config-ref.html" title="Chapter 14. Configuration Reference">Chapter 14, <i>Configuration Reference</i></a> lists all of the configuration options that pertain to JanusGraph’s database-level cache. This section explains their usage.</p><p>Most importantly, the database-level cache is disabled by default in the current release version of JanusGraph. To enable it, set <code class="literal">cache.db-cache=true</code>.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="_cache_expiration_time"></a>12.3.1. Cache Expiration Time</h3></div></div></div><p>The most important setting for performance and query behavior is the cache expiration time, which is configured via <code class="literal">cache.db-cache-time</code>. The cache will hold graph elements for at most that many milliseconds. 
Once an element expires, its data is re-read from the storage backend on the next access.</p><p>If there is only one JanusGraph instance accessing the storage backend, or if this instance is the only one modifying the graph, the cache expiration time can be set to 0, which disables cache expiration. This allows the cache to hold elements indefinitely (unless they are evicted due to space constraints or on update), which provides the best cache performance. Since no other JanusGraph instance is modifying the graph, there is no danger of holding on to stale data.</p><p>If there are multiple JanusGraph instances accessing the storage backend, the expiration time should be set to the maximum delay that is acceptable between <span class="strong"><strong>another</strong></span> JanusGraph instance modifying the graph and this JanusGraph instance seeing the change.
If every change must be immediately visible to all JanusGraph instances, the database-level cache should be disabled in a distributed setup. However, for most applications it is acceptable that a particular JanusGraph instance sees remote modifications with some delay. The larger the maximum allowed delay, the better the cache performance.
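</p><p>As a minimal sketch of the trade-off described above, the following properties fragment enables the database-level cache and bounds the staleness of remote modifications at one minute. The value <code class="literal">60000</code> is an illustrative assumption, not a recommendation; derive it from the delay your application can tolerate.</p><pre class="programlisting"># Enable the database-level cache (it is disabled by default).
cache.db-cache = true
# Hold cached elements for at most 60 seconds (hypothetical value, in milliseconds).
# Use 0 only if no other JanusGraph instance modifies the graph.
cache.db-cache-time = 60000</pre><p>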
Note that a given JanusGraph instance will always immediately see its own modifications to the graph, irrespective of the configured cache expiration time.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="_cache_size"></a>12.3.2. Cache Size</h3></div></div></div><p>The configuration option <code class="literal">cache.db-cache-size</code> controls how much heap space JanusGraph’s database-level cache is allowed to consume. The larger the cache, the more effective it will be. However, large cache sizes can lead to excessive garbage collection (GC) and poor performance.</p><p>The cache size can be configured either as a fraction (a decimal between 0 and 1) of the total heap space available to the JVM running JanusGraph or as an absolute number of bytes.</p><p>Note that the cache size refers to the amount of heap space that is exclusively occupied by the cache. JanusGraph’s other data structures and each open transaction will occupy additional heap space. If additional software layers are running in the same JVM (e.g., Gremlin Server or embedded Cassandra), those may occupy a significant amount of heap space as well. Be conservative in your heap memory estimation. Configuring a cache that is too large can lead to out-of-memory exceptions and excessive GC.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="_clean_up_wait_time"></a>12.3.3. Clean Up Wait Time</h3></div></div></div><p>When a vertex is locally modified (e.g., an edge is added), all of the vertex’s related database-level cache entries are marked as expired and eventually evicted. This causes JanusGraph to refresh the vertex’s data from the storage backend on the next access and repopulate the cache.</p><p>However, when the storage backend is eventually consistent, the modifications that triggered the eviction may not yet be visible. 
When <code class="literal">cache.db-cache-clean-wait</code> is configured, the cache waits at least that many milliseconds before repopulating an entry with the data retrieved from the storage backend.</p><p>If JanusGraph runs locally or against a storage backend that guarantees immediate visibility of modifications, this value can be set to 0.</p></div></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="_storage_backend_caching"></a>12.4. Storage Backend Caching</h2></div></div></div><p>Each storage backend maintains its own data caching layer. These caches benefit from compression, data compactness, and coordinated expiration. They are often maintained off heap, which means that large caches can be used without running into garbage collection issues. While these caches can be significantly larger than the database-level cache, they are also slower to access.</p><p>The exact type of caching and its properties depend on the particular <a class="link" href="storage-backends.html" title="Part III. Storage Backends">storage backend</a>. Please refer to the respective documentation for more information about the caching infrastructure and how to optimize it.</p></div></div></div><div class="clearer"></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="tx.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="basics.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="log.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 11. Transactions </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Chapter 13. Transaction Log</td></tr></table></div><div class="footer-wrapper"><footer id="footer"><div class="copyright">
Copyright © 2017 JanusGraph Authors. All rights reserved.<br>
The Linux Foundation has registered trademarks and uses trademarks. For a list of<br>
trademarks of The Linux Foundation, please see our <a href="https://www.linuxfoundation.org/trademark-usage">Trademark Usage</a> page.<br>
Cassandra, Groovy, HBase, Hadoop, Lucene, Solr, and TinkerPop are trademarks of the Apache Software Foundation.<br>
Berkeley DB and Berkeley DB Java Edition are trademarks of Oracle.<br>
Documentation generated with <a href="http://www.methods.co.nz/asciidoc/">AsciiDoc</a>, <a href="http://asciidoctor.org/">AsciiDoctor</a>, <a href="http://docbook.sourceforge.net/">DocBook</a>, and <a href="http://saxon.sourceforge.net/">Saxon</a>.
</div></footer></div></div></body></html>