From b294ebcacfad0f7a452f172f016a0c7ee0dbbebd Mon Sep 17 00:00:00 2001
From: Ewen Cheslack-Postava
Date: Tue, 14 Feb 2017 10:34:48 -0800
Subject: [PATCH] Add 0.10.2 docs from 0.10.2.0 RC2

---
 0102/generated/topic_config.html |  4 ++--
 0102/quickstart.html             |  5 +----
 0102/streams.html                | 29 +++++++++++++++++++++--------
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/0102/generated/topic_config.html b/0102/generated/topic_config.html
index 87eb7bd73..897414716 100644
--- a/0102/generated/topic_config.html
+++ b/0102/generated/topic_config.html
@@ -21,11 +21,11 @@
 flush.ms | This setting allows specifying a time interval at which we will force an fsync of data written to the log. For example if this was set to 1000 we would fsync after 1000 ms had passed. In general we recommend you not set this and use replication for durability and allow the operating system's background flush capabilities as it is more efficient. | long | 9223372036854775807 | [0,...] | log.flush.interval.ms | medium
-follower.replication.throttled.replicas | A list of replicas for which log replication should be throttled on the follower side. The list should describe a set of replicas in the form [PartitionId]:[BrokerId],[PartitionId]:[BrokerId]:... or alternatively the wildcard '*' can be used to throttle all replicas for this topic. | list | "" | kafka.server.ThrottledReplicaListValidator$@6c503cb2 | follower.replication.throttled.replicas | medium
+follower.replication.throttled.replicas | A list of replicas for which log replication should be throttled on the follower side. The list should describe a set of replicas in the form [PartitionId]:[BrokerId],[PartitionId]:[BrokerId]:... or alternatively the wildcard '*' can be used to throttle all replicas for this topic. | list | "" | kafka.server.ThrottledReplicaListValidator$@7be8c2a2 | follower.replication.throttled.replicas | medium
 index.interval.bytes | This setting controls how frequently Kafka adds an index entry to it's offset index. The default setting ensures that we index a message roughly every 4096 bytes. More indexing allows reads to jump closer to the exact position in the log but makes the index larger. You probably don't need to change this. | int | 4096 | [0,...] | log.index.interval.bytes | medium
-leader.replication.throttled.replicas | A list of replicas for which log replication should be throttled on the leader side. The list should describe a set of replicas in the form [PartitionId]:[BrokerId],[PartitionId]:[BrokerId]:... or alternatively the wildcard '*' can be used to throttle all replicas for this topic. | list | "" | kafka.server.ThrottledReplicaListValidator$@6c503cb2 | leader.replication.throttled.replicas | medium
+leader.replication.throttled.replicas | A list of replicas for which log replication should be throttled on the leader side. The list should describe a set of replicas in the form [PartitionId]:[BrokerId],[PartitionId]:[BrokerId]:... or alternatively the wildcard '*' can be used to throttle all replicas for this topic. | list | "" | kafka.server.ThrottledReplicaListValidator$@7be8c2a2 | leader.replication.throttled.replicas | medium
 max.message.bytes | This is largest message size Kafka will allow to be appended. Note that if you increase this size you must also increase your consumer's fetch size so they can fetch messages this large. | int | 1000012 | [0,...] | message.max.bytes | medium

diff --git a/0102/quickstart.html b/0102/quickstart.html
index bfc9af3f7..69f6a7a5d 100644
--- a/0102/quickstart.html
+++ b/0102/quickstart.html
@@ -359,7 +359,7 @@

Step 8: Use Kafka Streams to process data
-> cat file-input.txt | ./bin/kafka-console-producer --broker-list localhost:9092 --topic streams-file-input
+> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-file-input < file-input.txt
 

@@ -397,12 +397,9 @@

Step 8: Use Kafka Streams to process data
 all     1
-streams 1
 lead    1
 to      1
-kafka   1
 hello   1
-kafka   2
 streams 2
 join    1
 kafka   3
diff --git a/0102/streams.html b/0102/streams.html
index 19af2b326..94ce7a951 100644
--- a/0102/streams.html
+++ b/0102/streams.html
@@ -497,25 +497,31 @@ 

Duality of Streams and Tables

 The same mechanism is used, for example, to replicate databases via change data capture (CDC) and, within Kafka Streams, to replicate its so-called state stores across machines for fault-tolerance.
- The stream-table duality is such an important concept that Kafka Streams models it explicitly via the KStream and KTable interfaces, which we describe in the next sections.
+ The stream-table duality is such an important concept that Kafka Streams models it explicitly via the KStream, KTable, and GlobalKTable interfaces, which we describe in the next sections.

- KStream and KTable
- The DSL uses two main abstractions. A KStream is an abstraction of a record stream, where each data record represents a self-contained datum in the unbounded data set.
+ KStream, KTable, and GlobalKTable
+ The DSL uses three main abstractions. A KStream is an abstraction of a record stream, where each data record represents a self-contained datum in the unbounded data set.
  A KTable is an abstraction of a changelog stream, where each data record represents an update. More precisely, the value in a data record is considered to be an update of the last value for the same record key,
- if any (if a corresponding key doesn't exist yet, the update will be considered a create). To illustrate the difference between KStreams and KTables, let's imagine the following two data records are being sent to the stream:
+ if any (if a corresponding key doesn't exist yet, the update will be considered a create).
+ Like a KTable, a GlobalKTable is an abstraction of a changelog stream, where each data record represents an update.
+ However, a GlobalKTable is different from a KTable in that it is fully replicated on each KafkaStreams instance.
+ GlobalKTable also provides the ability to look up current values of data records by keys.
+ This table-lookup functionality is available through join operations.
+
+ To illustrate the difference between KStreams and KTables/GlobalKTables, let's imagine the following two data records are being sent to the stream:
             ("alice", 1) --> ("alice", 3)
         
- If these records a KStream and the stream processing application were to sum the values it would return 4. If these records were a KTable, the return would be 3, since the last record would be considered as an update.
+ If these records were a KStream and the stream processing application were to sum the values, it would return 4. If these records were a KTable or GlobalKTable, the return would be 3, since the last record would be considered as an update.
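As a minimal sketch of this difference in the 0.10.2 Java DSL (not part of the original text; the topic name, serdes, and store names are illustrative, and the usual Kafka Streams imports are assumed):

    KStreamBuilder builder = new KStreamBuilder();

    // Two alternative readings of the same "user-clicks" topic; a topology may register a topic
    // only once, so in practice you would pick one of these per application.

    // Record-stream view: every record counts, so ("alice", 1) followed by ("alice", 3) sums to 4.
    KStream<String, Integer> clicks = builder.stream(Serdes.String(), Serdes.Integer(), "user-clicks");
    KTable<String, Integer> summed = clicks
        .groupByKey(Serdes.String(), Serdes.Integer())
        .reduce((v1, v2) -> v1 + v2, "sum-store");

    // Changelog-stream view: the later record is treated as an update of the earlier one,
    // so the value for "alice" ends up as 3.
    KTable<String, Integer> latest = builder.table(Serdes.String(), Serdes.Integer(), "user-clicks", "latest-store");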

Create Source Streams from Kafka

- Either a record stream (defined as KStream) or a changelog stream (defined as KTable)
- can be created as a source stream from one or more Kafka topics (for KTable you can only create the source stream
+ Either a record stream (defined as KStream) or a changelog stream (defined as KTable or GlobalKTable)
+ can be created as a source stream from one or more Kafka topics (for KTable and GlobalKTable you can only create the source stream
  from a single topic).

@@ -524,6 +530,7 @@

 Create Source Streams
 KStream<String, GenericRecord> source1 = builder.stream("topic1", "topic2");
 KTable<String, GenericRecord> source2 = builder.table("topic3", "stateStoreName");
+GlobalKTable<String, GenericRecord> source3 = builder.globalTable("topic4", "globalStoreName");

Windowing a stream

@@ -551,7 +558,13 @@

 Join multiple streams
   • KStream-to-KStream Joins are always windowed joins, since otherwise the memory and state required to compute the join would grow infinitely in size. Here, a newly received record from one of the streams is joined with the other stream's records within the specified window interval to produce one result for each matching pair based on user-provided ValueJoiner. A new KStream instance representing the result stream of the join is returned from this operator.
  • KTable-to-KTable Joins are join operations designed to be consistent with the ones in relational databases. Here, both changelog streams are materialized into local state stores first. When a new record is received from one of the streams, it is joined with the other stream's materialized state stores to produce one result for each matching pair based on user-provided ValueJoiner. A new KTable instance representing the result stream of the join, which is also a changelog stream of the represented table, is returned from this operator.
-  • KStream-to-KTable Joins allow you to perform table lookups against a changelog stream (KTable) upon receiving a new record from another record stream (KStream). An example use case would be to enrich a stream of user activities (KStream) with the latest user profile information (KTable). Only records received from the record stream will trigger the join and produce results via ValueJoiner, not vice versa (i.e., records received from the changelog stream will be used only to update the materialized state store). A new KStream instance representing the result stream of the join is returned from this operator.
+  • KStream-to-KTable Joins allow you to perform table lookups against a changelog stream (KTable) upon receiving a new record from another record stream (KStream). An example use case would be to enrich a stream of user activities (KStream) with the latest user profile information (KTable). Only records received from the record stream will trigger the join and produce results via ValueJoiner, not vice versa (i.e., records received from the changelog stream will be used only to update the materialized state store). A new KStream instance representing the result stream of the join is returned from this operator.
+  • KStream-to-GlobalKTable Joins allow you to perform table lookups against a fully replicated changelog stream (GlobalKTable) upon receiving a new record from another record stream (KStream).
+    Joins with a GlobalKTable don't require repartitioning of the input KStream, as all partitions of the GlobalKTable are available on every KafkaStreams instance.
+    The KeyValueMapper provided with the join operation is applied to each KStream record to extract the join key used to look up the GlobalKTable, so non-record-key joins are possible.
+    An example use case would be to enrich a stream of user activities (KStream) with the latest user profile information (GlobalKTable).
+    Only records received from the record stream will trigger the join and produce results via ValueJoiner, not vice versa (i.e., records received from the changelog stream will be used only to update the materialized state store).
+    A new KStream instance representing the result stream of the join is returned from this operator; a usage sketch follows this list.
  • Depending on the operands the following join operations are supported: inner joins, outer joins and left joins. Their semantics are similar to the corresponding operators in relational databases.
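To make the KStream-to-GlobalKTable join above concrete, here is a minimal usage sketch against the 0.10.2 Java DSL; the topic names and the Activity, Profile, and EnrichedActivity types are invented for illustration and are not taken from the release:

    KStream<String, Activity> activities = builder.stream("user-activities");
    GlobalKTable<String, Profile> profiles = builder.globalTable("user-profiles", "profiles-store");

    // The KeyValueMapper derives the lookup key from each stream record (here, a user id carried
    // in the activity value), so the activity stream does not need to be keyed or repartitioned by it.
    KStream<String, EnrichedActivity> enriched = activities.join(
        profiles,
        (activityKey, activity) -> activity.getUserId(),                // KeyValueMapper: stream record -> table key
        (activity, profile) -> new EnrichedActivity(activity, profile)  // ValueJoiner
    );

Because every KafkaStreams instance holds a full copy of the GlobalKTable, the lookup is served from the local state store and no repartitioning of the input stream is needed.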