[FLINK-17236][docs] Add Tutorials section overview
This closes apache#11826.
alpinegizmo authored and NicoK committed Apr 22, 2020
1 parent 889d3b8 commit 2527cac
Showing 9 changed files with 420 additions and 126 deletions.
37 changes: 23 additions & 14 deletions docs/concepts/index.md
@@ -1,8 +1,8 @@
 ---
-title: Concepts
+title: Concepts in Depth
 nav-id: concepts
-nav-pos: 2
-nav-title: '<i class="fa fa-map-o title appetizer" aria-hidden="true"></i> Concepts'
+nav-pos: 3
+nav-title: '<i class="fa fa-map-o title appetizer" aria-hidden="true"></i> Concepts in Depth'
 nav-parent_id: root
 nav-show_overview: true
 permalink: /concepts/index.html
@@ -27,20 +27,33 @@ specific language governing permissions and limitations
 under the License.
 -->

+The [Hands-on Tutorials]({{ site.baseurl }}{% link tutorials/index.md %}) explain the basic concepts
+of stateful and timely stream processing that underlie Flink's APIs, and provide examples of how
+these mechanisms are used in applications. Stateful stream processing is introduced in the context
+of [Data Pipelines & ETL]({{ site.baseurl }}{% link tutorials/etl.md %}#stateful-transformations)
+and is further developed in the section on [Fault Tolerance]({{ site.baseurl }}{% link
+tutorials/fault_tolerance.md %}). Timely stream processing is introduced in the section on
+[Streaming Analytics]({{ site.baseurl }}{% link tutorials/streaming_analytics.md %}).
+
+This _Concepts in Depth_ section provides a deeper understanding of how Flink's architecture and runtime
+implement these concepts.
+
+## Flink's APIs
+
 Flink offers different levels of abstraction for developing streaming/batch applications.

 <img src="{{ site.baseurl }}/fig/levels_of_abstraction.svg" alt="Programming levels of abstraction" class="offset" width="80%" />

-- The lowest level abstraction simply offers **stateful streaming**. It is
+- The lowest level abstraction simply offers **stateful and timely stream processing**. It is
 embedded into the [DataStream API]({{ site.baseurl}}{% link
 dev/datastream_api.md %}) via the [Process Function]({{ site.baseurl }}{%
-link dev/stream/operators/process_function.md %}). It allows users freely
-process events from one or more streams, and use consistent fault tolerant
+link dev/stream/operators/process_function.md %}). It allows users to freely
+process events from one or more streams, and provides consistent, fault tolerant
 *state*. In addition, users can register event time and processing time
 callbacks, allowing programs to realize sophisticated computations.

-- In practice, most applications would not need the above described low level
-abstraction, but would instead program against the **Core APIs** like the
+- In practice, many applications do not need the low-level
+abstractions described above, and can instead program against the **Core APIs**: the
 [DataStream API]({{ site.baseurl }}{% link dev/datastream_api.md %})
 (bounded/unbounded streams) and the [DataSet API]({{ site.baseurl }}{% link
 dev/batch/index.md %}) (bounded data sets). These fluent APIs offer the
@@ -50,8 +63,8 @@ Flink offers different levels of abstraction for developing streaming/batch appl
 respective programming languages.

 The low level *Process Function* integrates with the *DataStream API*,
-making it possible to go the lower level abstraction for certain operations
-only. The *DataSet API* offers additional primitives on bounded data sets,
+making it possible to use the lower-level abstraction on an as-needed basis.
+The *DataSet API* offers additional primitives on bounded data sets,
 like loops/iterations.

 - The **Table API** is a declarative DSL centered around *tables*, which may
@@ -77,7 +90,3 @@ Flink offers different levels of abstraction for developing streaming/batch appl
 }}{% link dev/table/index.md %}#sql) abstraction closely interacts with the
 Table API, and SQL queries can be executed over tables defined in the
 *Table API*.
-
-This _concepts_ section explains the basic concepts behind the different APIs,
-that is the concepts behind Flink as a stateful and timely stream processing
-system.
41 changes: 25 additions & 16 deletions docs/concepts/index.zh.md
@@ -1,8 +1,8 @@
 ---
-title: 概念
+title: 概念透析
 nav-id: concepts
-nav-pos: 2
-nav-title: '<i class="fa fa-map-o title appetizer" aria-hidden="true"></i> 概念'
+nav-pos: 3
+nav-title: '<i class="fa fa-map-o title appetizer" aria-hidden="true"></i> 概念透析'
 nav-parent_id: root
 nav-show_overview: true
 permalink: /concepts/index.html
@@ -27,20 +27,33 @@ specific language governing permissions and limitations
 under the License.
 -->

+The [Hands-on Tutorials]({{ site.baseurl }}{% link tutorials/index.zh.md %}) explain the basic concepts
+of stateful and timely stream processing that underlie Flink's APIs, and provide examples of how
+these mechanisms are used in applications. Stateful stream processing is introduced in the context
+of [Data Pipelines & ETL]({{ site.baseurl }}{% link tutorials/etl.zh.md %}#stateful-transformations)
+and is further developed in the section on [Fault Tolerance]({{ site.baseurl }}{% link
+tutorials/fault_tolerance.zh.md %}). Timely stream processing is introduced in the section on
+[Streaming Analytics]({{ site.baseurl }}{% link tutorials/streaming_analytics.zh.md %}).
+
+This _Concepts in Depth_ section provides a deeper understanding of how Flink's architecture and runtime
+implement these concepts.
+
+## Flink's APIs
+
 Flink offers different levels of abstraction for developing streaming/batch applications.

 <img src="{{ site.baseurl }}/fig/levels_of_abstraction.svg" alt="Programming levels of abstraction" class="offset" width="80%" />

-- The lowest level abstraction simply offers **stateful streaming**. It is
+- The lowest level abstraction simply offers **stateful and timely stream processing**. It is
 embedded into the [DataStream API]({{ site.baseurl}}{% link
 dev/datastream_api.zh.md %}) via the [Process Function]({{ site.baseurl }}{%
-link dev/stream/operators/process_function.zh.md %}). It allows users freely
-process events from one or more streams, and use consistent fault tolerant
+link dev/stream/operators/process_function.zh.md %}). It allows users to freely
+process events from one or more streams, and provides consistent, fault tolerant
 *state*. In addition, users can register event time and processing time
 callbacks, allowing programs to realize sophisticated computations.

-- In practice, most applications would not need the above described low level
-abstraction, but would instead program against the **Core APIs** like the
+- In practice, many applications do not need the low-level
+abstractions described above, and can instead program against the **Core APIs**: the
 [DataStream API]({{ site.baseurl }}{% link dev/datastream_api.zh.md %})
 (bounded/unbounded streams) and the [DataSet API]({{ site.baseurl }}{% link
 dev/batch/index.zh.md %}) (bounded data sets). These fluent APIs offer the
@@ -50,8 +63,8 @@ Flink offers different levels of abstraction for developing streaming/batch appl
 respective programming languages.

 The low level *Process Function* integrates with the *DataStream API*,
-making it possible to go the lower level abstraction for certain operations
-only. The *DataSet API* offers additional primitives on bounded data sets,
+making it possible to use the lower-level abstraction on an as-needed basis.
+The *DataSet API* offers additional primitives on bounded data sets,
 like loops/iterations.

 - The **Table API** is a declarative DSL centered around *tables*, which may
@@ -63,12 +76,12 @@ Flink offers different levels of abstraction for developing streaming/batch appl
 programs declaratively define *what logical operation should be done*
 rather than specifying exactly *how the code for the operation looks*.
 Though the Table API is extensible by various types of user-defined
-functions, it is less expressive than the *Core APIs*, but more concise to
+functions, it is less expressive than the *Core APIs*, and more concise to
 use (less code to write). In addition, Table API programs also go through
 an optimizer that applies optimization rules before execution.

 One can seamlessly convert between tables and *DataStream*/*DataSet*,
-allowing programs to mix *Table API* and with the *DataStream* and
+allowing programs to mix the *Table API* with the *DataStream* and
 *DataSet* APIs.

 - The highest level abstraction offered by Flink is **SQL**. This abstraction
@@ -77,7 +90,3 @@ Flink offers different levels of abstraction for developing streaming/batch appl
 }}{% link dev/table/index.zh.md %}#sql) abstraction closely interacts with the
 Table API, and SQL queries can be executed over tables defined in the
 *Table API*.
-
-This _concepts_ section explains the basic concepts behind the different APIs,
-that is the concepts behind Flink as a stateful and timely stream processing
-system.
96 changes: 0 additions & 96 deletions docs/concepts/stream-processing.md

This file was deleted.

Binary file added docs/fig/bounded-unbounded.png
Binary file added docs/fig/flink-application-sources-sinks.png
Binary file added docs/fig/local-state.png
Binary file added docs/fig/parallel-job.png