Apache Spark is a fast, in-memory data processing engine that allows data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets.
Speed matters when processing large data sets: it is the difference between exploring data interactively and waiting minutes or hours.
- Run computations in memory
- Apache Spark has an advanced DAG execution engine that supports acyclic data flow and in-memory computing.
- Enables applications in Hadoop clusters to run up to 100 times faster in memory, and up to 10 times faster on disk, than MapReduce.
- A general programming model that enables developers to write an application by composing arbitrary operators.
- Spark makes it easy to combine different processing models seamlessly in the same application (see the sketch after this list).
- Example:
  - Data classification through Spark's machine learning library (MLlib).
  - Ingesting streaming data from a source via Spark Streaming.
- Querying the resulting data in real time through Spark SQL.
- Spark is built on the Scala programming language, which compiles to Java bytecode.
- Our Spark examples use Java 8 features.
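As a rough illustration of combining processing models in one application, the sketch below mixes Spark Streaming and Spark SQL using Java 8 lambdas: each micro-batch read from a socket is registered as a temporary view and queried with SQL. It assumes Spark 2.x APIs; the class name, the socket source (`localhost:9999`), and the query are placeholders, and the MLlib classification step from the example above is omitted for brevity.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingSqlSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("StreamingSqlSketch").setMaster("local[2]");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        // Ingest lines of text from a TCP socket via Spark Streaming (placeholder source)
        JavaDStream<String> lines = jssc.socketTextStream("localhost", 9999);

        // For every micro-batch, register the data as a temporary view and query it with Spark SQL
        lines.foreachRDD(rdd -> {
            SparkSession spark = SparkSession.builder()
                    .config(rdd.context().getConf())
                    .getOrCreate();
            Dataset<Row> df = spark.createDataset(rdd.rdd(), Encoders.STRING()).toDF("line");
            df.createOrReplaceTempView("lines");
            spark.sql("SELECT line, length(line) AS len FROM lines").show();
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```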
Using IntelliJ IDEA, go to the folder where the code is cloned, then run
- gradlew idea
RDD: Resilient Distributed Dataset
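A minimal sketch of working with an RDD in Java 8 (the class name and values are illustrative): transformations such as `map()` are lazy and their results can be cached in memory, while actions such as `reduce()` and `count()` trigger the actual computation.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class RddBasics {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("RddBasics").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // An RDD is an immutable, partitioned collection; lost partitions can be recomputed from lineage
        JavaRDD<Integer> numbers = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5));

        // Transformations such as map() are lazy; cache() keeps the result in memory for reuse
        JavaRDD<Integer> squares = numbers.map(n -> n * n).cache();

        // Actions such as reduce() and count() trigger the computation
        int sumOfSquares = squares.reduce((a, b) -> a + b);
        long count = squares.count();   // reuses the cached partitions instead of recomputing

        System.out.println("sum=" + sumOfSquares + ", count=" + count);
        sc.stop();
    }
}
```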