21guns/docker-spark


Spark docker

Docker images to:

  • Set up a standalone Apache Spark cluster running one Spark master and multiple Spark workers
  • Build Spark applications in Java, Scala or Python to run on a Spark cluster

Currently supported versions:

  • Spark 1.5.1 for Hadoop 2.6 and later
  • Spark 1.6.2 for Hadoop 2.6 and later
  • Spark 2.0.1 for Hadoop 2.7

Using Docker Compose

Add the following services to your docker-compose.yml to integrate a Spark master and Spark worker in your BDE pipeline:

master:
  image: bde2020/spark-master:2.0.1-hadoop2.7
  hostname: spark-master
  environment:
    INIT_DAEMON_STEP: setup_spark
worker:
  image: bde2020/spark-worker:2.0.1-hadoop2.7
  links:
    - "master:spark-master"

Make sure to set INIT_DAEMON_STEP to the step name configured in your pipeline.
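The services above can be brought up from the command line. The following is a minimal sketch, not part of the original images: it assumes a docker-compose release that supports the `--scale` option of `up`, and the helper names `run` and `compose_up` are hypothetical. By default it only prints the command it would execute; set DRY_RUN=0 to actually run it.

```shell
#!/bin/sh
# Sketch: start the master and N workers defined in docker-compose.yml.
# Assumption: the installed docker-compose supports `up --scale`.

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# $1 = number of worker containers to start (defaults to 1).
compose_up() {
  run docker-compose up -d --scale "worker=${1:-1}" master worker
}
```

Calling `compose_up 3` prints the command for one master and three workers; `DRY_RUN=0 compose_up 3` would execute it.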

Running Docker containers without the init daemon

Spark Master

To start a Spark master:

docker run --name spark-master -h spark-master -e ENABLE_INIT_DAEMON=false -d bde2020/spark-master:2.0.1-hadoop2.7

Spark Worker

To start a Spark worker:

docker run --name spark-worker-1 --link spark-master:spark-master -e ENABLE_INIT_DAEMON=false -d bde2020/spark-worker:2.0.1-hadoop2.7
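To run multiple workers against the same master, the worker command can be repeated with a different container name each time. The loop below is a sketch under that assumption: the `spark-worker-N` naming simply continues the `spark-worker-1` name used above, and the helpers `run` and `start_workers` are hypothetical. It prints the commands by default; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: start several Spark workers linked to the spark-master container.

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# $1 = number of workers; containers are named spark-worker-1 .. spark-worker-N.
start_workers() {
  n="$1"
  i=1
  while [ "$i" -le "$n" ]; do
    run docker run --name "spark-worker-$i" \
      --link spark-master:spark-master \
      -e ENABLE_INIT_DAEMON=false \
      -d bde2020/spark-worker:2.0.1-hadoop2.7
    i=$((i + 1))
  done
}
```

For example, `start_workers 3` prints three `docker run` commands, one per worker. The master container from the previous step must already be running under the name spark-master for the `--link` option to resolve.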

Launch a Spark application

Building and running your Spark application on top of the Spark cluster is as simple as extending a template Docker image. Check the template's README for further documentation.
