This archive contains an example Maven project for a Scala Spark 2 application.
The pom.xml contains example dependencies for:
- Spark
- SLF4J
- Log4j (acts as the logging implementation for SLF4J)
- grizzled-slf4j, a Scala-specific wrapper for SLF4J.
- Typesafe Config for configuration.
- scalatest for testing.
Note that Scala itself is listed as just another dependency, which means a global installation of Scala is not required.
The pom.xml builds an uber-jar containing all the dependencies by default (including the Scala jars).
The pom also includes two exec goals:
- exec:exec@run-local runs the code using a local Spark instance.
- exec:exec@run-yarn runs the code on a remote YARN cluster. For this to work, the hive-site.xml, core-site.xml and yarn-site.xml configuration files from the remote cluster must be copied into the spark-remote/conf directory.
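
Because the YARN goal depends on three configuration files being present, a small pre-flight check can catch a missing file before the job is submitted. The following is only a sketch: the function name missing_conf_files is hypothetical and not part of this project, and the default directory assumes the command is run from the project root.

```shell
#!/bin/sh
# Hypothetical helper (not part of this project): print the name of each
# required cluster configuration file that is absent from the given
# directory. The directory defaults to spark-remote/conf, relative to
# the current working directory.
missing_conf_files() {
    conf_dir="${1:-spark-remote/conf}"
    for f in hive-site.xml core-site.xml yarn-site.xml; do
        # Report the file name only when no regular file exists at that path.
        [ -f "$conf_dir/$f" ] || echo "$f"
    done
}
```

If the function prints nothing, all three files are in place and the remote goal can be started with mvn exec:exec@run-yarn. The local goal, mvn exec:exec@run-local, does not need these files. Both commands assume the uber-jar has already been built, for example with mvn package; which phase actually produces the jar depends on the pom.xml.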