events/src
lib
project
scripts
.gitignore
.scalafmt.conf
Makefile
README.md
build.sbt
scalastyle-config.xml

README.md

Introduction

This project is our collection of Scala modules.

Requirements

IntelliJ IDEA
IntelliJ Scalafmt Plugin
Scala 2.11
Spark 2.4.0+

Spark is expected to be a provided dependency, so you should have a working Spark install somewhere, and $SPARK_HOME should be set in your environment.
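
As an illustration of what a "provided" Spark dependency typically looks like in sbt, here is a minimal build.sbt sketch. It is not copied from this repository's build.sbt: the exact patch versions (Scala 2.11.12, Spark 2.4.0) and the choice of the spark-core and spark-sql modules are assumptions based on the requirements listed above.

```scala
// build.sbt (illustrative sketch, not this project's actual build file)

// Assumed patch release of the Scala 2.11 line named in the requirements.
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // "provided" keeps Spark on the compile classpath but out of the packaged
  // artifact, since the Spark install under $SPARK_HOME supplies it at runtime.
  "org.apache.spark" %% "spark-core" % "2.4.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.4.0" % "provided"
)
```

With this scoping, jobs are normally launched through the local Spark install (for example with $SPARK_HOME/bin/spark-submit) rather than with a plain sbt run, because the Spark classes are not bundled with the application.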

You should use IntelliJ IDEA (the Community Edition is fine). We use the scalafmt IntelliJ IDEA plugin, configured to reformat on file save, and scalastyle for style checking.

Some editor config to put in place: Case Class Definition Style

We follow the Twitter Effective Scala style guide.

Saving this here for future reference: Spark + S3

Installing Scala and sbt on Mac OS X

Use homebrew:

brew install scala@2.11
brew install sbt

References

Spark + S3: http://deploymentzone.com/2015/12/20/s3a-on-spark-on-aws-ec2/
Spark reading from S3: https://tech.kinja.com/how-not-to-pull-from-s3-using-apache-spark-1704509219
Spark + GCS: https://stackoverflow.com/a/56400126/11295366

Notes

Getting AWS S3 to play nice with Spark is complicated because it involves a dependency on both aws-java-sdk and hadoop-aws, and the versions of these two libraries need to be compatible with each other (and with Spark), or else everything explodes:

https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Missing_method_in_com.amazonaws_class

We currently use aws-java-sdk 1.7.4 and hadoop-aws 2.7.1, as these are known to be compatible with each other and to work with Spark 2.4.0+.
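
The version pair above could be pinned in build.sbt roughly as follows. This is a sketch under assumptions rather than the project's actual configuration: the Maven coordinates are the standard ones for these libraries, but whether this repository declares them directly, scopes them as "provided", or places the jars in lib is not stated here.

```scala
// build.sbt fragment (illustrative sketch)

libraryDependencies ++= Seq(
  // Keep these two versions in lockstep: a mismatched pair is the usual
  // cause of the "missing method in com.amazonaws class" failures
  // described in the hadoop-aws troubleshooting page linked above.
  "com.amazonaws"     % "aws-java-sdk" % "1.7.4",
  "org.apache.hadoop" % "hadoop-aws"   % "2.7.1"
)
```

Note the single % between organization and artifact name: both libraries are plain Java artifacts, so no Scala version suffix is appended.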
