flightstats

Building the project

Install a JDK. Tested with 1.7.0_55 and 1.8.0_91 on OS X.
Execute ./gradlew clean shadowJar (or ./gradlew.bat clean shadowJar on Windows).
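
The spark-submit command in this README expects the build output at build/libs/flightstats-all.jar. As a minimal sketch, a check that the build produced it could look like the following; the helper name check_jar is hypothetical and not part of the project.

```shell
#!/bin/sh
# check_jar is a hypothetical helper, not part of the project.
# It succeeds only if the given path exists and is a non-empty file.
# The default path is the jar referenced by the spark-submit command in this README.
check_jar() {
    jar_path="${1:-build/libs/flightstats-all.jar}"
    if [ -s "$jar_path" ]; then
        echo "found: $jar_path"
    else
        echo "missing: $jar_path (run ./gradlew clean shadowJar first)" >&2
        return 1
    fi
}
```

Calling check_jar with no argument from the project root checks the default path; a non-zero exit status means the jar is missing or empty.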

To run the tests, execute ./gradlew test (or ./gradlew.bat test on Windows).

Download the CSV files from http://stat-computing.org/dataexpo/2009/the-data.html and uncompress them.
Download a Spark distribution from http://spark.apache.org/downloads.html. Tested with spark-2.0.0-bin-hadoop2.7.tgz only.
Build the project.
Execute: spark-2.0.0-bin-hadoop2.7/bin/spark-submit --master "local[*]" --class com.github.saulius.flightstats.JobRunner build/libs/flightstats-all.jar com.github.saulius.flightstats.jobs.ArrivalDelayPredictionJob data
This assumes that Spark was downloaded and extracted into the project directory and that the data resides in the data directory at the project root.
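
If Spark or the data live somewhere else, the command above can be assembled from a few variables. The following is a rough sketch only: the function name print_submit_cmd and the FS_SPARK_DIR, FS_DATA_DIR and FS_JAR variables are assumptions made for illustration, with defaults taken from the command in this README. It prints the command instead of running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical wrapper; the variable names below are assumptions, not project settings.
# Defaults mirror the layout described in this README.
FS_SPARK_DIR="${FS_SPARK_DIR:-spark-2.0.0-bin-hadoop2.7}"
FS_DATA_DIR="${FS_DATA_DIR:-data}"
FS_JAR="${FS_JAR:-build/libs/flightstats-all.jar}"

# Print (dry run) the spark-submit command line for the given job class.
print_submit_cmd() {
    job_class="$1"
    printf '%s\n' "$FS_SPARK_DIR/bin/spark-submit --master \"local[*]\" --class com.github.saulius.flightstats.JobRunner $FS_JAR $job_class $FS_DATA_DIR"
}
```

For example, print_submit_cmd com.github.saulius.flightstats.jobs.ArrivalDelayPredictionJob prints the same command as above when the defaults are used. Paths containing spaces would need extra quoting, which this sketch does not handle.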
