Run the following to build PredictionIO and its binary distribution.
$ ./make-distribution.sh
You should see something like the following when it finishes building successfully.
...
a imagine/bin/run-eval
a imagine/bin/run-server
a imagine/bin/run-train
PredictionIO binary distribution created at imagine.tar.gz
Download Spark's pre-built "For Hadoop 2 (HDP2, CDH5)" package. Unzip the file.
Set the $SPARK_HOME shell variable to the path of the unzipped Spark directory:
$ export SPARK_HOME=<your spark directory>
For example,
$ export SPARK_HOME=/Users/abc/Downloads/spark-1.0.1-bin-hadoop2
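Before moving on, it may help to confirm that the variable really points at an unpacked Spark directory. The sketch below is not part of PredictionIO: `check_spark_home` is a hypothetical helper, and it assumes the Spark package ships an executable `bin/spark-submit`, as Spark 1.x distributions do.

```shell
# Hypothetical helper (not shipped with PredictionIO).
# Checks that the given directory looks like an unpacked Spark distribution.
# Assumption: Spark provides an executable bin/spark-submit.
check_spark_home() {
  if [ -z "$1" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$1/bin/spark-submit" ]; then
    echo "no executable bin/spark-submit under $1" >&2
    return 1
  fi
  echo "SPARK_HOME looks valid: $1"
}
```

Call it as `check_spark_home "$SPARK_HOME"`. A non-zero exit status means the path is unset or does not look like a Spark directory.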
PredictionIO relies on a data store to store its metadata. At the moment, PredictionIO's storage layer supports both Elasticsearch and MongoDB. Make sure you have one of these running and functioning properly on your computer.
- Copy conf/pio-env.sh.template to conf/pio-env.sh.
- If you are using Elasticsearch and its default settings, you may stop here.
- Otherwise, change the following to fit your setup.
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
If you use MongoDB, add and modify the following to fit your setup.
PIO_STORAGE_SOURCES_MONGODB_TYPE=mongodb
PIO_STORAGE_SOURCES_MONGODB_HOSTS=localhost
PIO_STORAGE_SOURCES_MONGODB_PORTS=27017
- The following points the storage repositories to their respective backend data sources. By default, they point to Elasticsearch.
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_REPOSITORIES_APPDATA_SOURCE=ELASTICSEARCH
If you use MongoDB, change them to something like this.
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=MONGODB
PIO_STORAGE_REPOSITORIES_APPDATA_SOURCE=MONGODB
- Save conf/pio-env.sh and you are done!
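The copy step and the repository switch can also be scripted. The following is only a sketch under stated assumptions: `setup_pio_env` is a hypothetical helper, not something PredictionIO provides, and it assumes the two `PIO_STORAGE_REPOSITORIES_*_SOURCE` settings appear uncommented at the start of a line in the template. It does not add the `PIO_STORAGE_SOURCES_MONGODB_*` settings, so those still have to be added by hand if you use MongoDB.

```shell
# Hypothetical helper (not shipped with PredictionIO).
# $1: path to the conf directory
# $2: backend name, ELASTICSEARCH (default) or MONGODB
setup_pio_env() {
  backend="${2:-ELASTICSEARCH}"

  # Create pio-env.sh from the template only if it does not exist yet,
  # so that an already edited file is never overwritten.
  if [ ! -f "$1/pio-env.sh" ]; then
    cp "$1/pio-env.sh.template" "$1/pio-env.sh" || return 1
  fi

  # Point both storage repositories at the chosen backend.
  # A temporary file is used instead of "sed -i", whose syntax differs
  # between GNU and BSD sed.
  sed -e "s/^\(PIO_STORAGE_REPOSITORIES_METADATA_SOURCE\)=.*/\1=$backend/" \
      -e "s/^\(PIO_STORAGE_REPOSITORIES_APPDATA_SOURCE\)=.*/\1=$backend/" \
      "$1/pio-env.sh" > "$1/pio-env.sh.tmp" &&
    mv "$1/pio-env.sh.tmp" "$1/pio-env.sh"
}
```

From the PredictionIO source directory, `setup_pio_env conf MONGODB` would create conf/pio-env.sh if needed and switch both repositories to MongoDB. The template itself is left unchanged.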
Tutorials 1-4 are meant to help you get familiar with each component of the PredictionIO framework.
- Tutorial 1 - Develop and Integrate Algorithm with PredictionIO
- Tutorial 2 - Test Engine Components
- Tutorial 3 - Evaluation
- Tutorial 4 - Multiple Algorithms Engine
More interesting tutorials:
- Stock Prediction Engine with Customizable Algorithms
- Linear Regression Engine
- Distributed Recommendation Engine with RDD-based Model using MLlib's ALS
- Run this command.
$ sbt/sbt unidoc
- Point your web browser at target/scala-2.10/unidoc/index.html for ScalaDoc, or target/javaunidoc/index.html for Javadoc.