# save model to file
$SPARK_HOME/bin/spark-submit spark_training.py
- Shortcut for the streaming demo
bash run_streaming.sh
python flask/run.py
- Start Zookeeper & Kafka
sudo /usr/local/zookeeper/bin/zkServer.sh start
sudo /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties &
- Start Redis
redis-server
- Start Spark Streaming
$SPARK_HOME/bin/spark-submit stream_predict.py
- Start Kafka Producer
python auto_producer.py
- Start Flask app
cd flask
python run.py
# in the browser
localhost:5000
- Stop Zookeeper & Kafka
sudo /usr/local/kafka/bin/kafka-server-stop.sh
sudo /usr/local/zookeeper/bin/zkServer.sh stop
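
Before starting Spark Streaming in the steps above, it can help to confirm that Zookeeper, Kafka, and Redis are actually accepting connections. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and the ports (2181, 9092, 6379) are the usual defaults, assumed here rather than read from this project's configuration.

```python
import socket

# Default ports are assumptions: 2181 (Zookeeper), 9092 (Kafka), 6379 (Redis).
# Adjust them if server.properties or the Redis config says otherwise.
DEFAULT_SERVICES = {"zookeeper": 2181, "kafka": 9092, "redis": 6379}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(services=None, host="localhost"):
    """Map each service name to whether its port accepts connections."""
    services = DEFAULT_SERVICES if services is None else services
    return {name: is_port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print("{:<10} {}".format(name, "up" if up else "DOWN"))
```

An open port only shows that something is listening; it does not prove the service is healthy, so treat this as a quick sanity check.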
- Scala 2.11
- Spark 2.0.1
- Zookeeper 3.4.8
- Kafka 0.9
- Redis 2.6.9
- for Spark Streaming
  - spark-streaming-kafka-0-8_2.11-2.0.1.jar
  - kafka_2.11-0.8.2.2.jar
  - metrics-core-2.2.0.jar
- for Python (pip install)
  - kafka-python==1.0.2
  - redis==2.10.5
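
The jars listed for Spark Streaming have to be visible to Spark when the streaming job is submitted. One common way is the `--jars` option of `spark-submit`, which takes a comma-separated list of paths. The sketch below only builds that command line; the `jars` directory and the helper name are assumptions for illustration, and this repository may wire the jars in differently (for example through `run_streaming.sh`).

```python
import os

# Jar names come from the dependency list; the directory is a hypothetical
# location, so point JAR_DIR at wherever the jars were actually downloaded.
JAR_DIR = "jars"
JARS = [
    "spark-streaming-kafka-0-8_2.11-2.0.1.jar",
    "kafka_2.11-0.8.2.2.jar",
    "metrics-core-2.2.0.jar",
]


def build_submit_command(script, jar_dir=JAR_DIR, jars=JARS, spark_home=None):
    """Return a spark-submit argument list with --jars as a comma-separated list."""
    spark_home = spark_home or os.environ.get("SPARK_HOME", "")
    if spark_home:
        submit = os.path.join(spark_home, "bin", "spark-submit")
    else:
        # Fall back to whatever spark-submit is on PATH.
        submit = "spark-submit"
    jar_paths = ",".join(os.path.join(jar_dir, name) for name in jars)
    return [submit, "--jars", jar_paths, script]


if __name__ == "__main__":
    print(" ".join(build_submit_command("stream_predict.py")))
```

The returned list can be passed directly to `subprocess.call` if you prefer to launch the job from Python rather than copying the printed command into a shell.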
- Spark 2.0.1
  Download: http://spark.apache.org/downloads.html
- Others
  For installing and configuring Zookeeper, Kafka, and Redis, refer to install_tools.sh.
Mohammad, Rami, McCluskey, T.L. and Thabtah, Fadi Abdeljaber. (2015). Phishing Websites Data Set. UCI Machine Learning Repository [https://archive.ics.uci.edu/ml/datasets/Phishing+Websites#].