Commit

Added simple app example.
rabidgremlin committed Sep 15, 2015
1 parent fbdf357 commit 3b41633
Showing 2 changed files with 21 additions and 3 deletions.
13 changes: 10 additions & 3 deletions apache-spark-standalone/README.md
@@ -6,15 +6,22 @@ Execute
vagrant up
```

to start the VM. Then try
to start the VM, ```vagrant ssh``` into it and then try:

```
vagrant ssh
cd spark-1.5.0-bin-hadoop2.6
cd ~/spark-1.5.0-bin-hadoop2.6
./bin/pyspark
testFile = sc.textFile("README.md")
testFile.count()
```

Press ```CTRL+D``` to exit the pyspark shell

or

```
cd ~/spark-1.5.0-bin-hadoop2.6
./bin/spark-submit --master local /vagrant/simpleapp.py
```


11 changes: 11 additions & 0 deletions apache-spark-standalone/simpleapp.py
@@ -0,0 +1,11 @@
"""simpleapp.py"""
from pyspark import SparkContext

logFile = "/home/vagrant/spark-1.5.0-bin-hadoop2.6/README.md"
sc = SparkContext(appName="Simple App")
logData = sc.textFile(logFile).cache()

numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()

print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
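
The counts that `simpleapp.py` reports can be sanity-checked without Spark. The following is a minimal plain-Python sketch of the same filter-and-count logic, not part of the commit. The function names `count_lines_containing` and `report` and the `sample` lines are hypothetical, chosen for illustration only. Note that `'a' in s` is a case-sensitive substring test, so a line containing only an uppercase `A` is not counted.

```python
"""Plain-Python check of the counts simpleapp.py reports (no Spark needed)."""


def count_lines_containing(lines, needle):
    """Count the lines that contain `needle` as a case-sensitive substring."""
    return sum(1 for line in lines if needle in line)


def report(lines):
    """Build the same message simpleapp.py prints, from an iterable of lines."""
    lines = list(lines)  # materialize once, since we iterate twice
    num_as = count_lines_containing(lines, "a")
    num_bs = count_lines_containing(lines, "b")
    return "Lines with a: %i, lines with b: %i" % (num_as, num_bs)


# Hypothetical sample input (NOT the contents of the Spark README).
sample = ["# Apache Spark", "big data", "hello"]
print(report(sample))  # prints "Lines with a: 2, lines with b: 1"
```

Passing the lines of the README file that `logFile` points to, for example via `report(open(path))`, should give figures comparable to the Spark job, assuming both split the file into lines the same way.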
