- This repo collects the data mining algorithms I implemented for my "Data Mining with Spark" class.
- All the algorithms are written with distributed processing in mind.
- They are mainly written in PySpark, Spark's Python API.
- Most algorithms here manipulate Spark RDDs rather than Spark DataFrames, with some exceptions that use GraphFrames and Spark SQL.
nealsonS/SparkDataMining
Spark Data Mining assignments for my Data Mining class at USC