Maureen is a distributed machine learning and data mining library for Python. It is built on top of the mrjob library, which provides a MapReduce framework in Python.
Maureen means "star of the sea" in Gaelic.
Maureen is currently in pre-release development.
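
To illustrate the MapReduce model that mrjob provides, here is a minimal sketch in plain Python. It is not Maureen's API and it does not use mrjob itself. The names `mapper`, `reducer`, and `run_local` and the sample data are hypothetical, chosen only for this example. It counts how often two items were rated by the same user, the kind of co-occurrence statistic a recommender for binary ratings might start from.

```python
from collections import defaultdict
from itertools import combinations


def mapper(user, items):
    """Emit ((item_a, item_b), 1) for every pair of items one user rated."""
    for pair in combinations(sorted(set(items)), 2):
        yield pair, 1


def reducer(pair, counts):
    """Sum the partial counts collected for one item pair."""
    yield pair, sum(counts)


def run_local(records):
    """Simulate the map, shuffle, and reduce phases in a single process.

    This is an illustrative stand-in for a real MapReduce runner.
    """
    grouped = defaultdict(list)
    for user, items in records:
        for key, value in mapper(user, items):
            grouped[key].append(value)  # shuffle: group values by key

    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    # Hypothetical binary ratings: each user lists the items they liked.
    ratings = [
        ("alice", ["A", "B", "C"]),
        ("bob", ["A", "B"]),
        ("carol", ["B", "C"]),
    ]
    for pair, count in run_local(ratings).items():
        print(pair, count)
```

On a real cluster the mapper and reducer would run on different machines and the framework would handle the grouping step. The single-process loop above only mimics that flow.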
Goals for the 0.1 release:
- Easy installation (done)
- One ML algorithm working on a distributed system
Feature Request List
- Recommendation engines
  - Binary ratings (done)
  - Numerical ratings
  - Performance metrics
- Clustering
  - Canopy clustering
  - LDA
- Graph analysis
  - Graph partitioning
- Classification
  - SVM
  - Naive Bayes
- Adapters to common datasets
  - MovieLens
  - Reuters-21578 corpus
  - Wikipedia snapshot
  - US Census
- Search
  - Lucene-style index (needed for LDA)
To install from source:
> python setup.py install