Commit

Update docs
Former-commit-id: dcadd83
MaxHalford committed Oct 5, 2019
1 parent 7838d3d commit 375d942
Showing 93 changed files with 1,449 additions and 1,352 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Documentation
docs/_build/
docs/generated/
docs/notebooks/.ipynb_checkpoints/

# Byte-compiled / optimized / DLL files
__pycache__/
42 changes: 19 additions & 23 deletions .travis.yml
@@ -1,37 +1,21 @@
sudo: false

language: python

python:
- 3.6
- 3.7

cache:
apt: true
directories:
- $HOME/.cache/pip
- $HOME/downloads

install: skip
script: ./ci/test.sh ${TRAVIS_PYTHON_VERSION}

jobs:
include:

- stage: test
install: skip
script: ./ci/test.sh 3.6

- stage: test
install: skip
script: ./ci/test.sh 3.7.3

- stage: pypi
install: echo "skipping install"
script: echo "skipping tests"
deploy:
provider: pypi
user: MaxHalford
password:
secure: IKlAIBIeaM6muU5ITEM5a6BwYwDzRI5Bn30ssnS9oRRTLEqsU70Ih+2eWX5UbascgBTdtMJLdE9vEsp4F6TLNgdjmZPyYXZLb6hTDu0lz5PeFiJ4EHVNyC7PqjDpPWiU9kE1lXMAfbDADIWnK/bNvgfn5xlbUfTPnnMd1bsazJtYSOumoYpjhvEPRkC/A/+iVHfBXFvt4UqXVZw/HgcZfBze8heLGh3klAyO6QuIIbfXvf9Oiym8BnfnaPKvJi+npmitL7wMtTP/Oq/zFg0RQdBPZhcqvnwJ5c1XQDWbLJArs56IqPEk/eN3HpvDbo+bZD1Lc3ACA71bCkKDTnkhZ+0uVga9rQaue/dCQ8l+PIIHwfesv8Wb5sTA3YGaHhTFcN645AAttlvN/GHluJNuufdnp2HrTicCnsXM+9HvgvQgdNB4xVOGuinAebMyEoX6pi99dUNZXl7knFr8VTfu4YKstH59pz7PDAq5FYLpPOxpgc1+nKa0DSCj7VMKZy4G+PDbok+LKtNJj1BCKwdZFb0QQUNLLQYJfvFLJfEcArYHLb+fNgtOT87IB4TnQD2O3vNkTx7Z3/oSMaLqfIP9Jwc1q8YlLvNp7u3L04ar0V09p9xnW3qoMzWTImPZimNOjabPE6Ieoyu6N7FJsPEwAsW3S4GeSZertcy1HJLhTkU=
on:
all_branches: true
skip_existing: true
if: tag IS present

- stage: docs
install:
- sudo apt-get install graphviz pandoc
@@ -52,6 +36,18 @@ jobs:
- git add -A
- git commit --allow-empty -m "Travis build number $TRAVIS_BUILD_NUMBER"
- git push --set-upstream origin master

- stage: pypi
install: echo "skipping install"
script: echo "skipping tests"
deploy:
provider: pypi
user: MaxHalford
password:
secure: IKlAIBIeaM6muU5ITEM5a6BwYwDzRI5Bn30ssnS9oRRTLEqsU70Ih+2eWX5UbascgBTdtMJLdE9vEsp4F6TLNgdjmZPyYXZLb6hTDu0lz5PeFiJ4EHVNyC7PqjDpPWiU9kE1lXMAfbDADIWnK/bNvgfn5xlbUfTPnnMd1bsazJtYSOumoYpjhvEPRkC/A/+iVHfBXFvt4UqXVZw/HgcZfBze8heLGh3klAyO6QuIIbfXvf9Oiym8BnfnaPKvJi+npmitL7wMtTP/Oq/zFg0RQdBPZhcqvnwJ5c1XQDWbLJArs56IqPEk/eN3HpvDbo+bZD1Lc3ACA71bCkKDTnkhZ+0uVga9rQaue/dCQ8l+PIIHwfesv8Wb5sTA3YGaHhTFcN645AAttlvN/GHluJNuufdnp2HrTicCnsXM+9HvgvQgdNB4xVOGuinAebMyEoX6pi99dUNZXl7knFr8VTfu4YKstH59pz7PDAq5FYLpPOxpgc1+nKa0DSCj7VMKZy4G+PDbok+LKtNJj1BCKwdZFb0QQUNLLQYJfvFLJfEcArYHLb+fNgtOT87IB4TnQD2O3vNkTx7Z3/oSMaLqfIP9Jwc1q8YlLvNp7u3L04ar0V09p9xnW3qoMzWTImPZimNOjabPE6Ieoyu6N7FJsPEwAsW3S4GeSZertcy1HJLhTkU=
on:
all_branches: true
skip_existing: true
if: tag IS present

env:
34 changes: 26 additions & 8 deletions CONTRIBUTING.md
@@ -4,18 +4,24 @@

### Installation

Before starting you want to make sure you have Python 3.6 or above installed.
Before starting, you want to make sure you have Python 3.6 or above installed. We recommend you create a [virtual environment](https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/) with `conda`, like so:

You first want to fork the `dev` branch of the repository, which you can do from GitHub. Once you've forked it, you can clone it to your workstation. Once this is done, navigate to the cloned directory and install the required dependencies:
```sh
conda create --name creme python=3.6 cython
```

You also will need GCC to compile Cython extensions:

```sh
python setup.py develop
pip install -e ".[dev]"
conda install -c anaconda gcc
```

:point_up: We recommend that you use [Anaconda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
Finally, you may fork the `dev` branch of the repository, which you can do from GitHub. Once you've forked it, you can clone it to your workstation. Once this is done, navigate to the cloned directory and install the required dependencies:

:point_up: We also recommend that you use a [virtual environment](https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/)
```sh
pip install -e ".[dev]"
python setup.py develop
```

### Making changes

@@ -32,6 +38,18 @@ If you've added a new functionality, then you will have to write a docstring and
- Use [Google style Python docstrings](https://www.sphinx-doc.org/en/master/usage/extensions/example_google.html#example-google)


## Building Cython extensions

```sh
python setup.py build_ext --inplace
```


## Testing

Simply run `pytest` to execute the tests. Additionally, you can test the notebooks by running `pytest --nbval-lax --current-env docs/notebooks/*.ipynb`.


## Making a pull request

Once you're happy with your changes, you can push them to your remote fork. By the way, do not hesitate to make small commits rather than one big one; it makes things easier to review. You can create a pull request to `creme`'s `master` branch.
@@ -44,6 +62,6 @@ Once you're happy with your changes, you can push them to your remote fork. By t
The documentation is built with [Sphinx](http://www.sphinx-doc.org/en/master/).

```sh
cd /path/to/creme/docs/
make html
pip install -e ".[docs]"
make doc
```
5 changes: 5 additions & 0 deletions Makefile
@@ -0,0 +1,5 @@
update_nb:
jupyter nbconvert --execute --to notebook --inplace docs/notebooks/*.ipynb --ExecutePreprocessor.timeout=-1

doc:
cd docs && $(MAKE) clean && rm -rf generated && python create_api_page.py && $(MAKE) html
14 changes: 8 additions & 6 deletions README.md
@@ -23,18 +23,18 @@

<br/>

`creme` is a library for online machine learning, also known as in**creme**ntal learning. Online learning is a machine learning regime where a **model learns one observation at a time**. This is in contrast to batch learning where all the data is processed in one go. Incremental learning is desirable when the data is too big to fit in memory, or simply when you want to **handle streaming data**. In addition to many online machine learning algorithms, `creme` provides utilities for **extracting features from a stream of data**. The API is heavily inspired from that of [scikit-learn](https://scikit-learn.org/stable/), meaning that users who are familiar with it should feel comfortable.
`creme` is a library for online machine learning, also known as in**creme**ntal learning. Online learning is a machine learning regime where a **model learns one observation at a time**. This is in contrast to batch learning where all the data is processed in one go. Incremental learning is desirable when the data is too big to fit in memory, or simply when you want to **handle streaming data**. In addition to many online machine learning algorithms, `creme` provides utilities for **extracting features from a stream of data**.
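The one-observation-at-a-time regime described above can be sketched in plain Python. The `RunningMean` class below is a hypothetical toy written for illustration — it is not part of `creme` — and only mirrors the `fit_one`/`predict_one` method naming:

```python
class RunningMean:
    """A toy online model: predicts the running mean of the target."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def fit_one(self, x, y):
        # Incremental update: no past observations are stored
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self, x):
        return self.mean


model = RunningMean()
# Each (features, target) pair is seen exactly once, then discarded
for x, y in [({}, 2.0), ({}, 4.0), ({}, 6.0)]:
    model.fit_one(x, y)

print(model.predict_one({}))  # 4.0
```

The point is that memory stays constant no matter how many observations stream past, which is exactly what makes online learning suitable for data that doesn't fit in RAM.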

## Useful links

- [Documentation](https://creme-ml.github.io/)
- [API reference](https://creme-ml.github.io/api.html)
- [User guide](https://creme-ml.github.io/user-guide.html)
- [FAQ](https://creme-ml.github.io/faq.html)
- [Benchmarks](benchmarks/)
- [Benchmarks](https://github.com/creme-ml/creme/tree/master/benchmarks)
- [Issue tracker](https://github.com/creme-ml/creme/issues)
- [Package releases](https://pypi.org/project/creme/#history)
- [Change history](CHANGELOG.md)
- [Change history](https://github.com/creme-ml/creme/blob/master/CHANGELOG.md)
- PyData Amsterdam 2019 presentation ([slides](https://maxhalford.github.io/slides/creme-pydata/), [video](https://www.youtube.com/watch?v=P3M6dt7bY9U&list=PLGVZCDnMOq0q7_6SdrC2wRtdkojGBTAht&index=11))
- [Toulouse Data Science presentation](https://maxhalford.github.io/slides/creme-tds/)

@@ -48,7 +48,9 @@

You can also install the latest development version like so:

pip install git+https://github.com/creme-ml/creme --upgrade
pip install git+https://github.com/creme-ml/creme
# Or through SSH:
pip install git+ssh://[email protected]/creme-ml/creme.git

As for dependencies, `creme` mostly relies on Python's standard library. Sometimes it relies on `numpy`, `scipy`, and `scikit-learn` to avoid reinventing the wheel.

@@ -175,14 +177,14 @@ We can also draw the pipeline.
```

<div align="center">
<img src="docs/_static/bikes_pipeline.svg" alt="bikes_pipeline"/>
<img src="https://github.com/creme-ml/creme/blob/master/docs/_static/bikes_pipeline.svg" alt="bikes_pipeline"/>
</div>

By only using a few lines of code, we've built a robust model and evaluated it by simulating a production scenario. You can find a more detailed version of this example [here](https://creme-ml.github.io/notebooks/bike-sharing-forecasting.html). `creme` is a framework that has a lot to offer, and as such we kindly refer you to the [documentation](https://creme-ml.github.io/) if you want to know more.

## Contributing

Like many subfields of machine learning, online learning is far from being an exact science and so there is still a lot to do. Feel free to contribute in any way you like, we're always open to new ideas and approaches. If you want to contribute to the code base please check out the [`CONTRIBUTING.md` file](CONTRIBUTING.md). Also take a look at the [issue tracker](https://github.com/creme-ml/creme/issues) and see if anything takes your fancy.
Like many subfields of machine learning, online learning is far from being an exact science and so there is still a lot to do. Feel free to contribute in any way you like, we're always open to new ideas and approaches. If you want to contribute to the code base please check out the [CONTRIBUTING.md file](https://github.com/creme-ml/creme/blob/master/CONTRIBUTING.md). Also take a look at the [issue tracker](https://github.com/creme-ml/creme/issues) and see if anything takes your fancy.

Last but not least you are more than welcome to share with us on how you're using `creme` or online learning in general! We believe that online learning solves a lot of pain points in practice, and would love to share experiences.

1 change: 0 additions & 1 deletion ci/test.sh
@@ -29,7 +29,6 @@ source activate testenv

# Install dependencies required for testing
pip install cython
python setup.py develop
pip install -e ".[dev]"
pip install codecov

2 changes: 1 addition & 1 deletion creme/__init__.py
@@ -11,6 +11,7 @@

__all__ = [
'anomaly',
'base',
'cluster',
'compat',
'compose',
@@ -29,7 +30,6 @@
'naive_bayes',
'neighbors',
'optim',
'plot',
'preprocessing',
'proba',
'reco',
1 change: 1 addition & 0 deletions creme/anomaly/__init__.py
@@ -1,3 +1,4 @@
"""Anomaly detection."""
from .hst import HalfSpaceTrees


4 changes: 2 additions & 2 deletions creme/anomaly/base.py
@@ -15,7 +15,7 @@ def score_one(self, x):
"""Returns an outlier score.
The range of the score depends on each model. Some models will output anomaly scores
between 0 and 1, others will not. In any case, the lower the score, the more likely ``x``
is an anomaly.
between 0 and 1, others will not. In any case, the lower the score, the more likely it is
that ``x`` is an anomaly.
"""
18 changes: 10 additions & 8 deletions creme/base.py
@@ -1,6 +1,4 @@
"""
Base classes used throughout the library.
"""
"""Base interfaces."""
import abc
import collections
import inspect
@@ -12,6 +10,7 @@
__all__ = [
'BinaryClassifier',
'Clusterer',
'Ensemble',
'Estimator',
'Wrapper',
'MultiClassifier',
@@ -46,6 +45,7 @@ def _update_if_consistent(dict1, dict2):


class Estimator:
"""An estimator."""

def __str__(self):
return self.__class__.__name__
@@ -90,7 +90,7 @@ def fit_one(self, x: dict, y: float) -> 'Regressor':

@abc.abstractmethod
def predict_one(self, x: dict) -> float:
"""Predicts the target value of a set of features ``x``
"""Predicts the target value of a set of features ``x``.
Parameters:
x (dict)
@@ -106,7 +106,7 @@ class Classifier(Estimator):

@abc.abstractmethod
def predict_proba_one(self, x: dict) -> Probas:
"""Predicts the probability output of a set of features ``x``
"""Predicts the probability output of a set of features ``x``.
Parameters:
x (dict)
@@ -117,7 +117,7 @@ def predict_proba_one(self, x: dict) -> Probas:
"""

def predict_one(self, x: dict) -> Label:
"""Predicts the target value of a set of features ``x``
"""Predicts the target value of a set of features ``x``.
Parameters:
x (dict)
@@ -189,7 +189,7 @@ def fit_one(self, x: dict, y=None) -> 'Transformer':

@abc.abstractmethod
def transform_one(self, x: dict) -> dict:
"""Transforms a set of features ``x``
"""Transforms a set of features ``x``.
Parameters:
x (dict)
@@ -258,7 +258,7 @@ def fit_one(self, x: dict, y=None) -> 'Clusterer':

@abc.abstractmethod
def predict_one(self, x: dict) -> int:
"""Predicts the cluster number of a set of features ``x``
"""Predicts the cluster number of a set of features ``x``.
Parameters:
x (dict)
@@ -342,6 +342,7 @@ def predict_one(self, x: dict) -> typing.Dict[str, float]:


class Wrapper(abc.ABC):
"""A wrapper model."""

@abc.abstractproperty
def _model(self):
@@ -356,4 +357,5 @@ def __str__(self):


class Ensemble(Estimator, collections.UserList):
"""An ensemble model."""
pass
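The `Ensemble` base above subclasses `collections.UserList`, so an ensemble behaves like a plain list of models. A hypothetical sketch of that pattern (the `MeanEnsemble` and `Constant` classes below are illustrations, not `creme` code):

```python
import collections


class MeanEnsemble(collections.UserList):
    """A list of models whose predictions are averaged."""

    def predict_one(self, x):
        # self.data is the underlying list managed by UserList
        return sum(model.predict_one(x) for model in self.data) / len(self.data)


class Constant:
    """A stub model that always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict_one(self, x):
        return self.value


ensemble = MeanEnsemble([Constant(2.0), Constant(4.0)])
print(ensemble.predict_one({}))  # 3.0
print(len(ensemble))             # 2 -- list behaviour comes for free
```

Inheriting from `UserList` means indexing, iteration, and `len` all work on the ensemble without any extra code.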
5 changes: 1 addition & 4 deletions creme/cluster/__init__.py
@@ -1,7 +1,4 @@
"""
A module for doing unsupervised clustering.
"""

"""Unsupervised clustering."""
from .k_means import KMeans


2 changes: 1 addition & 1 deletion creme/cluster/k_means.py
Expand Up @@ -79,7 +79,7 @@ class KMeans(base.Clusterer):
References:
1. `Sequential k-Means Clustering <http://www.cs.princeton.edu/courses/archive/fall08/cos436/Duda/C/sk_means.htm>`_
2, `Web-Scale K-Means Clustering <https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf>`
2. `Web-Scale K-Means Clustering <https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf>`_
"""

13 changes: 13 additions & 0 deletions creme/compat/__init__.py
@@ -1,3 +1,4 @@
"""Compatibility with other libraries."""
from .sklearn import convert_creme_to_sklearn
from .sklearn import convert_sklearn_to_creme
from .sklearn import CremeClassifierWrapper
@@ -6,3 +7,15 @@
from .sklearn import SKLClassifierWrapper
from .sklearn import SKLClustererWrapper
from .sklearn import SKLTransformerWrapper


__all__ = [
'convert_creme_to_sklearn',
'convert_sklearn_to_creme',
'CremeClassifierWrapper',
'CremeRegressorWrapper',
'SKLRegressorWrapper',
'SKLClassifierWrapper',
'SKLClustererWrapper',
'SKLTransformerWrapper'
]
4 changes: 1 addition & 3 deletions creme/compose/__init__.py
@@ -1,6 +1,4 @@
"""
Meta-estimators for building composite models.
"""
"""Models composition."""
from .blacklist import Blacklister
from .func import FuncTransformer
from .pipeline import Pipeline
1 change: 1 addition & 0 deletions creme/datasets.py
@@ -1,3 +1,4 @@
"""Toy datasets."""
import ast
import os
import shutil
4 changes: 1 addition & 3 deletions creme/decomposition/__init__.py
@@ -1,6 +1,4 @@
"""
Online matrix decomposition algorithms.
"""
"""Online matrix decomposition."""
from .lda import LDA


2 changes: 1 addition & 1 deletion creme/decomposition/lda.py
Original file line number Diff line number Diff line change
@@ -18,7 +18,7 @@ class LDA(base.Transformer, vectorize.VectorizerMixin):
Latent Dirichlet allocation (LDA) is a probabilistic approach for exploring topics in document
collections. The key advantage of this variant is that it assumes an infinite vocabulary,
meaning that the set of tokens does not have to be known in advance, as opposed to the
`implementation from sklearn <https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html>`.
`implementation from sklearn <https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html>`_.
The results produced by this implementation are identical to those from `the original
implementation <https://github.com/kzhai/PyInfVoc>`_ proposed by the method's authors.
4 changes: 1 addition & 3 deletions creme/dummy.py
@@ -1,6 +1,4 @@
"""
Dummy estimators.
"""
"""Dummy estimators."""
import collections

from . import base
4 changes: 1 addition & 3 deletions creme/ensemble/__init__.py
@@ -1,6 +1,4 @@
"""
A module for ensemble learning.
"""
"""Ensemble learning."""
from .bagging import BaggingClassifier
from .bagging import BaggingRegressor
from .hedging import HedgeRegressor
