Commit

Update docs
Former-commit-id: dcadd83
MaxHalford committed Oct 5, 2019
1 parent 7838d3d commit 375d942
Showing 93 changed files with 1,449 additions and 1,352 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Documentation
docs/_build/
docs/generated/
docs/notebooks/.ipynb_checkpoints/

# Byte-compiled / optimized / DLL files
__pycache__/
42 changes: 19 additions & 23 deletions .travis.yml
@@ -1,37 +1,21 @@
sudo: false

language: python

python:
- 3.6
- 3.7

cache:
apt: true
directories:
- $HOME/.cache/pip
- $HOME/downloads

install: skip
script: ./ci/test.sh ${TRAVIS_PYTHON_VERSION}

jobs:
include:

- stage: test
install: skip
script: ./ci/test.sh 3.6

- stage: test
install: skip
script: ./ci/test.sh 3.7.3

- stage: pypi
install: echo "skipping install"
script: echo "skipping tests"
deploy:
provider: pypi
user: MaxHalford
password:
secure: IKlAIBIeaM6muU5ITEM5a6BwYwDzRI5Bn30ssnS9oRRTLEqsU70Ih+2eWX5UbascgBTdtMJLdE9vEsp4F6TLNgdjmZPyYXZLb6hTDu0lz5PeFiJ4EHVNyC7PqjDpPWiU9kE1lXMAfbDADIWnK/bNvgfn5xlbUfTPnnMd1bsazJtYSOumoYpjhvEPRkC/A/+iVHfBXFvt4UqXVZw/HgcZfBze8heLGh3klAyO6QuIIbfXvf9Oiym8BnfnaPKvJi+npmitL7wMtTP/Oq/zFg0RQdBPZhcqvnwJ5c1XQDWbLJArs56IqPEk/eN3HpvDbo+bZD1Lc3ACA71bCkKDTnkhZ+0uVga9rQaue/dCQ8l+PIIHwfesv8Wb5sTA3YGaHhTFcN645AAttlvN/GHluJNuufdnp2HrTicCnsXM+9HvgvQgdNB4xVOGuinAebMyEoX6pi99dUNZXl7knFr8VTfu4YKstH59pz7PDAq5FYLpPOxpgc1+nKa0DSCj7VMKZy4G+PDbok+LKtNJj1BCKwdZFb0QQUNLLQYJfvFLJfEcArYHLb+fNgtOT87IB4TnQD2O3vNkTx7Z3/oSMaLqfIP9Jwc1q8YlLvNp7u3L04ar0V09p9xnW3qoMzWTImPZimNOjabPE6Ieoyu6N7FJsPEwAsW3S4GeSZertcy1HJLhTkU=
on:
all_branches: true
skip_existing: true
if: tag IS present

- stage: docs
install:
- sudo apt-get install graphviz pandoc
@@ -52,6 +36,18 @@ jobs:
- git add -A
- git commit --allow-empty -m "Travis build number $TRAVIS_BUILD_NUMBER"
- git push --set-upstream origin master

- stage: pypi
install: echo "skipping install"
script: echo "skipping tests"
deploy:
provider: pypi
user: MaxHalford
password:
secure: IKlAIBIeaM6muU5ITEM5a6BwYwDzRI5Bn30ssnS9oRRTLEqsU70Ih+2eWX5UbascgBTdtMJLdE9vEsp4F6TLNgdjmZPyYXZLb6hTDu0lz5PeFiJ4EHVNyC7PqjDpPWiU9kE1lXMAfbDADIWnK/bNvgfn5xlbUfTPnnMd1bsazJtYSOumoYpjhvEPRkC/A/+iVHfBXFvt4UqXVZw/HgcZfBze8heLGh3klAyO6QuIIbfXvf9Oiym8BnfnaPKvJi+npmitL7wMtTP/Oq/zFg0RQdBPZhcqvnwJ5c1XQDWbLJArs56IqPEk/eN3HpvDbo+bZD1Lc3ACA71bCkKDTnkhZ+0uVga9rQaue/dCQ8l+PIIHwfesv8Wb5sTA3YGaHhTFcN645AAttlvN/GHluJNuufdnp2HrTicCnsXM+9HvgvQgdNB4xVOGuinAebMyEoX6pi99dUNZXl7knFr8VTfu4YKstH59pz7PDAq5FYLpPOxpgc1+nKa0DSCj7VMKZy4G+PDbok+LKtNJj1BCKwdZFb0QQUNLLQYJfvFLJfEcArYHLb+fNgtOT87IB4TnQD2O3vNkTx7Z3/oSMaLqfIP9Jwc1q8YlLvNp7u3L04ar0V09p9xnW3qoMzWTImPZimNOjabPE6Ieoyu6N7FJsPEwAsW3S4GeSZertcy1HJLhTkU=
on:
all_branches: true
skip_existing: true
if: tag IS present

env:
34 changes: 26 additions & 8 deletions CONTRIBUTING.md
@@ -4,18 +4,24 @@

### Installation

Before starting you want to make sure you have Python 3.6 or above installed.
Before starting, you want to make sure you have Python 3.6 or above installed. We recommend you create a [virtual environment](https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/) with `conda`, like so:

You first want to fork the `dev` branch of the repository, which you can do from GitHub. Once you've forked it, you can clone it to your workstation. Once this is done, navigate to the cloned directory and install the required dependencies:
```sh
conda create --name creme python=3.6 cython
```

You also will need GCC to compile Cython extensions:

```sh
python setup.py develop
pip install -e ".[dev]"
conda install -c anaconda gcc
```

:point_up: We recommend that you use [Anaconda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
Finally, you may fork the `dev` branch of the repository, which you can do from GitHub. Once you've forked it, you can clone it to your workstation. Once this is done, navigate to the cloned directory and install the required dependencies:

:point_up: We also recommend that you use a [virtual environment](https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/)
```sh
pip install -e ".[dev]"
python setup.py develop
```

### Making changes

@@ -32,6 +38,18 @@ If you've added a new functionality, then you will have to write a docstring and
- Use [Google style Python docstrings](https://www.sphinx-doc.org/en/master/usage/extensions/example_google.html#example-google)


## Building Cython extensions

```sh
python setup.py build_ext --inplace
```


## Testing

Simply run `pytest` to execute the tests. Additionally, you can test the notebooks by running `pytest --nbval-lax --current-env docs/notebooks/*.ipynb`.


## Making a pull request

Once you're happy with your changes, you can push them to your remote fork. By the way, do not hesitate to make small commits rather than one big one; it makes things easier to review. You can create a pull request to `creme`'s `master` branch.
@@ -44,6 +62,6 @@ Once you're happy with your changes, you can push them to your remote fork. By t
The documentation is built with [Sphinx](http://www.sphinx-doc.org/en/master/).

```sh
cd /path/to/creme/docs/
make html
pip install -e ".[docs]"
make doc
```
5 changes: 5 additions & 0 deletions Makefile
@@ -0,0 +1,5 @@
update_nb:
jupyter nbconvert --execute --to notebook --inplace docs/notebooks/*.ipynb --ExecutePreprocessor.timeout=-1

doc:
cd docs && $(MAKE) clean && rm -rf generated && python create_api_page.py && $(MAKE) html
14 changes: 8 additions & 6 deletions README.md
@@ -23,18 +23,18 @@

<br/>

`creme` is a library for online machine learning, also known as in**creme**ntal learning. Online learning is a machine learning regime where a **model learns one observation at a time**. This is in contrast to batch learning where all the data is processed in one go. Incremental learning is desirable when the data is too big to fit in memory, or simply when you want to **handle streaming data**. In addition to many online machine learning algorithms, `creme` provides utilities for **extracting features from a stream of data**. The API is heavily inspired from that of [scikit-learn](https://scikit-learn.org/stable/), meaning that users who are familiar with it should feel comfortable.
`creme` is a library for online machine learning, also known as in**creme**ntal learning. Online learning is a machine learning regime where a **model learns one observation at a time**. This is in contrast to batch learning where all the data is processed in one go. Incremental learning is desirable when the data is too big to fit in memory, or simply when you want to **handle streaming data**. In addition to many online machine learning algorithms, `creme` provides utilities for **extracting features from a stream of data**.
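The one-observation-at-a-time regime described above can be sketched in plain Python. The `RunningMean` class below is a hypothetical toy written for illustration — it is not part of `creme` — and only mirrors the `fit_one`/`predict_one` method naming:

```python
class RunningMean:
    """A toy online model: predicts the running mean of the target."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def fit_one(self, x, y):
        # Incremental update: no past observations are stored
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self, x):
        return self.mean


model = RunningMean()
# Each (features, target) pair is seen exactly once, then discarded
for x, y in [({}, 2.0), ({}, 4.0), ({}, 6.0)]:
    model.fit_one(x, y)

print(model.predict_one({}))  # 4.0
```

The point is that memory stays constant no matter how many observations stream past, which is exactly what makes online learning suitable for data that doesn't fit in RAM.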

## Useful links

- [Documentation](https://creme-ml.github.io/)
- [API reference](https://creme-ml.github.io/api.html)
- [User guide](https://creme-ml.github.io/user-guide.html)
- [FAQ](https://creme-ml.github.io/faq.html)
- [Benchmarks](benchmarks/)
- [Benchmarks](https://github.com/creme-ml/creme/tree/master/benchmarks)
- [Issue tracker](https://github.com/creme-ml/creme/issues)
- [Package releases](https://pypi.org/project/creme/#history)
- [Change history](CHANGELOG.md)
- [Change history](https://github.com/creme-ml/creme/blob/master/CHANGELOG.md)
- PyData Amsterdam 2019 presentation ([slides](https://maxhalford.github.io/slides/creme-pydata/), [video](https://www.youtube.com/watch?v=P3M6dt7bY9U&list=PLGVZCDnMOq0q7_6SdrC2wRtdkojGBTAht&index=11))
- [Toulouse Data Science presentation](https://maxhalford.github.io/slides/creme-tds/)

@@ -48,7 +48,9 @@

You can also install the latest development version like so:

pip install git+https://github.com/creme-ml/creme --upgrade
pip install git+https://github.com/creme-ml/creme
# Or through SSH:
pip install git+ssh://[email protected]/creme-ml/creme.git

As for dependencies, `creme` mostly relies on Python's standard library. Sometimes it relies on `numpy`, `scipy`, and `scikit-learn` to avoid reinventing the wheel.

@@ -175,14 +177,14 @@ We can also draw the pipeline.
```

<div align="center">
<img src="docs/_static/bikes_pipeline.svg" alt="bikes_pipeline"/>
<img src="https://github.com/creme-ml/creme/blob/master/docs/_static/bikes_pipeline.svg" alt="bikes_pipeline"/>
</div>

By only using a few lines of code, we've built a robust model and evaluated it by simulating a production scenario. You can find a more detailed version of this example [here](https://creme-ml.github.io/notebooks/bike-sharing-forecasting.html). `creme` is a framework that has a lot to offer, and as such we kindly refer you to the [documentation](https://creme-ml.github.io/) if you want to know more.

## Contributing

Like many subfields of machine learning, online learning is far from being an exact science and so there is still a lot to do. Feel free to contribute in any way you like, we're always open to new ideas and approaches. If you want to contribute to the code base please check out the [`CONTRIBUTING.md` file](CONTRIBUTING.md). Also take a look at the [issue tracker](https://github.com/creme-ml/creme/issues) and see if anything takes your fancy.
Like many subfields of machine learning, online learning is far from being an exact science and so there is still a lot to do. Feel free to contribute in any way you like, we're always open to new ideas and approaches. If you want to contribute to the code base please check out the [CONTRIBUTING.md file](https://github.com/creme-ml/creme/blob/master/CONTRIBUTING.md). Also take a look at the [issue tracker](https://github.com/creme-ml/creme/issues) and see if anything takes your fancy.

Last but not least you are more than welcome to share with us on how you're using `creme` or online learning in general! We believe that online learning solves a lot of pain points in practice, and would love to share experiences.

1 change: 0 additions & 1 deletion ci/test.sh
@@ -29,7 +29,6 @@ source activate testenv

# Install dependencies required for testing
pip install cython
python setup.py develop
pip install -e ".[dev]"
pip install codecov

2 changes: 1 addition & 1 deletion creme/__init__.py
@@ -11,6 +11,7 @@

__all__ = [
'anomaly',
'base',
'cluster',
'compat',
'compose',
@@ -29,7 +30,6 @@
'naive_bayes',
'neighbors',
'optim',
'plot',
'preprocessing',
'proba',
'reco',
1 change: 1 addition & 0 deletions creme/anomaly/__init__.py
@@ -1,3 +1,4 @@
"""Anomaly detection."""
from .hst import HalfSpaceTrees


4 changes: 2 additions & 2 deletions creme/anomaly/base.py
@@ -15,7 +15,7 @@ def score_one(self, x):
"""Returns an outlier score.
The range of the score depends on each model. Some models will output anomaly scores
between 0 and 1, others will not. In any case, the lower the score, the more likely ``x``
is an anomaly.
between 0 and 1, others will not. In any case, the lower the score, the more likely it is
that ``x`` is an anomaly.
"""
18 changes: 10 additions & 8 deletions creme/base.py
@@ -1,6 +1,4 @@
"""
Base classes used throughout the library.
"""
"""Base interfaces."""
import abc
import collections
import inspect
@@ -12,6 +10,7 @@
__all__ = [
'BinaryClassifier',
'Clusterer',
'Ensemble',
'Estimator',
'Wrapper',
'MultiClassifier',
@@ -46,6 +45,7 @@ def _update_if_consistent(dict1, dict2):


class Estimator:
"""An estimator."""

def __str__(self):
return self.__class__.__name__
@@ -90,7 +90,7 @@ def fit_one(self, x: dict, y: float) -> 'Regressor':

@abc.abstractmethod
def predict_one(self, x: dict) -> float:
"""Predicts the target value of a set of features ``x``
"""Predicts the target value of a set of features ``x``.
Parameters:
x (dict)
@@ -106,7 +106,7 @@ class Classifier(Estimator):

@abc.abstractmethod
def predict_proba_one(self, x: dict) -> Probas:
"""Predicts the probability output of a set of features ``x``
"""Predicts the probability output of a set of features ``x``.
Parameters:
x (dict)
@@ -117,7 +117,7 @@ def predict_proba_one(self, x: dict) -> Probas:
"""

def predict_one(self, x: dict) -> Label:
"""Predicts the target value of a set of features ``x``
"""Predicts the target value of a set of features ``x``.
Parameters:
x (dict)
@@ -189,7 +189,7 @@ def fit_one(self, x: dict, y=None) -> 'Transformer':

@abc.abstractmethod
def transform_one(self, x: dict) -> dict:
"""Transforms a set of features ``x``
"""Transforms a set of features ``x``.
Parameters:
x (dict)
@@ -258,7 +258,7 @@ def fit_one(self, x: dict, y=None) -> 'Clusterer':

@abc.abstractmethod
def predict_one(self, x: dict) -> int:
"""Predicts the cluster number of a set of features ``x``
"""Predicts the cluster number of a set of features ``x``.
Parameters:
x (dict)
@@ -342,6 +342,7 @@ def predict_one(self, x: dict) -> typing.Dict[str, float]:


class Wrapper(abc.ABC):
"""A wrapper model."""

@abc.abstractproperty
def _model(self):
@@ -356,4 +357,5 @@ def __str__(self):


class Ensemble(Estimator, collections.UserList):
"""An ensemble model."""
pass
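The `Ensemble` base above subclasses `collections.UserList`, so an ensemble behaves like a plain list of models. A hypothetical sketch of that pattern (the `MeanEnsemble` and `Constant` classes below are illustrations, not `creme` code):

```python
import collections


class MeanEnsemble(collections.UserList):
    """A list of models whose predictions are averaged."""

    def predict_one(self, x):
        # self.data is the underlying list managed by UserList
        return sum(model.predict_one(x) for model in self.data) / len(self.data)


class Constant:
    """A stub model that always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict_one(self, x):
        return self.value


ensemble = MeanEnsemble([Constant(2.0), Constant(4.0)])
print(ensemble.predict_one({}))  # 3.0
print(len(ensemble))             # 2 -- list behaviour comes for free
```

Inheriting from `UserList` means indexing, iteration, and `len` all work on the ensemble without any extra code.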
5 changes: 1 addition & 4 deletions creme/cluster/__init__.py
@@ -1,7 +1,4 @@
"""
A module for doing unsupervised clustering.
"""

"""Unsupervised clustering."""
from .k_means import KMeans


2 changes: 1 addition & 1 deletion creme/cluster/k_means.py
Expand Up @@ -79,7 +79,7 @@ class KMeans(base.Clusterer):
References:
1. `Sequential k-Means Clustering <http://www.cs.princeton.edu/courses/archive/fall08/cos436/Duda/C/sk_means.htm>`_
2, `Web-Scale K-Means Clustering <https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf>`
2. `Web-Scale K-Means Clustering <https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf>`_
"""

13 changes: 13 additions & 0 deletions creme/compat/__init__.py
@@ -1,3 +1,4 @@
"""Compatibility with other libraries."""
from .sklearn import convert_creme_to_sklearn
from .sklearn import convert_sklearn_to_creme
from .sklearn import CremeClassifierWrapper
@@ -6,3 +7,15 @@
from .sklearn import SKLClassifierWrapper
from .sklearn import SKLClustererWrapper
from .sklearn import SKLTransformerWrapper


__all__ = [
'convert_creme_to_sklearn',
'convert_sklearn_to_creme',
'CremeClassifierWrapper',
'CremeRegressorWrapper',
'SKLRegressorWrapper',
'SKLClassifierWrapper',
'SKLClustererWrapper',
'SKLTransformerWrapper'
]
4 changes: 1 addition & 3 deletions creme/compose/__init__.py
@@ -1,6 +1,4 @@
"""
Meta-estimators for building composite models.
"""
"""Models composition."""
from .blacklist import Blacklister
from .func import FuncTransformer
from .pipeline import Pipeline
1 change: 1 addition & 0 deletions creme/datasets.py
@@ -1,3 +1,4 @@
"""Toy datasets."""
import ast
import os
import shutil
4 changes: 1 addition & 3 deletions creme/decomposition/__init__.py
@@ -1,6 +1,4 @@
"""
Online matrix decomposition algorithms.
"""
"""Online matrix decomposition."""
from .lda import LDA


2 changes: 1 addition & 1 deletion creme/decomposition/lda.py
Original file line number Diff line number Diff line change
@@ -18,7 +18,7 @@ class LDA(base.Transformer, vectorize.VectorizerMixin):
Latent Dirichlet allocation (LDA) is a probabilistic approach for exploring topics in document
collections. The key advantage of this variant is that it assumes an infinite vocabulary,
meaning that the set of tokens does not have to be known in advance, as opposed to the
`implementation from sklearn <https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html>`.
`implementation from sklearn <https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html>`_.
The results produced by this implementation are identical to those from `the original
implementation <https://github.com/kzhai/PyInfVoc>`_ proposed by the method's authors.
4 changes: 1 addition & 3 deletions creme/dummy.py
@@ -1,6 +1,4 @@
"""
Dummy estimators.
"""
"""Dummy estimators."""
import collections

from . import base
4 changes: 1 addition & 3 deletions creme/ensemble/__init__.py
@@ -1,6 +1,4 @@
"""
A module for ensemble learning.
"""
"""Ensemble learning."""
from .bagging import BaggingClassifier
from .bagging import BaggingRegressor
from .hedging import HedgeRegressor
