Time series anomaly detection algorithm implementations for TimeEval (Docker-based)

TimeEval/TimeEval-algorithms

TimeEval Algorithms

Time Series Anomaly Detection Algorithms for TimeEval.

Description

This repository contains a collection of containerized (dockerized) time series anomaly detection methods that can easily be evaluated using TimeEval. The source code of some algorithms is access-restricted; for those, we provide only the TimeEval stubs and manifests. We are happy to share our TimeEval adaptations of the excluded algorithms upon request, provided the original authors approve.

Each folder contains the implementation of an algorithm that will be built into a runnable Docker container using CI. The namespace prefix (repository) for the built Docker images is registry.gitlab.hpi.de/akita/i/.

Overview

| Algorithm (folder) | Image | Language | Base image | Learning Type | Input Dimensionality |
|---|---|---|---|---|---|
| arima (restricted access) | registry.gitlab.hpi.de/akita/i/arima | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| autoencoder | registry.gitlab.hpi.de/akita/i/autoencoder | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| bagel | registry.gitlab.hpi.de/akita/i/bagel | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| baseline_increasing | registry.gitlab.hpi.de/akita/i/baseline_increasing | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| baseline_normal | registry.gitlab.hpi.de/akita/i/baseline_normal | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| baseline_random | registry.gitlab.hpi.de/akita/i/baseline_random | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| cblof | registry.gitlab.hpi.de/akita/i/cblof | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| cof | registry.gitlab.hpi.de/akita/i/cof | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| copod | registry.gitlab.hpi.de/akita/i/copod | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| dae (DeNoising Autoencoder) | registry.gitlab.hpi.de/akita/i/dae | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| damp | registry.gitlab.hpi.de/akita/i/damp | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| dbstream | registry.gitlab.hpi.de/akita/i/dbstream | R 4.0.5 | registry.gitlab.hpi.de/akita/i/r4-base | unsupervised | multivariate |
| deepant | registry.gitlab.hpi.de/akita/i/deepant | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| deepnap | registry.gitlab.hpi.de/akita/i/deepnap | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| donut | registry.gitlab.hpi.de/akita/i/donut | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| dspot | registry.gitlab.hpi.de/akita/i/dspot | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| dwt_mlead | registry.gitlab.hpi.de/akita/i/dwt_mlead | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| eif | registry.gitlab.hpi.de/akita/i/eif | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| encdec_ad | registry.gitlab.hpi.de/akita/i/encdec_ad | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| ensemble_gi | registry.gitlab.hpi.de/akita/i/ensemble_gi | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| fast_mcd | registry.gitlab.hpi.de/akita/i/fast_mcd | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| fft | registry.gitlab.hpi.de/akita/i/fft | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| generic_rf | registry.gitlab.hpi.de/akita/i/generic_rf | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| generic_xgb | registry.gitlab.hpi.de/akita/i/generic_xgb | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| grammarviz3 | registry.gitlab.hpi.de/akita/i/grammarviz3 | Java | registry.gitlab.hpi.de/akita/i/java-base | unsupervised | univariate |
| hbos | registry.gitlab.hpi.de/akita/i/hbos | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| health_esn | registry.gitlab.hpi.de/akita/i/health_esn | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| hif | registry.gitlab.hpi.de/akita/i/hif | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | supervised | multivariate |
| hotsax | registry.gitlab.hpi.de/akita/i/hotsax | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| hybrid_knn | registry.gitlab.hpi.de/akita/i/hybrid_knn | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| if_lof | registry.gitlab.hpi.de/akita/i/if_lof | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| iforest | registry.gitlab.hpi.de/akita/i/iforest | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| img_embedding_cae | registry.gitlab.hpi.de/akita/i/img_embedding_cae | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| kmeans | registry.gitlab.hpi.de/akita/i/kmeans | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| knn | registry.gitlab.hpi.de/akita/i/knn | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| laser_dbn | registry.gitlab.hpi.de/akita/i/laser_dbn | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| left_stampi | registry.gitlab.hpi.de/akita/i/left_stampi | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| lof | registry.gitlab.hpi.de/akita/i/lof | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| lstm_ad | registry.gitlab.hpi.de/akita/i/lstm_ad | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| lstm_vae | registry.gitlab.hpi.de/akita/i/lstm_vae | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| median_method | registry.gitlab.hpi.de/akita/i/median_method | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| mscred | registry.gitlab.hpi.de/akita/i/mscred | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| mstamp | registry.gitlab.hpi.de/akita/i/mstamp | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| mtad_gat | registry.gitlab.hpi.de/akita/i/mtad_gat | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| multi_hmm | registry.gitlab.hpi.de/akita/i/multi_hmm | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | supervised | multivariate |
| multi_subsequence_lof | registry.gitlab.hpi.de/akita/i/multi_subsequence_lof | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| mvalmod | registry.gitlab.hpi.de/akita/i/mvalmod | R 3.5.2 | registry.gitlab.hpi.de/akita/i/tsmp -> registry.gitlab.hpi.de/akita/i/r-base | unsupervised | multivariate |
| norma (restricted access) | registry.gitlab.hpi.de/akita/i/norma | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| normalizing_flows | registry.gitlab.hpi.de/akita/i/normalizing_flows | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | supervised | multivariate |
| novelty_svr | registry.gitlab.hpi.de/akita/i/novelty_svr | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| numenta_htm | registry.gitlab.hpi.de/akita/i/numenta_htm | Python 2.7 | registry.gitlab.hpi.de/akita/i/python2-base | unsupervised | univariate |
| ocean_wnn | registry.gitlab.hpi.de/akita/i/ocean_wnn | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| omnianomaly | registry.gitlab.hpi.de/akita/i/omnianomaly | Python 3.6 | registry.gitlab.hpi.de/akita/i/python36-base | semi-supervised | multivariate |
| pcc | registry.gitlab.hpi.de/akita/i/pcc | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| pci | registry.gitlab.hpi.de/akita/i/pci | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| phasespace_svm | registry.gitlab.hpi.de/akita/i/phasespace_svm | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| pst | registry.gitlab.hpi.de/akita/i/pst | R 3.5.2 | registry.gitlab.hpi.de/akita/i/r-base | | |
| random_black_forest | registry.gitlab.hpi.de/akita/i/random_black_forest | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| robust_pca | registry.gitlab.hpi.de/akita/i/robust_pca | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| sand (restricted access) | registry.gitlab.hpi.de/akita/i/sand | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| sarima | registry.gitlab.hpi.de/akita/i/sarima | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| series2graph (restricted access) | registry.gitlab.hpi.de/akita/i/series2graph | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| s_h_esd | registry.gitlab.hpi.de/akita/i/s_h_esd | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| sr | registry.gitlab.hpi.de/akita/i/sr | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| sr_cnn | registry.gitlab.hpi.de/akita/i/sr_cnn | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch | semi-supervised | univariate |
| ssa (restricted access) | registry.gitlab.hpi.de/akita/i/ssa | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| stamp | registry.gitlab.hpi.de/akita/i/stamp | R 3.5.2 | registry.gitlab.hpi.de/akita/i/tsmp -> registry.gitlab.hpi.de/akita/i/r-base | unsupervised | univariate |
| stomp | registry.gitlab.hpi.de/akita/i/stomp | R 3.5.2 | registry.gitlab.hpi.de/akita/i/tsmp -> registry.gitlab.hpi.de/akita/i/r-base | unsupervised | univariate |
| subsequence_fast_mcd | registry.gitlab.hpi.de/akita/i/subsequence_fast_mcd | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | univariate |
| subsequence_knn | registry.gitlab.hpi.de/akita/i/subsequence_knn | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| subsequence_if | registry.gitlab.hpi.de/akita/i/subsequence_if | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| subsequence_lof | registry.gitlab.hpi.de/akita/i/subsequence_lof | Python 3.7 | registry.gitlab.hpi.de/akita/i/pyod -> registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| tanogan | registry.gitlab.hpi.de/akita/i/tanogan | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch -> registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| tarzan | registry.gitlab.hpi.de/akita/i/tarzan | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-torch | semi-supervised | univariate |
| telemanom | registry.gitlab.hpi.de/akita/i/telemanom | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | semi-supervised | multivariate |
| torsk | registry.gitlab.hpi.de/akita/i/torsk | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | multivariate |
| triple_es | registry.gitlab.hpi.de/akita/i/triple_es | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| ts_bitmap | registry.gitlab.hpi.de/akita/i/ts_bitmap | Python 3.7 | registry.gitlab.hpi.de/akita/i/python3-base | unsupervised | univariate |
| valmod | registry.gitlab.hpi.de/akita/i/valmod | R 3.5.2 | registry.gitlab.hpi.de/akita/i/tsmp -> registry.gitlab.hpi.de/akita/i/r-base | unsupervised | univariate |

Usage

Prerequisites

You need the following tools installed on your development machine:

  • git
  • docker
  • access to this repository

Please make yourself familiar with the contents of this repository and read this document carefully!

Testing an algorithm and its TimeEval integration

Testing an algorithm locally can be done in two different ways:

  1. Test the algorithm's code directly (using the tools provided by the programming language)
  2. Test the algorithm within its docker container

The first option is specific to the programming language, so we won't cover it here.

Each algorithm in this repository is bundled in a self-contained Docker image, so it can be executed with a single command and no additional dependencies need to be installed. This allows you to test the algorithm without installing its dependencies on your machine. The only requirement is an (x86) Docker runtime. Follow the steps below to test your algorithm using Docker:

  1. Prepare base image: You'll need the required base Docker image to build your algorithm's image. If you are connected to the HPI network (via VPN or on site), you can pull the Docker images from our Docker registry registry.gitlab.hpi.de/akita/i/. Otherwise, you have to build the base images yourself as follows:

    • change to the 0-base-images folder: cd 0-base-images
    • build your desired base image, e.g. docker build -t registry.gitlab.hpi.de/akita/i/python3-base:0.2.5 ./python3-base
    • (optionally: build derived base image, e.g. docker build -t registry.gitlab.hpi.de/akita/i/pyod:0.2.5 ./pyod)
    • now you can build your algorithm image from the base image (see next item)
  2. Build algorithm image: Next, you'll need to build the algorithm's Docker image. It is based on the previously built base image and contains the algorithm-specific source code.

    • Change to the root directory of the timeeval-algorithms-repository.
    • build the algorithm image, e.g. docker build -t registry.gitlab.hpi.de/akita/i/lof ./lof
  3. Train your algorithm (optional): If your algorithm is supervised or semi-supervised, execute the following command to perform the training step:

    mkdir -p 2-results
    # Optional: add `-e LOCAL_UID=<current user id> -e LOCAL_GID=<current groupid>`
    # before the image name so that the result files are not owned by root.
    docker run --rm \
        -v "$(pwd)/1-data:/data:ro" \
        -v "$(pwd)/2-results:/results:rw" \
      registry.gitlab.hpi.de/akita/i/<your_algorithm>:latest execute-algorithm '{
        "executionType": "train",
        "dataInput": "/data/dataset.csv",
        "dataOutput": "/results/anomaly_scores.ts",
        "modelInput": "/results/model.pkl",
        "modelOutput": "/results/model.pkl",
        "customParameters": {}
      }'

    Be warned that the result and model files will be written to the 2-results directory as the root user if you do not pass the optional environment variables LOCAL_UID and LOCAL_GID to the container.

  4. Execute your algorithm: Run the following command to perform the execution step of your algorithm:

    mkdir -p 2-results
    # Optional: add `-e LOCAL_UID=<current user id> -e LOCAL_GID=<current groupid>`
    # before the image name so that the result files are not owned by root.
    docker run --rm \
        -v "$(pwd)/1-data:/data:ro" \
        -v "$(pwd)/2-results:/results:rw" \
      registry.gitlab.hpi.de/akita/i/<your_algorithm>:latest execute-algorithm '{
        "executionType": "execute",
        "dataInput": "/data/dataset.csv",
        "dataOutput": "/results/anomaly_scores.ts",
        "modelInput": "/results/model.pkl",
        "modelOutput": "/results/model.pkl",
        "customParameters": {}
      }'

    Be warned that the result and model files will be written to the 2-results directory as the root user if you do not pass the optional environment variables LOCAL_UID and LOCAL_GID to the container.
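The train and execute commands above differ only in the executionType value, so they can be generated from a script instead of being typed by hand. The following is a minimal Python sketch, not part of TimeEval: the helper name build_docker_command and its arguments are hypothetical, and the paths inside the container are copied from the example commands above. Serializing the configuration with json.dumps avoids quoting mistakes in the shell.

```python
import json
import os
import shlex
from pathlib import Path


def build_docker_command(image, execution_type, data_dir, results_dir,
                         custom_parameters=None, pass_user_ids=True):
    """Return the `docker run` argument list for one train or execute step.

    Hypothetical helper for illustration; the container paths mirror the
    example commands in this README.
    """
    config = {
        "executionType": execution_type,
        "dataInput": "/data/dataset.csv",
        "dataOutput": "/results/anomaly_scores.ts",
        "modelInput": "/results/model.pkl",
        "modelOutput": "/results/model.pkl",
        "customParameters": custom_parameters or {},
    }
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{Path(data_dir).resolve()}:/data:ro",
        "-v", f"{Path(results_dir).resolve()}:/results:rw",
    ]
    # Pass the current user and group id so results are not written as root.
    # os.getuid/os.getgid only exist on Unix-like systems.
    if pass_user_ids and hasattr(os, "getuid"):
        cmd += ["-e", f"LOCAL_UID={os.getuid()}", "-e", f"LOCAL_GID={os.getgid()}"]
    # The configuration is passed as a single JSON string argument.
    cmd += [image, "execute-algorithm", json.dumps(config)]
    return cmd


command = build_docker_command(
    "registry.gitlab.hpi.de/akita/i/lof:latest", "execute", "1-data", "2-results"
)
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(command))
```

To actually start the container, the returned list can be handed to subprocess.run(command, check=True). Note that shlex.join requires Python 3.8 or newer.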

Additional information

TimeEval base Docker images

To benefit from Docker layer caching and to reduce code duplication (DRY!), we decided to put common functionality in so-called base images. The following is taken care of by base images:

  • Provide system (OS and common OS tools)
  • Provide language runtime (e.g. python3, java8)
  • Provide common libraries / algorithm dependencies
  • Define volumes for IO
  • Define Docker entrypoint script (performs initial container setup before the algorithm is executed)

See the folder 0-base-images for all available base images.

TimeEval algorithm interface

TimeEval uses a common interface to execute all the algorithms in this repository. This means that the input, output, and parametrization are the same for all TimeEval algorithms.

Execution and parametrization

All algorithms are executed by creating a Docker container from their Docker image and running it. The base images take care of the container startup and call the main algorithm file with a single positional parameter. This parameter contains a string representation of the algorithm configuration JSON. Example parameter JSON (2021-02-03):

{
  "executionType": 'train' | 'execute',
  "dataInput": string,   # example: "path/to/dataset.csv",
  "dataOutput": string,  # example: "path/to/results.csv",
  "modelInput": string,  # example: "/path/to/model.pkl",
  "modelOutput": string, # example: "/path/to/model.pkl",
  "customParameters": dict
}
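On the algorithm side, this single positional parameter has to be decoded before anything else happens. The following Python sketch shows one way to do that; the class name AlgorithmArgs and the function parse_args are illustrative names, not part of TimeEval, and the sketch assumes that all five path and type fields are always present, as in the example above.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AlgorithmArgs:
    """Typed view of the configuration JSON (field names as in the schema)."""
    executionType: str
    dataInput: str
    dataOutput: str
    modelInput: str
    modelOutput: str
    customParameters: Dict[str, Any] = field(default_factory=dict)


def parse_args(raw: str) -> AlgorithmArgs:
    """Parse the JSON string passed as the single positional parameter."""
    config = json.loads(raw)
    # Unknown or missing keys raise a TypeError here (assumption: strict parsing).
    args = AlgorithmArgs(**config)
    if args.executionType not in ("train", "execute"):
        raise ValueError(f"Unknown executionType: {args.executionType!r}")
    return args


example = parse_args(json.dumps({
    "executionType": "execute",
    "dataInput": "/data/dataset.csv",
    "dataOutput": "/results/anomaly_scores.ts",
    "modelInput": "/results/model.pkl",
    "modelOutput": "/results/model.pkl",
    "customParameters": {},
}))
print(example.executionType, example.dataInput)
```

Rejecting unknown execution types early makes configuration mistakes visible before any data is loaded.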

Custom algorithm parameters

All algorithm hyperparameters described in the corresponding paper are exposed via the customParameters configuration option. This allows us to set those parameters from TimeEval.

Attention!

TimeEval does not parse a manifest.json file to get the custom parameters' types and default values. We expect the users of TimeEval to be familiar with the algorithms, so that they can specify the required parameters manually. However, we require each algorithm to be executable without specifying any custom parameters (especially for testing purposes). Therefore, please provide sensible default parameters for all custom parameters within the method's code.

Please add a manifest.json file to your algorithm anyway to aid the integration into UltraMine and for user information.

If your algorithm does not fall back to default parameters automatically and instead expects them to be provided, it will fail at runtime whenever the TimeEval user provides no parameters.
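One simple way to meet this requirement is to keep the defaults in the algorithm's code and overlay whatever arrives in customParameters. The Python sketch below illustrates the idea; the parameter names and values in DEFAULT_PARAMETERS are made up for the example and do not belong to any specific algorithm in this repository.

```python
# Hypothetical defaults for illustration; each algorithm defines its own.
DEFAULT_PARAMETERS = {
    "window_size": 100,
    "n_neighbors": 20,
    "random_state": 42,
}


def resolve_parameters(custom_parameters, defaults=DEFAULT_PARAMETERS):
    """Overlay user-supplied custom parameters on top of the defaults.

    Returns a new dict; neither input is modified.
    """
    return {**defaults, **custom_parameters}


# No custom parameters: the algorithm still has a complete configuration.
print(resolve_parameters({}))
# A single override leaves the remaining defaults untouched.
print(resolve_parameters({"n_neighbors": 5}))
```

With this pattern the algorithm runs with `"customParameters": {}`, which is exactly the configuration used in the example commands.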

Input and output

Input and output for an algorithm are handled by bind-mounting files and folders into the Docker container.

All input data, such as the training dataset and the test dataset, are mounted read-only to the /data-folder of the container. The configuration options dataInput and modelInput reflect this with the correct path to the dataset (e.g. { "dataInput": "/data/dataset.test.csv" }).

All output of your algorithm should be written to the /results-folder. This is also reflected in the configuration options with the correct paths for dataOutput and modelOutput (e.g. { "dataOutput": "/results/anomaly_scores.csv" }). The /results-folder is also bind-mounted into the algorithm container (but writable), so that TimeEval can access the results after your algorithm has finished. An algorithm can also use this folder to write persistent log and debug information.

Temporary files and data of an algorithm are written to the current working directory (currently this is /app) or the temporary directory /tmp within the Docker container.
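When testing an algorithm by hand, it is easy to mix up these folders. The following Python sketch checks a configuration against the conventions described above: dataInput below /data, and dataOutput and modelOutput below /results. The function names are hypothetical, the check is purely lexical (it does not touch the file system or resolve `..`), and modelInput is deliberately not checked because the example commands in this document place it in different folders.

```python
from pathlib import PurePosixPath

DATA_DIR = PurePosixPath("/data")
RESULTS_DIR = PurePosixPath("/results")


def is_inside(path, folder):
    """Return True if `path` lies below `folder` (lexical check only)."""
    return folder in PurePosixPath(path).parents


def check_io_paths(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not is_inside(config["dataInput"], DATA_DIR):
        problems.append("dataInput should be below /data")
    for key in ("dataOutput", "modelOutput"):
        if not is_inside(config[key], RESULTS_DIR):
            problems.append(f"{key} should be below /results")
    return problems


config = {
    "dataInput": "/data/dataset.test.csv",
    "dataOutput": "/results/anomaly_scores.csv",
    "modelOutput": "/results/model.pkl",
}
print(check_io_paths(config))
```

Returning a list of problems instead of raising on the first one lets a test script report every misconfigured path at once.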

Example calls

The following Docker command shows how TimeEval executes your algorithm image:

# Optional flags (insert before the image name):
#   -e LOCAL_UID=<current user id>
#   -e LOCAL_GID=<groupid of akita group>
#   <resource restrictions>
docker run --rm \
    -v "$(pwd)/1-data:/data:ro" \
    -v "$(pwd)/2-results:/results:rw" \
  registry.gitlab.hpi.de/akita/i/<your_algorithm>:latest execute-algorithm '{
    "executionType": "train",
    "dataInput": "/data/dataset.csv",
    "modelInput": "/data/model.pkl",
    "dataOutput": "/results/anomaly_scores.ts",
    "modelOutput": "/results/model.pkl",
    "customParameters": {}
  }'

This is translated to the following call within the container:

mkdir -p 2-results
# -it attaches an interactive terminal to the shell inside the container
docker run --rm -it \
    -v "$(pwd)/1-data:/data:ro" \
    -v "$(pwd)/2-results:/results:rw" \
  registry.gitlab.hpi.de/akita/i/<your_algorithm>:latest bash
# now, within the container
# (replace <python | java -jar | Rscript> with the runtime of your algorithm)
<python | java -jar | Rscript> $ALGORITHM_MAIN '{
  "executionType": "train",
  "dataInput": "/data/dataset.csv",
  "modelInput": "/data/model.pkl",
  "dataOutput": "/results/anomaly_scores.ts",
  "modelOutput": "/results/model.pkl",
  "customParameters": {}
}'
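The file referenced by $ALGORITHM_MAIN therefore receives the configuration JSON as its only command-line argument and has to branch on executionType. The following Python skeleton is a sketch of that dispatch under the interface described in this document; the train and execute functions are placeholders that only describe what a real algorithm would do, and none of the names are prescribed by TimeEval.

```python
import json
import sys


def train(config):
    """Placeholder: a real algorithm would fit a model and save it here."""
    return f"would train on {config['dataInput']} and write {config['modelOutput']}"


def execute(config):
    """Placeholder: a real algorithm would load data, score it, and save scores."""
    return f"would score {config['dataInput']} and write {config['dataOutput']}"


# Maps each supported executionType to the function that handles it.
HANDLERS = {"train": train, "execute": execute}


def main(argv):
    """Decode the single positional JSON parameter and dispatch on its type."""
    if len(argv) != 2:
        raise SystemExit("usage: <algorithm main file> '<configuration JSON>'")
    config = json.loads(argv[1])
    execution_type = config["executionType"]
    if execution_type not in HANDLERS:
        raise ValueError(f"Unknown executionType: {execution_type!r}")
    return HANDLERS[execution_type](config)


# Only dispatch when a configuration argument was actually passed.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(main(sys.argv))
```

Keeping the dispatch in a small main function makes it possible to test the algorithm's integration without Docker by calling main with a hand-written argument list.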