forked from pnnl/DDKS

A high-dimensional Kolmogorov-Smirnov distance for comparing high dimensional distributions


patrick-pagni/DDKS

 
 


ddKS - a d-dimensional Kolmogorov-Smirnov Test

This is a fork of the pnnl/ddks repo. The fork is intended to add a class that compares an empirical CDF with an analytic CDF. That class is in the adKS file and is largely an extension of the DDKS file.

Thanks to Hagen et al. for developing the original ddKS repository, which inspired this change.
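To illustrate what comparing an empirical CDF with an analytic CDF means, here is a minimal one-dimensional sketch in plain Python. It does not use or describe the adKS class; the names `ks_against_cdf` and `uniform_cdf` are hypothetical and exist only for this example.

```python
def ks_against_cdf(samples, cdf):
    """One-sample KS distance between the empirical CDF of `samples` and `cdf`."""
    xs = sorted(samples)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so the gap to the analytic CDF is checked on both sides of the jump.
        distance = max(distance, i / n - f, f - (i - 1) / n)
    return distance


def uniform_cdf(x):
    """Analytic CDF of the uniform distribution on [0, 1] (illustrative choice)."""
    return min(1.0, max(0.0, x))


print(ks_against_cdf([0.25, 0.5, 0.75], uniform_cdf))
```

The d-dimensional case replaces the two sides of each jump with the orthants around each point, which is where the DDKS machinery comes in.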

ReadMe

Alex Hagen¹, Shane Jackson¹, James Kahn², Jan Strube¹, Isabel Haide², Karl Pazdernik¹, and Connor Hainje¹

¹ Pacific Northwest National Laboratory, ² Karlsruhe Institute of Technology

This code accompanies our paper submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence titled "Accelerated Computation of a High Dimensional Kolmogorov-Smirnov Distance" (arXiv).

As of 6/25/2021 there are 3 methods implemented:

  • ddKS - d-dimensional KS test calculated per
    • Variable splitting of space (all points, subsample, grid spacing)
  • rdKS - ddKS approximation using distance from (d+1) corners
  • vdKS - ddKS approximation calculating the ddKS distance between voxels instead of points
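As a rough picture of the "all points" splitting, the sketch below computes a naive d-dimensional KS distance in plain Python: every point from either sample is used as an origin, space is split into the 2^d orthants around it, and the largest difference in the fraction of each sample per orthant is kept. This is an illustration under my own assumptions (including treating ties with `>=`), not the package's implementation; `orthant_fractions` and `naive_ddks` are hypothetical names.

```python
from itertools import product


def orthant_fractions(points, origin):
    """Fraction of `points` falling in each of the 2**d orthants around `origin`."""
    d = len(origin)
    counts = {signs: 0 for signs in product((False, True), repeat=d)}
    for x in points:
        # An orthant is identified by which coordinates are >= the origin's.
        signs = tuple(xi >= oi for xi, oi in zip(x, origin))
        counts[signs] += 1
    n = len(points)
    return {signs: c / n for signs, c in counts.items()}


def naive_ddks(p, t):
    """Largest orthant-fraction difference over all origins drawn from p and t."""
    distance = 0.0
    for origin in list(p) + list(t):
        fp = orthant_fractions(p, origin)
        ft = orthant_fractions(t, origin)
        distance = max(distance, max(abs(fp[s] - ft[s]) for s in fp))
    return distance


p = [(0.0, 0.0), (0.1, 0.1)]
t = [(1.0, 1.0), (1.1, 1.1)]
print(naive_ddks(p, t))
```

The loop visits every origin and every point, so the cost grows quadratically with sample size and exponentially with dimension, which is the motivation for the subsample, grid, rdKS, and vdKS variants listed above.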

Quickstart

Installing ddks should be straightforward. Simply run

pip install git+https://github.com/patrick-pagni/DDKS

or, if you want to develop on DDKS, simply clone this repository into a safe spot on your computer and run

pip install -e .

from the top level of the repository.

Then, you can get started using the package by creating a ddKS object and performing the distance calculation on any pair of torch tensors of shape (sample_size, dimension).

import torch
import ddks

p = torch.rand((100, 3))
t = torch.rand((50, 3))

calculation = ddks.methods.ddKS()
distance = calculation(p, t)
print(f"The ddKS distance is {distance}")

To operate on GPU, all you need to do is move the tensors to the device before calculation:

p = torch.rand((100, 3)).to('cuda:0')
t = torch.rand((50, 3)).to('cuda:0')

calculation = ddks.methods.ddKS()
distance = calculation(p, t)

If you want to use one of the accelerated approximations instead, simply use ddks.methods.rdKS or ddks.methods.vdKS in place of ddks.methods.ddKS. Note that rdKS and vdKS cannot use the GPU.

Package Structure:

  1. methods - Callable classes for the xdKS methods (x = d, r, v)
  2. data - Contains several data generators to play around with
  3. run_scripts - Contains an example run script
  4. Unit_tests - Contains unit tests for repo
