This is a fork of the pnnl/ddks repo. The fork adds a class that compares an empirical CDF against an analytic CDF. The class lives in the adKS file and is largely an extension of the DDKS file.
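To make the comparison concrete, here is a minimal one-dimensional sketch of a KS distance between an empirical CDF and an analytic CDF. It is not the adKS implementation: the names ks_distance_to_cdf and uniform_cdf are hypothetical, and the uniform target on [0, 1] is chosen only for illustration.

```python
import random


def uniform_cdf(x):
    """Analytic CDF of the uniform distribution on [0, 1] (illustrative target)."""
    return min(1.0, max(0.0, x))


def ks_distance_to_cdf(samples, cdf):
    """One-dimensional KS distance between the empirical CDF of `samples`
    and an analytic `cdf` (hypothetical helper, not part of the ddks package)."""
    xs = sorted(samples)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF steps from i/n up to (i+1)/n at x,
        # so compare the analytic value against both sides of the step.
        distance = max(distance, abs((i + 1) / n - f), abs(f - i / n))
    return distance


random.seed(0)
samples = [random.random() for _ in range(1000)]
distance = ks_distance_to_cdf(samples, uniform_cdf)
print(f"KS distance to the uniform CDF: {distance:.4f}")
```

The distance always lies between 0 and 1, and it shrinks as the sample follows the analytic distribution more closely.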
Thanks to Hagen et al. for development of the original ddKS repository which inspired this change.
Alex Hagen¹, Shane Jackson¹, James Kahn², Jan Strube¹, Isabel Haide², Karl Pazdernik¹, and Connor Hainje¹
¹ Pacific Northwest National Laboratory, ² Karlsruhe Institute of Technology
This code accompanies our paper submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence titled "Accelerated Computation of a High Dimensional Kolmogorov-Smirnov Distance" (arXiv).
As of 6/25/2021 there are 3 methods implemented:
- ddKS - d-dimensional KS test calculated per
- Variable splitting of space (all points, subsample, grid spacing)
- rdKS - ddKS approximation using distance from (d+1) corners
- vdKS - ddKS approximation calculating the ddKS distance between voxels instead of points
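As a rough illustration of the idea behind ddKS, the sketch below computes a brute-force d-dimensional KS distance in pure Python. It assumes the "all points" variant uses every sample point as the origin of 2^d orthants and takes the largest difference in the fraction of each sample per orthant. It is not the repository's torch implementation, and the function names are hypothetical.

```python
from itertools import product


def orthant_fractions(origin, points):
    """Fraction of `points` in each of the 2^d orthants around `origin`."""
    d = len(origin)
    counts = {signs: 0 for signs in product((False, True), repeat=d)}
    for point in points:
        # An orthant is identified by whether each coordinate is >= the origin's.
        signs = tuple(x >= o for x, o in zip(point, origin))
        counts[signs] += 1
    return {signs: c / len(points) for signs, c in counts.items()}


def brute_force_ddks(pred, true):
    """Largest orthant-fraction difference over origins drawn from both samples
    (hypothetical helper; an assumption about the 'all points' variant)."""
    best = 0.0
    for origin in list(pred) + list(true):
        frac_p = orthant_fractions(origin, pred)
        frac_t = orthant_fractions(origin, true)
        best = max(best, max(abs(frac_p[s] - frac_t[s]) for s in frac_p))
    return best


same = [(0.1, 0.2), (0.4, 0.9), (0.8, 0.3)]
near = [(0.0, 0.0), (1.0, 1.0)]
far = [(10.0, 10.0), (11.0, 11.0)]
print(brute_force_ddks(same, same))  # identical samples
print(brute_force_ddks(near, far))   # fully separated samples
```

The cost grows with the number of origins, the number of points, and 2^d orthants, which is the kind of expense the subsampled, grid-based, and approximate variants listed above are meant to reduce.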
Installing ddks should be straightforward. Simply run
pip install git+https://github.com/patrick-pagni/DDKS
or, if you want to develop on DDKS, simply clone this repository into a safe spot on your computer and run
pip install -e .
from the top level of the repository.
Then, you can get started using the repository by creating a ddks object and performing the distance calculation on any pair of torch tensors of shape sample_size x dimension:
import torch
import ddks
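# Two samples with the same dimension (3); the sample sizes may differ
p = torch.rand((100, 3))
t = torch.rand((50, 3))
# The method classes are callable: instantiate once, then call on a pair of tensors
calculation = ddks.methods.ddKS()
distance = calculation(p, t)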
print(f"The ddKS distance is {distance}")
To operate on GPU, all you need to do is move the tensors to the device before calculation:
p = torch.rand((100, 3)).to('cuda:0')
t = torch.rand((50, 3)).to('cuda:0')
calculation = ddks.methods.ddKS()
distance = calculation(p, t)
If you want to use a different accelerated method, simply use ddks.methods.rdKS or ddks.methods.vdKS. Note that rdKS and vdKS cannot use the GPU.
- methods - Callable classes for xdks methods [x=d,r,v]
- data - Contains several data generators to play around with
- run_scripts - Contains an example run script
- Unit_tests - Contains unit tests for repo