KDTrees

Kd trees for Julia.

Currently supports KNN-search and finding all points inside an hyper sphere centered at a given point

Care has been taken with regards to performance. The tree is for example not naively implemented as nodes pointing to other nodes but instead as a collection of densely packed arrays. This gives better cache locality. This however means that the tree is immutable and new points can not be entered into the tree after it has been created.

There are some benchmarks for the creation of the tree and the different searches in the benchmark folder.

Author

Kristoffer Carlsson (@KristofferC)

Examples

Creating the tree

The tree is created with the command:

using KDTrees
data = rand(3,10^3)
tree = KDTree(data, leaf_size=5)

The data argument for the tree should be a matrix of floats of dimension (n_dim, n_points). The leaf_size determines for what number of points the tree should stop splitting. 5 is a good number and is also the default value.

Points inside hyper sphere

The exported query_ball_point(tree, point, radius) finds all points inside a hyper sphere centered at a given point with the given radius. The function returns a sorted list of the indices of the points in the sphere.

using KDTrees
tree = KDTree(randn(3, 1000))
query_ball_point(tree, [0.0, 0.0, 0.0], 0.4)

gives the indices:

8-element Array{Int64,1}:
 184
 199
 307
 586
 646
 680
 849
 906
 926

K-Nearest-Neighbours

The exported function k_nearest_neighbour(tree, point, k) finds the k nearest neighbours to a given point. The function returns a tuple of two lists with the indices and the distances from the given points respectively. These are sorted in the order of smallest to largest distance.

using KDTrees
tree = KDTree(randn(3, 1000))
k_nearest_neighbour(tree, [0.0, 0.0, 0.0], 5)

gives both the indices and distances:

([300,119,206,180,845],[0.052019,0.200885,0.220441,0.22447,0.235882])

Benchmarks

The benchmarks have been made with computer with a 4 core Intel i5-2500K @ 3.3 GHz with Julia v0.4.0-dev+3034.

Clicking on a plot takes you to the Plotly site for the plot where the exact data can be seen.

KNN benchmark

Build time benchmark

TODOs

Add proper benchmarks, compare with others implementations. Update: Partly done
Add other measures than Euclidean distance.
Use a heap for storing the K best points in KNN instead of a linear array (should only matter for large K).

Contribution

Contributions are more than welcome. If you have an idea that would make the tree have better performance or be more general please create a PR. Make sure you run the benchmarks before and after your changes.

Name		Name	Last commit message	Last commit date
Latest commit History 115 Commits
benchmark		benchmark
src		src
test		test
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE.md		LICENSE.md
README.md		README.md
REQUIRE		REQUIRE

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

KDTrees

Author

Examples

Creating the tree

Points inside hyper sphere

K-Nearest-Neighbours

Benchmarks

KNN benchmark

Build time benchmark

TODOs

Contribution

About

Releases

Packages

Languages

License

maxlieblich/KDTrees.jl

Folders and files

Latest commit

History

Repository files navigation

KDTrees

Author

Examples

Creating the tree

Points inside hyper sphere

K-Nearest-Neighbours

Benchmarks

KNN benchmark

Build time benchmark

TODOs

Contribution

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages