Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations.
If you have a large collection of mostly-dense vectors and can tolerate lossy compression, Bolt can probably save you 10-200x space and compute time.
Bolt also has theoretical guarantees bounding the errors in its approximations.
$ brew install swig # for wrapping C++; use apt-get, yum, etc, if not OS X
$ pip install numpy # bolt installation needs numpy already present
$ git clone https://github.com/dblalock/bolt.git
$ cd bolt && python setup.py install
$ pytest tests/ # optionally, run the tests
If you run into any problems, please don't hesitate to mention them in the Python build problems issue.
Install Bazel, Google's open-source build system, then run:
$ git clone https://github.com/dblalock/bolt.git
$ cd bolt/cpp && bazel run :main
The bazel run command builds the project and runs the tests and benchmarks.
If you want to integrate Bolt with another C++ project, include cpp/src/include/public.hpp and add the remaining files under cpp/src to your builds. Let me know if you're interested in doing such an integration; I'm hoping to see Bolt become part of many libraries and would be happy to help.
Bolt currently only supports machines with AVX2 instructions, which basically means x86 machines from fall 2013 or later. Contributions for ARM support are welcome. Also note that the Bolt Python wrapper is currently configured to require Clang, since GCC apparently runs into issues.
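To check the AVX2 requirement before building, a small script along these lines can help. It is a best-effort sketch, not part of Bolt: the function names are hypothetical, and it only knows how to read /proc/cpuinfo on Linux, returning None elsewhere.

```python
import platform
from pathlib import Path


def cpuinfo_has_avx2(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


def machine_has_avx2():
    """Best-effort check; returns None when the answer cannot be determined."""
    cpuinfo = Path("/proc/cpuinfo")
    if platform.system() != "Linux" or not cpuinfo.exists():
        return None  # not Linux: use a platform-specific tool instead
    return cpuinfo_has_avx2(cpuinfo.read_text())


print("AVX2 available:", machine_has_avx2())
```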
Bolt is based on vector quantization. For details, see the Bolt paper or slides.
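For intuition, here is a minimal NumPy sketch of the general idea behind lookup-table-based vector quantization: split each vector into blocks of columns, replace each block with the index of its nearest centroid, and approximate dot products by summing precomputed table entries. This is a generic product-quantization illustration under simple assumptions (plain k-means, hypothetical function names and parameters), not Bolt's actual implementation, which differs in its details; see the paper for those.

```python
import numpy as np


def fit_codebooks(X, n_subspaces=4, n_centroids=16, n_iters=10, seed=0):
    """Learn one small k-means codebook per block of columns."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(np.arange(X.shape[1]), n_subspaces)
    codebooks = []
    for cols in blocks:
        sub = X[:, cols]
        # initialize centroids from randomly chosen rows
        centroids = sub[rng.choice(len(sub), n_centroids, replace=False)].copy()
        for _ in range(n_iters):
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            assign = dists.argmin(axis=1)
            for k in range(n_centroids):
                members = sub[assign == k]
                if len(members):
                    centroids[k] = members.mean(axis=0)
        codebooks.append(centroids)
    return blocks, codebooks


def encode(X, blocks, codebooks):
    """Replace each block of each row with the index of its nearest centroid."""
    codes = np.empty((X.shape[0], len(blocks)), dtype=np.uint8)
    for j, (cols, centroids) in enumerate(zip(blocks, codebooks)):
        sub = X[:, cols]
        dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        codes[:, j] = dists.argmin(axis=1)
    return codes


def approx_dots(q, codes, blocks, codebooks):
    """Approximate X @ q by summing per-subspace lookup-table entries."""
    # luts[j][k] = dot product of centroid k in subspace j with q's block j
    luts = [centroids @ q[cols] for cols, centroids in zip(blocks, codebooks)]
    return sum(lut[codes[:, j]] for j, lut in enumerate(luts))


# usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16))
q = rng.normal(size=16)
blocks, codebooks = fit_codebooks(X)
codes = encode(X, blocks, codebooks)
approx = approx_dots(q, codes, blocks, codebooks)
exact = X @ q
print("correlation with exact dot products:", np.corrcoef(exact, approx)[0, 1])
print("bytes before / after:", X.nbytes, codes.nbytes)
```

Once the data is encoded, each query only needs one small table per block of columns, and every approximate dot product is a handful of table lookups and additions rather than a full pass over D floats.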
Bolt includes a thorough set of speed and accuracy benchmarks; see the experiments/ directory. This is also where to look if you want to reproduce the results in the paper.
Note that all of the timing results use the raw C++ implementation. At present, the Python wrapper is slightly slower due to Python overhead. If you're interested in having a full-speed wrapper, let me know and I'll allocate time to making this happen.
# X is some N x D array; queries is some iterable of length-D arrays
# these are approximately equal (though the latter are shifted and scaled)
enc = bolt.Encoder(reduction='dot').fit(X)
[np.dot(X, q) for q in queries]
[enc.transform(q) for q in queries]
# same for these
enc = bolt.Encoder(reduction='l2').fit(X)
[np.sum((X - q) * (X - q), axis=1) for q in queries]
[enc.transform(q) for q in queries]
# but enc.transform() is 10x faster or more
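The "shifted and scaled" caveat matters less than it might seem: if the outputs differ from the true values by an offset and a positive scale factor, then the ordering of the rows and the Pearson correlation are unchanged. The NumPy-only sketch below illustrates this with a made-up scale and offset; it does not call Bolt, and the positive scale is an assumption of the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
q = rng.normal(size=8)

exact = X @ q
# hypothetical scale and offset, standing in for a shifted and scaled output
scale, offset = 0.37, 12.0
shifted_scaled = scale * exact + offset

# the ordering of the rows is preserved, and the correlation is perfect
same_order = np.array_equal(np.argsort(exact), np.argsort(shifted_scaled))
corr = np.corrcoef(exact, shifted_scaled)[0, 1]
print(same_order, corr)
```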
import bolt
import numpy as np
from scipy.stats import pearsonr as corr
from sklearn.datasets import load_digits
import timeit
# for simplicity, use the sklearn digits dataset; we'll split
# it into a matrix X and a set of queries Q
X, _ = load_digits(return_X_y=True)
nqueries = 20
X, Q = X[:-nqueries], X[-nqueries:]
enc = bolt.Encoder(reduction='dot', accuracy='lowest') # can tweak acc vs speed
enc.fit(X)
dot_corrs = np.empty(nqueries)
for i, q in enumerate(Q):
    dots_true = np.dot(X, q)
    dots_bolt = enc.transform(q)
    dot_corrs[i] = corr(dots_true, dots_bolt)[0]
# dot products closely preserved despite compression
print("dot product correlation: {} +/- {}".format(
    np.mean(dot_corrs), np.std(dot_corrs))) # > .97
# massive space savings
print(X.nbytes) # 1777 rows * 64 cols * 8B = 909KB
print(enc.nbytes) # 1777 * 2B = 3.55KB
# massive time savings (~10x here, but often >100x on larger
# datasets with less Python overhead; see the paper)
t_np = timeit.Timer(
    lambda: [np.dot(X, q) for q in Q]).timeit(5) # ~9ms
t_bolt = timeit.Timer(
    lambda: [enc.transform(q) for q in Q]).timeit(5) # ~800us
print("Numpy / BLAS time, Bolt time: {:.3f}ms, {:.3f}ms".format(
    t_np * 1000, t_bolt * 1000))
# can get output without offset/scaling if needed
dots_bolt = [enc.transform(q, unquantize=True) for q in Q]
# search using squared Euclidean distances
# (still using the Digits dataset from above)
k_bolt = 10 # number of neighbors to retrieve; 10 is an arbitrary illustrative value
enc = bolt.Encoder('l2', accuracy='high').fit(X)
bolt_knn = [enc.knn(q, k_bolt) for q in Q] # knn for each query
# search using dot product (maximum inner product search)
enc = bolt.Encoder('dot', accuracy='medium').fit(X)
bolt_knn = [enc.knn(q, k_bolt) for q in Q] # knn for each query
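To judge how good the approximate neighbors are, it helps to compare them against exact brute-force results. The helpers below are a NumPy-only sketch with hypothetical names; they assume the approximate search returns row indices into X, which you should verify for enc.knn before relying on the comparison.

```python
import numpy as np


def exact_knn_l2(X, q, k):
    """Indices of the k rows of X closest to q in squared Euclidean distance."""
    dists = np.sum((X - q) ** 2, axis=1)
    return np.argsort(dists)[:k]


def exact_knn_dot(X, q, k):
    """Indices of the k rows of X with the largest dot product with q."""
    return np.argsort(-(X @ q))[:k]


def recall(approx_idxs, true_idxs):
    """Fraction of the true neighbors that the approximate search returned."""
    return len(np.intersect1d(approx_idxs, true_idxs)) / len(true_idxs)


# tiny worked example
X_demo = np.array([[0., 0.], [1., 0.], [0., 2.], [3., 3.]])
q_demo = np.array([0.9, 0.1])
print(exact_knn_l2(X_demo, q_demo, k=2))
print(exact_knn_dot(X_demo, q_demo, k=2))
print(recall([1, 2], exact_knn_l2(X_demo, q_demo, k=2)))
```

With the digits data, something like recall(bolt_knn[i], exact_knn_l2(X, Q[i], k_bolt)) for each query index i would give a per-query recall, under the index assumption stated above.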
Bolt stands for "Based On Lookup Tables". Feel free to use this exciting fact at parties.