This repo contains code used in the experiments in the paper
Osman Asif Malik. More Efficient Sampling for Tensor Decomposition With Worst-Case Guarantees. International Conference on Machine Learning (ICML), PMLR 162, pp. 14887-14917, 2022.
The paper is available at https://proceedings.mlr.press/v162/malik22a.html.
The perhaps somewhat cryptic repo name stands for "Tensor Decomposition via Alternating Least Squares with Efficient Sampling."
If you use this code in any of your own work, please reference our paper:
@InProceedings{pmlr-v162-malik22a,
title = {More Efficient Sampling for Tensor Decomposition With Worst-Case Guarantees},
author = {Malik, Osman Asif},
booktitle = {Proceedings of the 39th International Conference on Machine Learning},
pages = {14887--14917},
year = {2022},
volume = {162},
series = {Proceedings of Machine Learning Research},
month = {17--23 Jul},
publisher = {PMLR},
}
I have done my best to include relevant references in the code and this README, so please also cite those works as appropriate.
The functions cp_als_es.m
and tr_als_es.m
are implementations of the proposed methods for CP and tensor ring decomposition, respectively.
The recursive sketching and sampling procedures are implemented in recursive_sketch_CP.m
and draw_samples_CP.m
for the CP decomposition, and in recursive_sketch_TR.m
and draw_samples_TR.m
for the tensor ring decomposition.
The experiments that appear in the paper were conducted using the following scripts:
-
experiment_1.m
andexperiment_2.m
: Were used to generate the results in Sections 5.1 (Tables 3 and 4) and Section D.3 (Tables 6 and 7) for the CP and tensor ring decompositions, respectively. Portions of these scripts were also used to create the plots in Section D.3 (Figures 2-5). -
experiment_2b_TT.m
: Repeats the experiment inexperiment_2.m
but for a setup corresponding to the tensor train decomposition. It was used to generate some of the results in Section D.6 (Table 8). -
experiment_3.m
: Was used to run the feature extraction experiment in Section 5.2 (Table 5) and Section D.6 (Table 9). -
experiment_worst_case_CP.m
: Used for the demonstration of improved complexity in Section 5.3 for the CP decomposition. -
experiment_worst_case_TR.m
: Used for the demonstration of improved complexity in Section D.5 for the tensor ring decomposition.
Results from the following scripts were not reported in the paper, but were used to gain some initial insight into different approaces to leverage score estimation:
-
test_1.m
: Compares for a CP decomposition least squares problem the exact leverage score sampling distribution, our estimated distribution, and the empirical distribution when sampling according to the estimated distribution using our sampling scheme. -
test_2.m
: The same astest_1.m
but for the tensor ring decomposition.
This code requires Tensor Toolbox for Matlab. It also requires some functionality from the repo tr-als-sampled. Make sure that the functions in both are accessible before running the code in this repo. Some of the methods we compare our CP decomposition method against in Section 5.2 require Tensorlab. Tensorlab is not necessary to run the rest of the code however.
The .c
files in this repo need to be compiled.
This is easy to do (provided that the appropriate compilers are installed) via the following commands in Matlab:
>> mex countSketch.c
>> mex mt19937ar.c
- Mersenne Twister C code.
This code is available for download here and was authored by Takuji Nishimura and Makoto Matsumoto.
The code, including the relevant license statement, is in the file
mt19937ar.c
. We modified this code so that it can be called from within Matlab via the MEX interface. - Sketching codes.
The files
countSketch.c
andTensorSketch.m
were copied over from the countsketch-matrix-tensor-id repo for convenience.
Please feel free to contact me at any time if you have any questions or would like to provide feedback on this code or on the paper. I can be reached at oamalik (at) lbl (dot) gov
or at osman (dot) malik (at) colorado (dot) edu
.