Commit e960ccb, 1 parent: c82ce6b
Showing 5 changed files with 211 additions and 1 deletion.
27 changes: 27 additions & 0 deletions
matlab/external_functions/waynezhanghk-gactoolbox-53508ce/LICENSE
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
Copyright (c) 2013, waynezhanghk | ||
All rights reserved. | ||
|
||
Redistribution and use in source and binary forms, with or without modification, | ||
are permitted provided that the following conditions are met: | ||
|
||
Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
|
||
Redistributions in binary form must reproduce the above copyright notice, this | ||
list of conditions and the following disclaimer in the documentation and/or | ||
other materials provided with the distribution. | ||
|
||
Neither the name of the {organization} nor the names of its | ||
contributors may be used to endorse or promote products derived from | ||
this software without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR | ||
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON | ||
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
74 changes: 74 additions & 0 deletions
matlab/external_functions/waynezhanghk-gactoolbox-53508ce/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
GACToolbox | ||
========== | ||
|
||
Graph Agglomerative Clustering (GAC) toolbox | ||
|
||
Introduction | ||
------------ | ||
|
||
Gactoolbox summarizes our research on agglomerative clustering on a graph. Agglomerative clustering, which iteratively merges small clusters, is commonly used because it is conceptually simple and produces a hierarchy of clusters. Classical agglomerative clustering algorithms, such as average linkage and DBSCAN, are widely used in many areas. Those algorithms, however, are not designed for clustering on a graph. This toolbox implements the following algorithms for agglomerative clustering on a directed graph. | ||
|
||
1. Structural descriptor based algorithms (`gacCluster.m`). We define a cluster descriptor based on the graph structure, and each merge is chosen to maximize the increment of the descriptor. Two descriptors, the zeta function and the path integral, are implemented. You can also design a new descriptor (by creating functions similar to `gacPathEntropy.m` and `gacPathCondEntropy.m`) and develop new algorithms with our code. | ||
|
||
2. Graph degree linkage (`gdlCluster.m`). It is a simple and effective algorithm that performs better than normalized cuts and spectral clustering and is also faster. | ||
|
||
This toolbox is written and maintained by Wei Zhang (`wzhang009 at gmail.com`). | ||
Please send me an email if you find any bugs or have any suggestions. | ||
|
||
Examples | ||
-------- | ||
Preparations: | ||
|
||
1. Compile mex functions | ||
2. Add 'gacfiles' and 'gdlfiles' to your MATLAB path | ||
3. Calculate a pairwise distance matrix from your data | ||
|
||
```matlab | ||
K = 20; | ||
a = 1; | ||
z = 0.01; | ||
% path integral | ||
clusteredLabels = gacCluster (distance_matrix, groupNumber, 'path', K, a, z); | ||
% zeta function | ||
clusteredLabels = gacCluster (distance_matrix, groupNumber, 'zeta', K, a, z); | ||
% GDL-U algorithm | ||
clusteredLabels = gdlCluster(distance_matrix, groupNumber, K, a, false); | ||
% AGDL algorithm | ||
clusteredLabels = gdlCluster(distance_matrix, groupNumber, K, a, true); | ||
``` | ||
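Step 3 of the preparations can be sketched as follows. This is a minimal example that assumes Euclidean distances and a made-up toy data matrix `X` (one point per row); the README does not say which distance the toolbox expects, so treat the metric as an assumption. It uses only base MATLAB.

```matlab
% Toy data (hypothetical): 6 points in 2-D forming two well-separated groups.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];

% Squared Euclidean distances via ||x||^2 + ||y||^2 - 2*x*y'.
sqNorms = sum(X.^2, 2);
sqDist = bsxfun(@plus, sqNorms, sqNorms') - 2 * (X * X');
sqDist = max(sqDist, 0);                      % clamp tiny negatives caused by round-off

distance_matrix = sqrt(sqDist);               % n-by-n pairwise distances
distance_matrix(1:size(X, 1) + 1:end) = 0;    % force exact zeros on the diagonal

groupNumber = 2;                              % desired number of clusters
```

The resulting `distance_matrix` and `groupNumber` are the first two arguments of the calls shown above. With a data set this small, `K` would presumably have to be chosen smaller than the number of points.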
|
||
Citations | ||
--------- | ||
|
||
Please cite the following papers if you find the code helpful. | ||
|
||
* W. Zhang, D. Zhao, and X. Wang. | ||
Agglomerative clustering via maximum incremental path integral. | ||
Pattern Recognition, 46 (11): 3056-3065, 2013. | ||
|
||
* W. Zhang, X. Wang, D. Zhao, and X. Tang. | ||
Graph Degree Linkage: Agglomerative Clustering on a Directed Graph. | ||
in Proceedings of European Conference on Computer Vision (ECCV), 2012. | ||
|
||
Additional Notes | ||
---------------- | ||
|
||
1. How to compile mex files? | ||
|
||
Precompiled mexw64 files are included. If you use a system other than win64, use `compileMexFiles.m` to build the mex files. | ||
|
||
2. We provide a MATLAB implementation of structural descriptor based clustering and a mixed MATLAB/C++ implementation of graph degree linkage. The MATLAB implementation is meant for ease of understanding, although it is inefficient. In the future we will add a MATLAB implementation of graph degree linkage. | ||
|
||
Speed ranking (fastest first): AGDL > GDL-U > path integral > zeta function | ||
|
||
3. GDL-U and AGDL have similar performance. GDL-U is for small datasets and AGDL is for large datasets. | ||
|
||
AGDL has an additional parameter Kc in gdlMergingKNN_c.m. The larger Kc is, the closer AGDL's performance is to that of GDL-U, and the slower the algorithm runs. The default Kc = 10 is a good trade-off for most datasets. |
13 changes: 12 additions & 1 deletion
matlab/external_functions/waynezhanghk-gactoolbox-53508ce/compileMexFiles.m
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,12 @@ | ||
|
||
cd ./gdlfiles/ | ||
mex -O gacLlinks_c.cpp | ||
mex -O gacOnelink_c.cpp | ||
mex -O gacPartial_sort.cpp | ||
mex -O gacPartialMin_knn_c.cpp | ||
mex -O gacPartialMin_triu_c.cpp | ||
mex -O gdlInitAffinityTable_c.cpp gdlComputeAffinity.cpp | ||
mex -O gdlInitAffinityTable_knn_c.cpp gdlComputeAffinity.cpp | ||
mex -O gdlAffinity_c.cpp gdlComputeAffinity.cpp | ||
mex -O gdlDirectedAffinity_c.cpp gdlComputeDirectedAffinity.cpp | ||
mex -O gdlDirectedAffinity_batch_c.cpp gdlComputeDirectedAffinity.cpp | ||
cd ../ |
49 changes: 49 additions & 0 deletions
matlab/external_functions/waynezhanghk-gactoolbox-53508ce/gacCluster.m
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
function clusteredLabels = gacCluster (distance_matrix, groupNumber, strDescr, K, a, z) | ||
%% Graph Agglomerative Clustering toolbox | ||
% Input: | ||
% - distance_matrix: pairwise distances, d_{i -> j} | ||
% - groupNumber: the final number of clusters | ||
% - strDescr: structural descriptor. The choice can be | ||
% - 'zeta': zeta function based descriptor | ||
% - 'path': path integral based descriptor | ||
% - K: the number of nearest neighbors for KNN graph, default: 20 | ||
% - a: for covariance estimation, default: 1 | ||
% sigma^2 = (\sum_{i=1}^n \sum_{j \in N_i^K} d_{ij}^2) * a | ||
% - z: (I - z*P), default: 0.01 | ||
% Output: | ||
% - clusteredLabels: clustering results | ||
% by Wei Zhang (wzhang009 at gmail.com), June 8, 2011 | ||
% | ||
% Please cite the following papers if you find the code helpful | ||
% | ||
% W. Zhang, D. Zhao, and X. Wang. | ||
% Agglomerative clustering via maximum incremental path integral. | ||
% Pattern Recognition, 46 (11): 3056-3065, 2013. | ||
% | ||
% W. Zhang, X. Wang, D. Zhao, and X. Tang. | ||
% Graph Degree Linkage: Agglomerative Clustering on a Directed Graph. | ||
% in Proceedings of European Conference on Computer Vision (ECCV), 2012. | ||
|
||
%% parse inputs | ||
disp('--------------- Graph Structural Agglomerative Clustering ---------------------'); | ||
|
||
if nargin < 2, error('GAC: input arguments are not enough!'); end | ||
if nargin < 3, strDescr = 'path'; end | ||
if nargin < 4, K = 20; end | ||
if nargin < 5, a = 1; end | ||
if nargin < 6, z = 0.01; end | ||
|
||
%% initialization | ||
|
||
disp('---------- Building graph and forming initial clusters with l-links ---------'); | ||
[graphW, NNIndex] = gacBuildDigraph(distance_matrix, K, a); | ||
% from adjacency matrix to probability transition matrix | ||
graphW = bsxfun(@times, 1./sum(graphW,2), graphW); % row sum is 1 | ||
initialClusters = gacNNMerge(distance_matrix, NNIndex); | ||
clear distance_matrix NNIndex | ||
|
||
disp('-------------------------- Merging clusters --------------------------'); | ||
clusteredLabels = gacMerging(graphW, initialClusters, groupNumber, strDescr, z); | ||
|
||
end |
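The row normalization in `gacCluster.m` above (adjacency matrix to probability transition matrix) can be checked on a small example. This is a sketch with a made-up weight matrix `W`; how `gacMerging` actually uses `(I - z*P)` is not shown in this diff, so the last lines only illustrate the expression from the header comment. Because every row of `P` sums to 1, its spectral radius is 1, and `I - z*P` is invertible for any `|z| < 1`, including the default `z = 0.01`.

```matlab
% Toy weighted adjacency matrix of a directed graph (hypothetical values).
W = [0 2 2;
     1 0 3;
     4 0 0];

% Divide each row by its sum, as in gacCluster.m, so that each row sums to 1.
P = bsxfun(@times, 1 ./ sum(W, 2), W);
rowSums = sum(P, 2);

% The expression (I - z*P) from the header comment, with the default z.
z = 0.01;
M = eye(size(P, 1)) - z * P;
```

Note that a node with no outgoing edges would give a zero row sum and a division by zero here; the sketch assumes every row of `W` has at least one positive entry.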
49 changes: 49 additions & 0 deletions
matlab/external_functions/waynezhanghk-gactoolbox-53508ce/gdlCluster.m
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
function clusteredLabels = gdlCluster (distance_matrix, groupNumber, K, a, usingKcCluster, p) | ||
%% Graph Agglomerative Clustering toolbox | ||
% Input: | ||
% - distance_matrix: pairwise distances, d_{i -> j} | ||
% - groupNumber: the final number of clusters | ||
% - K: the number of nearest neighbors for KNN graph, default: 20 | ||
% - a: for covariance estimation, default: 1 | ||
% sigma^2 = (\sum_{i=1}^n \sum_{j \in N_i^3} d_{ij}^2) * a | ||
% - usingKcCluster: true for AGDL (gdlMergingKNN_c), false for GDL-U (gdlMerging_c), default: true | ||
% - p: merging (p+1)-links in l-links algorithm, default: 1 | ||
% Output: | ||
% - clusteredLabels: clustering results | ||
% by Wei Zhang (wzhang009 at gmail.com), June 8, 2011 | ||
% | ||
% Please cite the following papers if you find the code helpful | ||
% | ||
% W. Zhang, X. Wang, D. Zhao, and X. Tang. | ||
% Graph Degree Linkage: Agglomerative Clustering on a Directed Graph. | ||
% in Proceedings of European Conference on Computer Vision (ECCV), 2012. | ||
% | ||
% W. Zhang, D. Zhao, and X. Wang. | ||
% Agglomerative clustering via maximum incremental path integral. | ||
% Pattern Recognition, 46 (11): 3056-3065, 2013. | ||
|
||
%% parse inputs | ||
disp('--------------- Graph Agglomerative Clustering ---------------------'); | ||
|
||
if nargin < 2, error('GAC: input arguments are not enough!'); end | ||
if nargin < 3, K = 20; end | ||
if nargin < 4, a = 1; end | ||
if nargin < 5, usingKcCluster = true; end | ||
if nargin < 6, p = 1; end | ||
|
||
%% initialization | ||
disp('---------- Building graph and forming initial clusters with l-links ---------'); | ||
[graphW, NNIndex] = gacBuildDigraph_c(distance_matrix, K, a); | ||
initialClusters = gacBuildLlinks_cwarpper(distance_matrix, p, NNIndex); | ||
clear distance_matrix NNIndex | ||
|
||
disp('-------------------------- GDL merging --------------------------'); | ||
if usingKcCluster | ||
clusteredLabels = gdlMergingKNN_c(graphW, initialClusters, groupNumber); | ||
else | ||
clusteredLabels = gdlMerging_c(graphW, initialClusters, groupNumber); | ||
end | ||
|
||
end |