pvclust

The original algorithm was implemented in R by Suzuki and Shimodaira (2006): "Pvclust: an R package for assessing the uncertainty in hierarchical clustering". This is a Python reimplementation. For each cluster in the hierarchy it reports two values, the Approximately Unbiased (AU) p-value and the Bootstrap Probability (BP), which indicate how strongly the data support that cluster. The AU value is the less biased of the two, and clusters with an AU value greater than 95% are considered significant. Both values are calculated using multiscale bootstrap resampling.
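To make the relationship between AU and BP more concrete, the following is a minimal sketch of the curve-fitting step described for the R package, not necessarily the code used in this repository. It assumes the bootstrap probabilities of one cluster have already been measured at several relative sample sizes r, and it fits the model z(r) = v*sqrt(r) + c/sqrt(r) with plain (unweighted) least squares, whereas the R package uses a weighted fit. The function name fit_au_bp and its arguments are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def fit_au_bp(scales, bps):
    """Estimate (AU, BP) for one cluster from multiscale bootstrap results.

    scales: relative sample sizes r (at least two distinct values)
    bps:    observed bootstrap probabilities at those scales,
            each strictly between 0 and 1

    Simplification (assumption): ordinary least squares, no weights.
    """
    # Turn each bootstrap probability into a z-value: z = -Phi^{-1}(BP)
    z = [-_STD_NORMAL.inv_cdf(p) for p in bps]

    # Regressors of the model z(r) = v * sqrt(r) + c / sqrt(r)
    a = [sqrt(r) for r in scales]
    b = [1.0 / sqrt(r) for r in scales]

    # Solve the 2x2 normal equations for v and c
    saa = sum(x * x for x in a)
    sbb = sum(x * x for x in b)
    sab = sum(x * y for x, y in zip(a, b))
    saz = sum(x * y for x, y in zip(a, z))
    sbz = sum(x * y for x, y in zip(b, z))
    det = saa * sbb - sab * sab
    v = (saz * sbb - sbz * sab) / det
    c = (sbz * saa - saz * sab) / det

    # AU removes the scale-dependent term c; BP keeps it (value at r = 1)
    au = _STD_NORMAL.cdf(-(v - c))
    bp = _STD_NORMAL.cdf(-(v + c))
    return au, bp
```

For example, fit_au_bp([0.5, 0.75, 1.0, 1.25], observed_bps) returns the pair (AU, BP) for a single cluster, where observed_bps is the list of bootstrap probabilities measured at those scales. Probabilities of exactly 0 or 1 would have to be dropped before the fit, because their z-values are infinite.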

This implementation is part of a master's thesis at the Faculty of Computer and Information Science, University of Ljubljana.

Example

Here we show an example of using the Python implementation on the Boston Housing dataset.

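import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2, so this example needs an older version.
from sklearn.datasets import load_boston
from pvclust import PvClust

if __name__ == "__main__":
    # Load the feature matrix and wrap it in a DataFrame
    X, y = load_boston(return_X_y=True)
    X = pd.DataFrame(X)
    # Ward linkage, Euclidean distance, nboot=1000
    pv = PvClust(X, method="ward", metric="euclidean", nboot=1000)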

While the algorithm is running, we can follow its progress through the individual stages.

To display the resulting dendrogram with p-values, we call pv.plot().

To display the results, we call the function print_result.

pv.print_result()

Furthermore, if we are interested in specific clusters or want to display the values with a certain number of decimal places, we can call the following:

pv.print_result(which=[2, 6], digits=5)

The standard errors of the AU p-values can be displayed on a graph by calling the function seplot.

pv.seplot()

We also implemented a parallel version, which is enabled by setting parallel=True. In this mode, the algorithm uses all the cores on the machine to speed up the calculation.

import pandas as pd
from sklearn.datasets import load_boston
from pvclust import PvClust

if __name__ == "__main__":
    X, y = load_boston(return_X_y=True)
    X = pd.DataFrame(X)
    # Same call as before, with the parallel mode switched on
    pv = PvClust(X, method="ward", metric="euclidean", nboot=1000, parallel=True)
