GitHub - msha096/bisecting_K_means: Bisecting K-means in Python with feature selection.

Feature-selection-based bisecting K-means.

A Python implementation of bisecting K-means with feature selection. The feature dimension is reduced gradually as the clusters get smaller.

Feature Selection:

Feature selection is done by applying PCA to the features and gradually reducing their dimensionality. The number of dimensions kept is positively correlated with the cluster size.
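The idea above can be sketched as follows. This is a minimal illustration, not the repository's code: the helper names `target_dim` and `reduce_features`, the linear scaling rule, and the `min_dim` floor are all assumptions chosen to show one way the dimension could track the cluster size.

```python
import numpy as np
from sklearn.decomposition import PCA


def target_dim(cluster_size, total_size, max_dim, min_dim=2):
    """Scale the number of PCA components linearly with relative cluster size.

    The linear rule and the min_dim floor are assumptions for illustration.
    """
    scaled = int(round(max_dim * cluster_size / total_size))
    return max(min_dim, min(max_dim, scaled))


def reduce_features(X_cluster, total_size, min_dim=2):
    """Project one cluster's features onto fewer principal components."""
    n_samples, n_features = X_cluster.shape
    # PCA cannot return more components than min(n_samples, n_features).
    max_dim = min(n_samples, n_features)
    k = target_dim(n_samples, total_size, max_dim, min(min_dim, max_dim))
    return PCA(n_components=k).fit_transform(X_cluster)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
print(reduce_features(X, total_size=200).shape)        # the full data set keeps 10 dimensions
print(reduce_features(X[:100], total_size=200).shape)  # half of the data keeps 5 dimensions
```

Any monotone rule would fit the description equally well; the linear one is just the simplest to read.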

Pipeline:

The baseline K-Means is from scikit-learn (SKLearn). Bisecting K-means is a top-down clustering model: it starts with all points in one cluster. At each step, K-Means with k = 2 is applied to the cluster with the largest squared distance.
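A rough sketch of this top-down loop is shown below. It assumes that "largest squared distance" means the sum of squared distances from the points to their cluster centroid (SSE); the function names `sse` and `bisecting_kmeans` are hypothetical and the feature reduction step is left out to keep the example short.

```python
import numpy as np
from sklearn.cluster import KMeans


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    return float(((points - points.mean(axis=0)) ** 2).sum())


def bisecting_kmeans(X, n_clusters, random_state=0):
    """Return one integer cluster label per row of X."""
    labels = np.zeros(len(X), dtype=int)  # start with everything in cluster 0
    for new_label in range(1, n_clusters):
        # Only clusters with at least two points can be split.
        candidates = [c for c in range(new_label) if (labels == c).sum() >= 2]
        if not candidates:
            break
        # Pick the cluster with the largest SSE and split it with k = 2.
        worst = max(candidates, key=lambda c: sse(X[labels == c]))
        idx = np.where(labels == worst)[0]
        km = KMeans(n_clusters=2, n_init=10, random_state=random_state)
        halves = km.fit_predict(X[idx])
        labels[idx[halves == 1]] = new_label  # the other half keeps the old label
    return labels


# Three well-separated synthetic blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.normal(size=(50, 2)) for c in centers])
labels = bisecting_kmeans(X, n_clusters=3)
print(np.bincount(labels))
```

Each pass adds exactly one cluster, so asking for `n_clusters` clusters takes at most `n_clusters - 1` calls to K-Means.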

Evaluation:

A silhouette score analysis is printed each time K-Means divides a cluster into two sub-clusters.
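One way such a per-split report could look is sketched here, using `silhouette_score` from scikit-learn. Whether the repository scores only the two new sub-clusters or the whole clustering is not stated, so scoring just the split is an assumption, and `split_and_score` is a hypothetical helper name.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def split_and_score(points, random_state=0):
    """Bisect one cluster and return (labels, silhouette score of that split)."""
    km = KMeans(n_clusters=2, n_init=10, random_state=random_state)
    halves = km.fit_predict(points)
    score = silhouette_score(points, halves)
    sizes = np.bincount(halves, minlength=2).tolist()
    print(f"split into sizes {sizes}, silhouette = {score:.3f}")
    return halves, score


# Two well-separated synthetic blobs should give a score close to 1.
rng = np.random.default_rng(0)
points = np.vstack([
    0.5 * rng.normal(size=(40, 2)),
    0.5 * rng.normal(size=(40, 2)) + [10.0, 0.0],
])
halves, score = split_and_score(points)
```

Scores near 1 indicate a clean split, while scores near 0 suggest the two halves overlap.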

Usage:

Change the file path in main so that it points to your feature file; the script will then read your features.
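The expected file format is not documented here. As a sketch only, assuming a comma-separated numeric file with one sample per row, loading could look like the following; `load_features` and the file name `features.csv` are hypothetical and not taken from the repository.

```python
import os
import tempfile

import numpy as np


def load_features(path, delimiter=","):
    """Load a numeric feature matrix with one sample per row (assumed layout)."""
    return np.loadtxt(path, delimiter=delimiter, ndmin=2)


# Demo with a throwaway file; in practice, point the path at your own feature file.
with tempfile.TemporaryDirectory() as tmp:
    feature_path = os.path.join(tmp, "features.csv")  # hypothetical file name
    np.savetxt(feature_path, np.arange(12.0).reshape(4, 3), delimiter=",")
    features = load_features(feature_path)

print(features.shape)  # (4, 3)
```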
