Performs cluster analysis on Salesforce standard and custom objects, grouping records into clusters using the K-Means and K-Medoids (CLARA) algorithms.
Supports clustering objects with mixed data types (numeric, category/picklist, text) using the Gower distance function.
Visualizes the clustering result using the t-SNE dimensionality reduction technique.
Click here to get more information about the methodology and algorithms used in this app.
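To make the mixed-type distance idea concrete, here is a minimal Python sketch of a Gower-style distance between two records. It is an illustration only, not the app's Apex implementation: the function name `gower_distance`, the field kinds `"numeric"` and `"category"`, and the sample Lead-like values are all assumptions made for this example, and text fields are left out because their handling is not described here.

```python
from typing import Optional, Sequence, Union

Value = Union[float, int, str, None]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    kinds: Sequence[str],
    ranges: Sequence[Optional[float]],
) -> float:
    """Average per-field dissimilarity between two records (hypothetical helper).

    kinds[i]  is "numeric" or "category".
    ranges[i] is (max - min) of field i over the data set for numeric fields;
              it is ignored for category fields.
    """
    total = 0.0
    counted = 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # fields with a missing value do not contribute
        if kind == "numeric":
            # Range-normalized absolute difference, always in [0, 1]
            d = abs(x - y) / rng if rng else 0.0
        else:
            # Categories either match (0) or differ (1)
            d = 0.0 if x == y else 1.0
        total += d
        counted += 1
    return total / counted if counted else 0.0


# Two made-up records: (annual revenue, number of employees, industry)
kinds = ("numeric", "numeric", "category")
ranges = (400.0, 100.0, None)
lead_a = (100.0, 10, "Banking")
lead_b = (300.0, 10, "Retail")
distance = gower_distance(lead_a, lead_b, kinds, ranges)
```

Each field contributes a value between 0 and 1, so numeric and picklist fields can be averaged on the same scale without one dominating the other.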
Install the application from Salesforce AppExchange
sfdx force:org:create -f ./config/project-scratch-def.json -a ScratchOrgAlias --durationdays 30
sfdx force:config:set defaultusername=<user name returned from the previous command>
sfdx force:source:push
sfdx force:data:bulk:upsert -f force-app/main/default/staticresources/ClustanLeadsMock.csv -s Lead -i Email
sfdx force:apex:test:run
sfdx force:auth:web:login --setalias OrgAlias
The app also works if deployed to an org without a namespace. However, I recommend using a managed package installation.
sfdx force:source:deploy --checkonly --sourcepath force-app --targetusername OrgAlias --testlevel RunLocalTests
sfdx force:source:deploy --sourcepath force-app --targetusername OrgAlias --testlevel NoTestRun
sfdx force:package:create --name "Cluster Analysis" --path force-app --packagetype Managed -d "Group records from any object into clusters and visualize the result using machine learning algorithms"
sfdx force:package:version:create --package "Cluster Analysis" --wait 10 --installationkeybypass --codecoverage
sfdx force:package:version:promote --package "Cluster Analysis@<package version number>"
- sfdx-project.json: Required by Salesforce DX. Configures your project. Use this file to specify the parameters that affect your Salesforce development project.
- config/project-scratch-def.json: Sample file that shows how to define the shape of a scratch org. You reference this file when you create your scratch org with the force:org:create command.
- force-app: Directory that contains the source for the Cluster Analysis package and tests.
- force-app/main/default: Directory that contains the app source and shared classes.
- force-app/main/algorithms: Directory that contains algorithm classes.
- force-app/main/utils: Directory that contains utility classes.
- force-app/main/test: Directory that contains Apex test classes.
- .project: Required by the Eclipse IDE. Describes the Eclipse project.
- .gitignore: Optional Git file. Specifies intentionally untracked files that you want Git (or in this case GitHub) to ignore.
Clustering Large Data Sets (by Leonard Kaufman, Peter J. Rousseeuw, 1986)
Clustering with optimised weights for Gower’s metric (by Jeroen van den Hoven) https://beta.vu.nl/nl/Images/stageverslag-hoven_tcm235-777817.pdf
Clustering on mixed type data (by Thomas Filaire) https://towardsdatascience.com/clustering-on-mixed-type-data-8bbd0a2569c3
Visualizing Data using t-SNE (by Laurens van der Maaten) https://lvdmaaten.github.io/tsne/
tSNEJS (Copyright Andrej Karpathy) https://github.com/karpathy/tsnejs
JavaScript SOQL parser (Copyright 2019 Austin Turner) https://github.com/paustint/soql-parser-js
JavaScript Algorithms and Data Structures (Copyright (c) 2018 Oleksii Trekhleb) https://github.com/trekhleb/javascript-algorithms
Data-Driven Documents (D3.js, Copyright 2010-2017 Mike Bostock) https://d3js.org/
Building Machine Learning Systems with Apex (presented at DF14 by Jen Wyher and Paul Battisson) https://www.slideshare.net/pbattisson/df14-building-machine-learning-systems-with-apex
To report a bug or suggest an enhancement, create an issue on the "Issues" tab.