Cluster analysis for Salesforce: Group records into clusters and visualize the result using ML algorithms

anablock/ClusterAnalysis

 
 


SFDC Cluster Analysis Package

Performs cluster analysis on Salesforce standard and custom objects, breaking records into groups (clusters) using the K-Means and K-Medoids (CLARA) algorithms.
Supports clustering objects with mixed data types (numeric, category/picklist, text) using the Gower distance function.
Visualizes the clustering result using the t-SNE dimensionality reduction technique.
Click here to get more information about the methodology and algorithms used in this app.
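
As a rough illustration of how a Gower-style distance can compare records with mixed field types, here is a minimal JavaScript sketch. It is not the package's actual implementation: the function names (gowerDistance, nearestMedoid), the field descriptor shape ({ name, type, range }) and the sample Lead values are all assumptions made for this example, and only numeric and category fields are covered (text fields are left out).

```javascript
// Minimal sketch of a Gower-style distance; all names here are hypothetical.
// fields: [{ name, type: 'numeric' | 'category', range }], where range is the
// (max - min) of a numeric field over the whole data set.
function gowerDistance(a, b, fields) {
  let total = 0;
  let counted = 0;
  for (const field of fields) {
    const x = a[field.name];
    const y = b[field.name];
    // Fields missing on either record are skipped rather than penalised.
    if (x === null || x === undefined || y === null || y === undefined) {
      continue;
    }
    let d;
    if (field.type === 'numeric') {
      // Range-normalised absolute difference, so each numeric field lands in [0, 1].
      d = field.range > 0 ? Math.abs(x - y) / field.range : 0;
    } else {
      // Category/picklist: 0 when the values match, 1 otherwise.
      d = x === y ? 0 : 1;
    }
    total += d;
    counted += 1;
  }
  // Average over the fields that could be compared.
  return counted === 0 ? 0 : total / counted;
}

// Assignment step of a K-Medoids style algorithm: index of the closest medoid.
function nearestMedoid(record, medoids, fields) {
  let best = 0;
  let bestDistance = Infinity;
  medoids.forEach((medoid, index) => {
    const d = gowerDistance(record, medoid, fields);
    if (d < bestDistance) {
      bestDistance = d;
      best = index;
    }
  });
  return best;
}

// Made-up sample data shaped like Lead records.
const fields = [
  { name: 'AnnualRevenue', type: 'numeric', range: 1000 },
  { name: 'Industry', type: 'category' },
  { name: 'Rating', type: 'category' }
];
const leadA = { AnnualRevenue: 100, Industry: 'Tech', Rating: 'Hot' };
const leadB = { AnnualRevenue: 600, Industry: 'Tech', Rating: 'Cold' };
const leadC = { AnnualRevenue: 550, Industry: 'Tech', Rating: 'Cold' };

console.log(gowerDistance(leadA, leadB, fields)); // (0.5 + 0 + 1) / 3 = 0.5
console.log(nearestMedoid(leadC, [leadA, leadB], fields)); // 1 (leadB is closer)
```

Because every per-field distance is scaled to the 0..1 range before averaging, a numeric field with large values (such as revenue) does not dominate the picklist fields.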

Installation

Install the application from Salesforce AppExchange

Dev, Build and Test

Create a scratch org

sfdx force:org:create -f ./config/project-scratch-def.json -a ScratchOrgAlias --durationdays 30
sfdx force:config:set defaultusername=<user name returned from the previous command>

Push the source code to the scratch org

sfdx force:source:push

Create sample lead records

sfdx force:data:bulk:upsert -f force-app/main/default/staticresources/ClustanLeadsMock.csv -s Lead -i Email

Run Apex tests

sfdx force:apex:test:run

Authorise an org

sfdx force:auth:web:login --setalias OrgAlias

Deploy to an org

The app also works if deployed to an org without a namespace. However, I recommend using a managed package installation.

sfdx force:source:deploy --checkonly --sourcepath force-app --targetusername OrgAlias --testlevel RunLocalTests
sfdx force:source:deploy --sourcepath force-app --targetusername OrgAlias --testlevel NoTestRun

Create a managed package

sfdx force:package:create --name "Cluster Analysis" --path force-app --packagetype Managed -d "Group records from any object into clusters and visualize the result using machine learning algorithms"

Create and promote a package version

sfdx force:package:version:create --package "Cluster Analysis" --wait 10 --installationkeybypass --codecoverage
sfdx force:package:version:promote --package "Cluster Analysis@<version number returned from the previous command>"

Description of Files and Directories

  • sfdx-project.json: Required by Salesforce DX. Configures your project. Use this file to specify the parameters that affect your Salesforce development project.
  • config/project-scratch-def.json: Sample file that shows how to define the shape of a scratch org. You reference this file when you create your scratch org with the force:org:create command.
  • force-app: Directory that contains the source for the Cluster Analysis package and tests.
  • force-app/main/default: Directory that contains the app source and shared classes.
  • force-app/main/algorithms: Directory that contains algorithm classes.
  • force-app/main/utils: Directory that contains utility classes.
  • force-app/main/test: Directory that contains Apex test classes.
  • .project: Required by the Eclipse IDE. Describes the Eclipse project.
  • .gitignore: Optional Git file. Specifies intentionally untracked files that you want Git (or in this case GitHub) to ignore.

Resources

Clustering Large Data Sets (by Leonard Kaufman and Peter J. Rousseeuw, 1986)

Clustering with optimised weights for Gower’s metric (By Jeroen van den Hoven) https://beta.vu.nl/nl/Images/stageverslag-hoven_tcm235-777817.pdf

Clustering on mixed type data (by Thomas Filaire) https://towardsdatascience.com/clustering-on-mixed-type-data-8bbd0a2569c3

Visualizing Data using t-SNE (by Laurens van der Maaten) https://lvdmaaten.github.io/tsne/

tSNEJS (Copyright Andrej Karpathy) https://github.com/karpathy/tsnejs

Javascript SOQL parser (Copyright 2019 Austin Turner) https://github.com/paustint/soql-parser-js

JavaScript Algorithms and Data Structures (Copyright (c) 2018 Oleksii Trekhleb) https://github.com/trekhleb/javascript-algorithms

Data-Driven Documents (D3.js, Copyright 2010-2017 Mike Bostock) https://d3js.org/

Building Machine Learning Systems with Apex (presented at DF14 by Jen Wyher and Paul Battisson) https://www.slideshare.net/pbattisson/df14-building-machine-learning-systems-with-apex

Issues

To report a bug or suggest an enhancement, create an issue on the "Issues" tab.


Languages

  • JavaScript 85.8%
  • Apex 14.0%
  • Other 0.2%