This repository has been archived by the owner on Aug 16, 2024. It is now read-only.

Commit

set tolerance to 0 (Intel-bigdata#651)
* 1. set tol to 0

Signed-off-by: minmingz <[email protected]>

* 1. remove convergedist

Signed-off-by: minmingz <[email protected]>

* 1. add code comment

Signed-off-by: minmingz <[email protected]>

Co-authored-by: minmingz <[email protected]>
minmingzhu and minmingz authored Dec 2, 2020
1 parent fe3738e commit f55862a
Showing 2 changed files with 1 addition and 7 deletions.
7 changes: 0 additions & 7 deletions conf/workloads/ml/kmeans.conf
@@ -4,50 +4,43 @@ hibench.kmeans.tiny.num_of_samples 30000
hibench.kmeans.tiny.samples_per_inputfile 6000
hibench.kmeans.tiny.max_iteration 5
hibench.kmeans.tiny.k 10
-hibench.kmeans.tiny.convergedist 0.5
hibench.kmeans.small.num_of_clusters 5
hibench.kmeans.small.dimensions 20
hibench.kmeans.small.num_of_samples 3000000
hibench.kmeans.small.samples_per_inputfile 600000
hibench.kmeans.small.max_iteration 5
hibench.kmeans.small.k 10
-hibench.kmeans.small.convergedist 0.5
hibench.kmeans.large.num_of_clusters 5
hibench.kmeans.large.dimensions 20
hibench.kmeans.large.num_of_samples 20000000
hibench.kmeans.large.samples_per_inputfile 4000000
hibench.kmeans.large.max_iteration 5
hibench.kmeans.large.k 10
-hibench.kmeans.large.convergedist 0.5
hibench.kmeans.huge.num_of_clusters 5
hibench.kmeans.huge.dimensions 20
hibench.kmeans.huge.num_of_samples 100000000
hibench.kmeans.huge.samples_per_inputfile 20000000
hibench.kmeans.huge.max_iteration 5
hibench.kmeans.huge.k 10
-hibench.kmeans.huge.convergedist 0.5
hibench.kmeans.gigantic.num_of_clusters 5
hibench.kmeans.gigantic.dimensions 20
hibench.kmeans.gigantic.num_of_samples 200000000
hibench.kmeans.gigantic.samples_per_inputfile 40000000
hibench.kmeans.gigantic.max_iteration 5
hibench.kmeans.gigantic.k 10
-hibench.kmeans.gigantic.convergedist 0.5
hibench.kmeans.bigdata.num_of_clusters 5
hibench.kmeans.bigdata.dimensions 20
hibench.kmeans.bigdata.num_of_samples 1200000000
hibench.kmeans.bigdata.samples_per_inputfile 40000000
hibench.kmeans.bigdata.max_iteration 10
hibench.kmeans.bigdata.k 10
-hibench.kmeans.bigdata.convergedist 0.5

hibench.kmeans.num_of_clusters ${hibench.kmeans.${hibench.scale.profile}.num_of_clusters}
hibench.kmeans.dimensions ${hibench.kmeans.${hibench.scale.profile}.dimensions}
hibench.kmeans.num_of_samples ${hibench.kmeans.${hibench.scale.profile}.num_of_samples}
hibench.kmeans.samples_per_inputfile ${hibench.kmeans.${hibench.scale.profile}.samples_per_inputfile}
hibench.kmeans.max_iteration ${hibench.kmeans.${hibench.scale.profile}.max_iteration}
hibench.kmeans.k ${hibench.kmeans.${hibench.scale.profile}.k}
-hibench.kmeans.convergedist ${hibench.kmeans.${hibench.scale.profile}.convergedist}
hibench.kmeans.base.hdfs ${hibench.hdfs.data.dir}/Kmeans
hibench.kmeans.input.sample ${hibench.workload.input}/samples
hibench.kmeans.input.cluster ${hibench.workload.input}/cluster
@@ -110,6 +110,7 @@ object DenseKMeans {
.setK(params.k)
.setMaxIter(params.numIterations)
.setSeed(1L)
+.setTol(0) //set convergence to 0, aiming to execute the number of iterations of the algorithm without being affected by the convergence parameters.
.fit(examples)

val cost = model.summary.trainingCost
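
To illustrate why the tolerance is set to 0, here is a minimal, self-contained sketch of a Lloyd-style loop with a tolerance check. It is not Spark's implementation: `TolSketch`, `lloyd`, and the toy 1-D data are hypothetical names and values chosen for illustration, and the stopping rule (every center moved by at most `tol`) is an assumed simplification of a typical convergence test.

```scala
object TolSketch {
  /** Runs 1-D Lloyd iterations; returns (final centers, iterations executed). */
  def lloyd(data: Seq[Double], init: Seq[Double],
            maxIter: Int, tol: Double): (Seq[Double], Int) = {
    var centers = init
    var iter = 0
    var converged = false
    while (iter < maxIter && !converged) {
      // Assign each point to the index of its nearest center.
      val groups = data.groupBy { x =>
        centers.indices.minBy(i => math.abs(x - centers(i)))
      }
      // Move each center to the mean of its points; keep it if it has none.
      val updated = centers.indices.map { i =>
        groups.get(i).map(xs => xs.sum / xs.size).getOrElse(centers(i))
      }
      // Assumed stopping rule: every center moved by at most `tol`.
      converged = centers.zip(updated).forall {
        case (oldC, newC) => math.abs(oldC - newC) <= tol
      }
      centers = updated
      iter += 1
    }
    (centers, iter)
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(0.0, 1.0, 2.0, 10.0, 11.0, 12.0)
    val init = Seq(0.0, 1.0)
    // A loose tolerance can end the loop before the centers have settled.
    println(lloyd(data, init, maxIter = 5, tol = 4.0)._2)
    // With tol = 0 the loop ends early only if no center moves at all.
    println(lloyd(data, init, maxIter = 5, tol = 0.0)._2)
  }
}
```

On this toy data the loose tolerance stops after 2 iterations and `tol = 0.0` after 3, because the centers stop moving entirely on the third pass. Under this simplified rule, a zero tolerance removes the tolerance-driven early exit but does not by itself force all `maxIter` iterations when the centers reach an exact fixed point.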