Skip to content

k_means() creates K-Modes model. This model is intended to be used with categorical predictors. Although it will accept numeric predictors if they contain a few number of unique values. The numeric predictors will then be treated like categorical.

Details

For this engine, there is a single mode: partition

Tuning Parameters

This model has 1 tuning parameters:

  • num_clusters: # Clusters (type: integer, default: no default)

Translation from tidyclust to the original package (partition)

k_means(num_clusters = integer(1)) %>% 
  set_engine("klaR") %>% 
  set_mode("partition") %>% 
  translate_tidyclust()

## K Means Cluster Specification (partition)
## 
## Main Arguments:
##   num_clusters = integer(1)
## 
## Computational engine: klaR 
## 
## Model fit template:
## tidyclust::.k_means_fit_klaR(data = missing_arg(), modes = missing_arg(), 
##     modes = integer(1))

Preprocessing requirements

Only categorical variables are accepted, along with numerics with few unique values.

References

  • Huang, Z. (1997) A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining. in KDD: Techniques and Applications (H. Lu, H. Motoda and H. Luu, Eds.), pp. 21-34, World Scientific, Singapore.

  • MacQueen, J. (1967) Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, eds L. M. Le Cam & J. Neyman, 1, pp. 281-297. Berkeley, CA: University of California Press.