Specifications

These cluster specification functions specify the type of clustering model you want to fit. They work much like the model specification functions from parsnip.

k_means()
K-Means
hier_clust()
Hierarchical (Agglomerative) Clustering
db_clust()
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
gm_clust()
Gaussian Mixture Models (GMM)
mean_shift()
Mean Shift Clustering
cluster_spec
Model Specification Information
cluster_fit
Model Fit Object Information
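
As a minimal sketch of how a specification is built, the example below creates two specifications without fitting them. The argument values (3 clusters, "average" linkage) and the "stats" engine are illustrative assumptions, not recommendations from this page.

```r
library(tidyclust)

# A K-means specification with 3 clusters, using the "stats" engine
# (engine and cluster count are illustrative choices).
kmeans_spec <- k_means(num_clusters = 3) |>
  set_engine("stats")

# A hierarchical clustering specification with average linkage.
hclust_spec <- hier_clust(num_clusters = 3, linkage_method = "average")

# Printing a specification shows its arguments and engine; nothing is fit yet.
kmeans_spec
hclust_spec
```

Both objects are plain specifications; no data is touched until one of the fit functions is called.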

Fit and Inspect

These functions are the generics that are supported for specifications created with tidyclust.

fit(<cluster_spec>) fit_xy(<cluster_spec>)
Fit a Model Specification to a Data Set
set_args(<cluster_spec>)
Change arguments of a cluster specification
set_engine(<cluster_spec>)
Change engine of a cluster specification
set_mode(<cluster_spec>)
Change mode of a cluster specification
augment(<cluster_fit>)
Augment data with predictions
glance(<cluster_fit>)
Construct a single row summary "glance" of a model, fit, or other object
tidy(<cluster_fit>)
Turn a tidyclust model object into a tidy tibble
extract_fit_engine(<cluster_fit>) extract_parameter_set_dials(<cluster_spec>)
Extract elements of a tidyclust model object
axe_call.cluster_fit() axe_ctrl.cluster_fit() axe_data.cluster_fit() axe_env.cluster_fit() axe_fitted.cluster_fit()
Axing a cluster_fit.
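
The generics above can be sketched on a small example. This assumes the built-in `mtcars` data set and the "stats" engine; the seed and cluster counts are arbitrary.

```r
library(tidyclust)

set.seed(1234)  # K-means uses random starting centers

kmeans_spec <- k_means(num_clusters = 3) |>
  set_engine("stats")

# set_args() returns a modified copy of the specification.
kmeans_spec_4 <- set_args(kmeans_spec, num_clusters = 4)

# Clustering is unsupervised, so the formula has no left-hand side.
kmeans_fit <- fit(kmeans_spec, ~ ., data = mtcars)

# augment() appends the predicted cluster to the supplied data.
augmented <- augment(kmeans_fit, new_data = mtcars)

# extract_fit_engine() returns the underlying engine object,
# here the result of stats::kmeans().
engine_fit <- extract_fit_engine(kmeans_fit)
```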

Prediction

Once the cluster specification has been fit, you will likely want to see where the clusters are and which observations belong to which cluster.

predict(<cluster_fit>) predict_raw(<cluster_fit>)
Model predictions
extract_cluster_assignment()
Extract cluster assignments from model
extract_centroids()
Extract cluster centroids from model
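
A short sketch of the prediction helpers, assuming a K-means fit on the built-in `mtcars` data with the "stats" engine (the data set, seed, and cluster count are illustrative):

```r
library(tidyclust)

set.seed(1234)

kmeans_fit <- k_means(num_clusters = 3) |>
  set_engine("stats") |>
  fit(~ ., data = mtcars)

# Assign new observations to the nearest fitted cluster.
preds <- predict(kmeans_fit, new_data = mtcars[1:4, ])

# Cluster membership of the observations used for fitting.
assignments <- extract_cluster_assignment(kmeans_fit)

# One row per cluster with its location in the predictor space.
centroids <- extract_centroids(kmeans_fit)
```

`predict()` works on new data, while `extract_cluster_assignment()` reports the assignments of the training observations.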

Model-based performance metrics

These metrics use the fitted clustering model to extract values denoting how well the model works.

cluster_metric_set()
Combine metric functions
silhouette_avg() silhouette_avg_vec()
Measures average silhouette across all observations
sse_ratio() sse_ratio_vec()
Compute the ratio of the WSS to the total SSE
sse_total() sse_total_vec()
Compute the total sum of squares
sse_within_total() sse_within_total_vec()
Compute the sum of within-cluster SSE
silhouette()
Measures silhouette between clusters
sse_within()
Calculates Sum of Squared Error in each cluster
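
As a rough illustration of the sum-of-squares metrics, the sketch below computes them for a K-means fit. It assumes the built-in `mtcars` data and the "stats" engine; the silhouette metrics are left out because they additionally need distance information.

```r
library(tidyclust)

set.seed(1234)

kmeans_fit <- k_means(num_clusters = 3) |>
  set_engine("stats") |>
  fit(~ ., data = mtcars)

# Each metric function returns a one-row tibble ...
sse_within_total(kmeans_fit)
sse_total(kmeans_fit)
sse_ratio(kmeans_fit)

# ... and each *_vec() variant returns the bare numeric value.
wss <- sse_within_total_vec(kmeans_fit)
tss <- sse_total_vec(kmeans_fit)
ratio <- sse_ratio_vec(kmeans_fit)

# Several metrics can be bundled into one function.
my_metrics <- cluster_metric_set(sse_within_total, sse_total, sse_ratio)
metric_tbl <- my_metrics(kmeans_fit)
```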

Tuning

Functions to allow multiple cluster specifications to be fit at once.

control_cluster() print(<control_cluster>)
Control the fit function
update(<db_clust>) update(<gm_clust>) update(<hier_clust>) update(<k_means>) update(<mean_shift>)
Update a cluster specification
finalize_model_tidyclust() finalize_workflow_tidyclust() (deprecated)
Splice final parameters into objects
tune_cluster()
Model tuning via grid search
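
A hedged sketch of grid search over the number of clusters follows. It assumes the recipes, rsample, workflows, and tune packages are installed; the data set, grid values, seed, and number of folds are illustrative choices.

```r
library(tidyclust)
library(recipes)
library(rsample)
library(workflows)
library(tune)

# Mark num_clusters as a parameter to be tuned.
kmeans_spec <- k_means(num_clusters = tune()) |>
  set_engine("stats")

# Normalize predictors so no single variable dominates the distances.
rec_spec <- recipe(~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors())

wflow <- workflow() |>
  add_recipe(rec_spec) |>
  add_model(kmeans_spec)

# Candidate values for the tuned parameter.
grid <- data.frame(num_clusters = 1:3)

set.seed(4400)
folds <- vfold_cv(mtcars, v = 2)

# Fit every candidate on every resample and compute cluster metrics.
res <- tune_cluster(wflow, resamples = folds, grid = grid)

metrics <- collect_metrics(res)
```

The collected metrics can then be compared across values of `num_clusters`, for example to look for an elbow in the within-cluster sum of squares.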

Tuning Objects

Dials parameter objects for tuning cluster specifications.

bandwidth()
Bandwidth
cut_height()
Cut Height
circular() zero_covariance() shared_orientation() shared_shape() shared_size()
Gaussian mixture covariance structure parameters
linkage_method() values_linkage_method
The agglomeration linkage method
min_points()
Minimum number of points
radius()
Radius

Developer tools

contr_one_hot()
One-hot contrast matrix
extract_fit_summary()
S3 method to get fitted model summary info depending on engine
get_centroid_dists()
Computes distance from observations to centroids
new_cluster_metric()
Construct a new clustering metric function
prep_data_dist()
Prepares data and distance matrices for metric calculation
reconcile_clusterings_mapping()
Relabels clusters to match another cluster assignment
translate_tidyclust()
Resolve a Model Specification for a Computational Engine
min_grid(<cluster_spec>)
Determine the minimum set of model fits