er_cluster() groups the vectors of an embedding into clusters using one of three standard algorithms: hierarchical clustering, DBSCAN, or Louvain community detection.

er_cluster(
  embedding,
  method = "hclust",
  k = NULL,
  eps = NULL,
  metric = "arccos",
  ...,
  verbose = FALSE
)

Arguments

embedding

a numeric matrix containing a text embedding.

method

a character string specifying the clustering method. One of c("hclust","dbscan","louvain"). Default is "hclust".

k

an integer specifying the number of clusters for method = "hclust".

eps

a numeric specifying the within-cluster point distance for method = "dbscan".

metric

a character string specifying the similarity function used for method = "hclust" and method = "louvain". Default is "arccos".
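As an illustration of what an "arccos" similarity might compute, here is a minimal base R sketch. It assumes "arccos" means angular similarity, i.e. one minus the angle between two vectors divided by pi. That reading is an assumption and is not confirmed by this page. The toy matrix emb and all variable names are hypothetical.

```r
# Toy embedding: 4 rows (texts) x 3 dimensions; values are made up.
emb <- rbind(
  a = c(1, 0, 0),
  b = c(0.9, 0.1, 0),
  c = c(0, 1, 0),
  d = c(0, 0.9, 0.1)
)

# Cosine similarity between all pairs of rows.
unit <- emb / sqrt(rowSums(emb^2))   # scale each row to unit length
cos_sim <- tcrossprod(unit)          # unit %*% t(unit)

# Clamp to [-1, 1] so floating-point overshoot cannot produce NaN in acos().
cos_sim <- pmin(pmax(cos_sim, -1), 1)

# ASSUMPTION: "arccos" similarity = 1 - angle / pi
# (1 for identical directions, 0 for opposite directions).
arccos_sim <- 1 - acos(cos_sim) / pi

# Hierarchical clustering expects distances, so convert similarity to distance.
hc <- hclust(as.dist(1 - arccos_sim))
labels <- cutree(hc, k = 2)
```

In this toy data, rows a and b point in nearly the same direction, as do c and d, so a two-cluster cut separates the two pairs.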

...

further arguments passed on to the clustering methods. Can be used, e.g., to specify the linkage criterion in hierarchical clustering (see hclust), the minimum number of points in DBSCAN clustering (see dbscan), or the resolution in Louvain clustering (see cluster_louvain).
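To show why the linkage criterion matters, the following sketch calls stats::hclust directly on made-up data. It does not show how er_cluster forwards the argument, which this page does not specify. Only the effect of the linkage choice on the resulting tree is illustrated.

```r
# Five made-up points on a line and their pairwise distances.
x <- c(0, 1, 2.5, 6, 7)
d <- dist(x)

# Same distances, two different linkage criteria.
hc_single   <- hclust(d, method = "single")    # nearest-neighbour linkage
hc_complete <- hclust(d, method = "complete")  # farthest-neighbour linkage

# The merge heights differ between the two trees.
hc_single$height
hc_complete$height

# Cutting both trees into two clusters.
cl_single   <- cutree(hc_single, k = 2)
cl_complete <- cutree(hc_complete, k = 2)
```

With well-separated data like this, both criteria yield the same two groups. On noisier embeddings the choice can change the partition, so it is worth comparing.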

verbose

a logical specifying whether to show messages. Default is FALSE.

Value

The input embedding matrix with a new attribute "cluster" attached.
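Attributes on a matrix are read with attr(). The sketch below uses a hand-built stand-in for the returned object, so it runs without the package. It assumes the "cluster" attribute holds one label per row, which this page does not state explicitly. The labels themselves are made up.

```r
# Stand-in for a clustered embedding: a 4 x 2 matrix with row names.
emb <- matrix(
  c(1, 0.9, 0, 0,
    0, 0.1, 1, 0.9),
  nrow = 4,
  dimnames = list(c("a", "b", "c", "d"), NULL)
)

# ASSUMPTION: one cluster label per row; these labels are hypothetical.
attr(emb, "cluster") <- c(1L, 1L, 2L, 2L)

# Read the labels back from the attribute.
clusters <- attr(emb, "cluster")

# Cluster sizes, and the row names belonging to each cluster.
sizes   <- table(clusters)
members <- split(rownames(emb), clusters)
```

Because the labels travel as an attribute, the object remains an ordinary numeric matrix and can still be passed to functions that expect one.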

References

Wulff, D. U., Aeschbach, S., Hussain, Z., & Mata, R. (2024). embeddeR. In preparation.

Examples

if (FALSE) {
# add clustering to embedding
embedding <- er_cluster(embedding)
}