Run PaCMAP dimensionality reduction

pacmap(
  embedding,
  n_components = 2,
  n_neighbors = 10,
  MN_ratio = 0.5,
  FP_ratio = 2,
  distance = "euclidean",
  lr = 1,
  num_iters = 450,
  verbose = FALSE,
  apply_pca = TRUE
)

Arguments

embedding

a numeric matrix containing a text embedding.

n_components

an integer specifying the number of dimensions of the embedded space. Default is 2.

n_neighbors

an integer specifying the number of neighbors considered when forming nearest-neighbor pairs, which are used for local structure preservation. Default is 10.

MN_ratio

a numeric specifying the ratio of mid-near pairs to nearest-neighbor pairs (e.g. n_neighbors=10, MN_ratio=0.5 means 5 mid-near pairs). Mid-near pairs are used for global structure preservation. Default is 0.5.

FP_ratio

a numeric specifying the ratio of further pairs to nearest-neighbor pairs (e.g. n_neighbors=10, FP_ratio=2 means 20 further pairs). Further pairs are used for both local and global structure preservation. Default is 2.

distance

a character string specifying the distance metric. One of c("euclidean", "manhattan", "angular", "hamming"). Default is "euclidean".

lr

a numeric specifying the learning rate of the Adam optimizer for embedding. Default is 1.

num_iters

an integer specifying the number of iterations for the optimization of embedding. Values greater than 250 are recommended. Default is 450.

verbose

a logical specifying whether to show messages during initialization and fitting. Default is FALSE.

apply_pca

a logical specifying whether to apply PCA on the data before pair construction. Default is TRUE.
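
The MN_ratio and FP_ratio arguments are both expressed relative to n_neighbors, so the pair counts they imply follow from a simple multiplication. The sketch below reproduces the arithmetic from the argument descriptions in base R. The helper name pair_counts is hypothetical and not part of this package, and rounding the product to the nearest integer is an assumption about how non-integer products are handled.

```r
# Hypothetical helper (not part of the package): pair counts implied by
# n_neighbors, MN_ratio and FP_ratio.
pair_counts <- function(n_neighbors = 10, MN_ratio = 0.5, FP_ratio = 2) {
  c(
    nearest_neighbor = as.integer(n_neighbors),
    # Assumption: non-integer products are rounded to the nearest integer
    mid_near = as.integer(round(n_neighbors * MN_ratio)),
    further  = as.integer(round(n_neighbors * FP_ratio))
  )
}

counts <- pair_counts()  # the documented defaults
print(counts)            # mid_near is 5 and further is 20, as in the examples above
```

Raising n_neighbors while keeping both ratios fixed scales all three pair counts together, which is presumably why the arguments are given as ratios rather than absolute counts.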

Value

The function returns a matrix containing the projected coordinates of each embedding vector. The matrix has nrow(embedding) rows and n_components columns.

Details

This function wraps the PaCMAP Python module found at github.com/YingfanWang/PaCMAP. It is adapted from github.com/milescsmith/ReductionWrappers.

PaCMAP (Pairwise Controlled Manifold Approximation) maps a high-dimensional dataset to a low-dimensional embedding. For details, see jmlr.org/papers/volume22/20-1061/20-1061.pdf.
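
Examples

A minimal usage sketch, assuming the package that provides pacmap() is loaded and its Python dependency is installed. The random matrix is only a stand-in for a real text embedding, and the call is guarded so that the snippet does nothing when pacmap() is unavailable.

```r
set.seed(42)

# Stand-in for a text embedding: 200 items, 64 dimensions of random values
embedding <- matrix(rnorm(200 * 64), nrow = 200, ncol = 64)

# Only run the reduction when pacmap() is actually available in the session
if (exists("pacmap", mode = "function")) {
  coords <- pacmap(
    embedding,
    n_components = 2,
    n_neighbors = 10,
    distance = "euclidean",
    num_iters = 450
  )

  # Per the Value section: one row per input vector, n_components columns
  stopifnot(nrow(coords) == nrow(embedding), ncol(coords) == 2)
  print(head(coords))
}
```

Because the optimization runs in Python, set.seed() in R may not make the resulting coordinates reproducible.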