er_infer_labels infers category labels using generative large language models.
er_infer_labels(
labels,
api = "huggingface",
model = NULL,
role = "assistant",
instruct = NULL,
system = NULL,
verbose = FALSE
)
labels: a list of character vectors.
api: a character string specifying the API. One of c("huggingface", "openai", "cohere"). Default is "huggingface".
model: a character string specifying the model label. Must match the model names on the corresponding APIs; see huggingface.co/models and platform.openai.com/docs/models/embeddings. Defaults to "meta-llama/Llama-2-70b-chat-hf" for api = "huggingface" and to "gpt-4" for api = "openai".
role: a character string specifying the system's role, inserted in place of {role} in the general system instruction to the model. Default is "assistant".
instruct: a character string specifying the instruction for the model. Must contain the placeholder "{examples}". Default is "Generate a specific and accurate one or two word category label that captures the common meaning of the following examples: {examples}. Place '@' before and after the category label."
system: a character string specifying the general system instruction to the model. Default is "You are a helpful {role} who provides short, specific, and accurate category labels."
verbose: a logical specifying whether to show messages.
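The placeholders "{role}" and "{examples}" and the '@' markers in the default instruction suggest a fill-then-parse pattern. The following base R sketch illustrates that pattern under stated assumptions: fill_placeholder() and extract_label() are hypothetical helpers written for this example, and the package's actual internal handling may differ.

```r
# Illustrative only: the package's internal prompt handling is assumed, not confirmed.
system_template   <- "You are a helpful {role} who provides short, specific, and accurate category labels."
instruct_template <- paste0(
  "Generate a specific and accurate one or two word category label that ",
  "captures the common meaning of the following examples: {examples}. ",
  "Place '@' before and after the category label."
)

# Hypothetical helper: substitute a literal placeholder such as "{role}".
fill_placeholder <- function(template, name, value) {
  gsub(paste0("{", name, "}"), value, template, fixed = TRUE)
}

examples <- c("apple", "banana", "cherry")
system_prompt   <- fill_placeholder(system_template, "role", "assistant")
instruct_prompt <- fill_placeholder(instruct_template, "examples",
                                    paste(examples, collapse = ", "))

# Hypothetical helper: pull the text between the first pair of '@' markers.
extract_label <- function(response) {
  hit <- regmatches(response, regexpr("@[^@]+@", response))
  if (length(hit) == 0) return(NA_character_)
  trimws(gsub("@", "", hit, fixed = TRUE))
}

extract_label("Sure! The label is @Fruit@.")  # "Fruit"
```

If a custom instruction drops the '@' request, a parser of this kind would have nothing to anchor on, which is one reason to keep that sentence when rewriting the instruction.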
The function returns a character vector of category labels.
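A minimal usage sketch follows. It assumes the package providing er_infer_labels is attached and that API credentials are already configured; the example groups are made up for illustration, and the expectation of one label per list element is an assumption rather than something stated above.

```r
# Input: one character vector of example texts per category to be labelled.
groups <- list(
  c("apple", "banana", "cherry"),
  c("hammer", "screwdriver", "wrench")
)

# Guarded so the API is only called when the function is available.
# Note that the call itself may incur costs.
if (exists("er_infer_labels", mode = "function")) {
  inferred <- er_infer_labels(groups, api = "openai", model = "gpt-4")
  print(inferred)  # character vector of category labels
}
```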
The models recommended for label inferences, including the default models, are not free for use and using them can result in significant costs. Costs will depend on the size of input texts and the number of labels inferred. The default Hugging Face model, meta-llama/Llama-2-70b-chat-hf, requires a PRO subscription at a monthly price. The OpenAI models, including the default gpt-4 model, incur costs based on the number of tokens in the input and output.
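Before sending a large list to a paid API, it can help to gauge the input size. The sketch below is a rough estimate only: estimate_input_tokens() is a hypothetical helper, it assumes one request per list element, it uses the common rule of thumb of about four characters per token for English text, and prompt_overhead_chars is a placeholder for the length of the instruction text. Actual token counts and prices should be checked with the provider.

```r
groups <- list(
  c("apple", "banana", "cherry"),
  c("hammer", "screwdriver", "wrench")
)

# Hypothetical helper. Assumptions: one request per list element, and roughly
# four characters per token (a rule of thumb for English text, not exact).
estimate_input_tokens <- function(labels, prompt_overhead_chars = 300,
                                  chars_per_token = 4) {
  chars <- vapply(
    labels,
    function(x) sum(nchar(x)) + 2 * (length(x) - 1),  # examples joined by ", "
    numeric(1)
  )
  ceiling((chars + prompt_overhead_chars) / chars_per_token)
}

tokens <- estimate_input_tokens(groups)
length(groups)  # assumed number of requests
sum(tokens)     # rough total of input tokens
```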
To obtain the best possible labels, it is recommended to adjust the prompt arguments role, system, and instruct.
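As an illustration of adjusting the prompt arguments, the sketch below tailors all three to questionnaire items. The wording of the prompts and the example items are invented for this example. It keeps the "{role}" and "{examples}" placeholders and the '@' request from the defaults, on the assumption that the function relies on them.

```r
custom_role     <- "cognitive psychologist"
custom_system   <- "You are a helpful {role} who names groups of questionnaire items."
custom_instruct <- paste0(
  "Give a one or two word label for the construct shared by these items: ",
  "{examples}. Place '@' before and after the category label."
)

# instruct must contain the "{examples}" placeholder; check before any paid call.
stopifnot(grepl("{examples}", custom_instruct, fixed = TRUE))

items <- list(c("I worry a lot", "I get nervous easily", "I often feel tense"))

# Guarded so the API is only called when the function is available.
if (exists("er_infer_labels", mode = "function")) {
  er_infer_labels(items, role = custom_role, system = custom_system,
                  instruct = custom_instruct)
}
```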
Wulff, D. U., Aeschbach, S., Hussain, Z., & Mata, R. (2024). embeddeR. PsyArXiv.