
Module sampling

forgellm_runtime


Token sampling strategies for autoregressive generation.

Supports greedy, top-k, top-p (nucleus), and temperature-scaled sampling.
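As a rough illustration of how these strategies usually compose, the sketch below applies temperature scaling, then top-k, then top-p filtering, and finally draws a token by inverse-CDF lookup. The names `filtered_distribution` and `draw` are hypothetical and are not part of `forgellm_runtime`; the order of the filters, the convention that `top_k == 0` disables top-k, and the caller-supplied uniform number `u` are all assumptions. It also assumes non-empty logits and `temperature > 0`.

```rust
/// Hypothetical helper (not the crate's API): turn raw logits into a
/// filtered, renormalized list of (token_id, probability) pairs.
/// Assumes `logits` is non-empty and `temperature > 0`.
fn filtered_distribution(
    logits: &[f32],
    temperature: f32,
    top_k: usize, // assumption: 0 means "no top-k limit"
    top_p: f32,   // assumption: 1.0 means "no nucleus cut"
) -> Vec<(usize, f32)> {
    // Temperature-scaled softmax, shifted by the max logit for stability.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let mut probs: Vec<(usize, f32)> = logits
        .iter()
        .enumerate()
        .map(|(i, &l)| (i, ((l - max) / temperature).exp()))
        .collect();
    let sum: f32 = probs.iter().map(|&(_, p)| p).sum();
    for entry in probs.iter_mut() {
        entry.1 /= sum;
    }

    // Most probable tokens first.
    probs.sort_by(|a, b| b.1.total_cmp(&a.1));

    // Top-k: keep only the k most probable tokens.
    if top_k > 0 && top_k < probs.len() {
        probs.truncate(top_k);
    }

    // Top-p: keep the shortest prefix whose cumulative mass reaches top_p.
    let mut cumulative = 0.0;
    let mut keep = probs.len();
    for (n, &(_, p)) in probs.iter().enumerate() {
        cumulative += p;
        if cumulative >= top_p {
            keep = n + 1;
            break;
        }
    }
    probs.truncate(keep);

    // Renormalize the surviving tokens so they sum to 1.
    let total: f32 = probs.iter().map(|&(_, p)| p).sum();
    for entry in probs.iter_mut() {
        entry.1 /= total;
    }
    probs
}

/// Hypothetical helper: pick a token by inverse-CDF lookup, given a
/// uniform random number `u` in [0, 1) supplied by the caller.
fn draw(dist: &[(usize, f32)], u: f32) -> usize {
    let mut cumulative = 0.0;
    for &(id, p) in dist {
        cumulative += p;
        if u < cumulative {
            return id;
        }
    }
    // Fall back to the last token if rounding left `u` above the total.
    dist.last().map(|&(id, _)| id).expect("distribution must not be empty")
}
```

Taking `u` as a parameter keeps the sketch free of any random-number dependency and makes the draw reproducible; a real caller would pass a value from its own RNG.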

Structs§

SamplingConfig: Sampling configuration.

Functions§

apply_repetition_penalty: Apply a repetition penalty to the logits of previously generated tokens.
argmax: Greedy sampling: return the token with the highest logit.
sample: Sample a token ID from logits using the given config.
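
The signatures of these functions are not shown on this page, so the following is only a sketch of the underlying techniques under stated assumptions. `penalize_repeats` and `greedy_pick` are hypothetical stand-ins, not the crate's functions. The penalty rule used here (divide positive logits by the penalty, multiply non-positive ones, once per distinct token) is one widely used formulation and may differ from what `apply_repetition_penalty` does.

```rust
/// Hypothetical stand-in for a repetition penalty (the crate's rule may
/// differ). Each distinct previously generated token is penalized once:
/// positive logits are divided by `penalty`, others are multiplied by it,
/// so a penalty above 1.0 always makes the token less likely.
fn penalize_repeats(logits: &mut [f32], previous: &[u32], penalty: f32) {
    // Deduplicate so a token seen several times is penalized only once.
    let mut seen: Vec<u32> = previous.to_vec();
    seen.sort_unstable();
    seen.dedup();

    for tok in seen {
        // Token IDs outside the vocabulary are silently ignored.
        if let Some(logit) = logits.get_mut(tok as usize) {
            *logit = if *logit > 0.0 {
                *logit / penalty
            } else {
                *logit * penalty
            };
        }
    }
}

/// Hypothetical stand-in for greedy selection: index of the largest logit,
/// or `None` for an empty slice. On exact ties the last maximum wins,
/// because that is how `Iterator::max_by` breaks ties.
fn greedy_pick(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(index, _)| index)
}
```

In a decoding loop the penalty would typically be applied to the logits first, and the token then chosen from the adjusted values.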