Sampling strategies for generative model token selection.
Provides greedy decoding, temperature sampling, top-k, top-p (nucleus), and a configurable sampler combining all of the above with repetition penalty.
Structs§
- ConfigurableSampler - A sampler that combines temperature scaling, top-k filtering, top-p (nucleus) filtering, and repetition penalty into a single configurable pipeline.
- GreedyDecoder - Always selects the token with the highest logit (argmax decoding).
- SampledToken - The result of sampling a single token.
- SamplingConfig - Configuration for the ConfigurableSampler.
- TemperatureSampler - Samples from a softmax distribution after dividing logits by temperature.
- TopKSampler - Zeroes out all logits except the top-k, then applies temperature sampling.
- TopPSampler - Nucleus (top-p) sampler: keeps the smallest set of tokens whose cumulative probability is at least p, then samples from that nucleus.
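
The strategies listed above can be sketched with a few free functions. This is a minimal illustration under stated assumptions, not this crate's implementation: every function name and signature below is hypothetical, the top-k step masks discarded logits with negative infinity so they receive zero probability after softmax, and the random draw `u` is passed in by the caller so the example needs no external crate. The repetition penalty is left out because the page does not describe its rule.

```rust
/// Greedy (argmax) decoding: index of the largest logit, or None if empty.
fn argmax_sketch(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

/// Softmax of `logits / temperature`; assumes `temperature > 0`.
/// The max logit is subtracted first for numerical stability.
fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits
        .iter()
        .map(|&x| ((x - max) / temperature).exp())
        .collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Top-k: keep the k largest logits and mask the rest with -inf
/// (an assumption of this sketch; masked tokens get probability 0).
fn top_k_mask(logits: &[f32], k: usize) -> Vec<f32> {
    let mut order: Vec<usize> = (0..logits.len()).collect();
    order.sort_by(|&a, &b| logits[b].total_cmp(&logits[a])); // descending
    let mut out = vec![f32::NEG_INFINITY; logits.len()];
    for &i in order.iter().take(k) {
        out[i] = logits[i];
    }
    out
}

/// Top-p (nucleus): keep the smallest set of tokens whose cumulative
/// probability is at least `p`, zero the rest, and renormalize.
fn top_p_filter(probs: &[f32], p: f32) -> Vec<f32> {
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a])); // descending
    let mut out: Vec<f32> = vec![0.0; probs.len()];
    let mut cumulative: f32 = 0.0;
    for &i in &order {
        out[i] = probs[i];
        cumulative += probs[i];
        if cumulative >= p {
            break;
        }
    }
    let total: f32 = out.iter().sum();
    out.iter().map(|&x| x / total).collect()
}

/// Inverse-CDF sampling: `u` is a uniform draw in [0, 1) supplied by the caller.
/// Falls back to the last token with nonzero probability if rounding leaves
/// the cumulative sum slightly below `u`.
fn sample_index(probs: &[f32], u: f32) -> usize {
    let mut cumulative: f32 = 0.0;
    let mut last_nonzero = 0;
    for (i, &p) in probs.iter().enumerate() {
        if p > 0.0 {
            last_nonzero = i;
            cumulative += p;
            if u < cumulative {
                return i;
            }
        }
    }
    last_nonzero
}
```

A combined pipeline would typically chain these as `top_k_mask`, then `softmax_with_temperature`, then `top_p_filter`, then `sample_index`. The order of the filtering stages is a design choice, and the order used by ConfigurableSampler is not stated on this page.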
Enums§
- SamplingError - Errors that can occur during sampling operations.
Functions§
- entropy - Shannon entropy of a probability distribution (in nats).
- log_softmax - Compute log-softmax: log(softmax(x)) with the log-sum-exp trick.
- perplexity - Perplexity: exp(mean negative log-prob) over a sequence of log-probabilities.
- softmax - Compute softmax with the log-sum-exp trick for numerical stability.
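
The four helpers above correspond to standard formulas, which can be sketched as follows. The `_sketch` names and the `&[f32]` signatures are assumptions made for illustration, since this page does not show the crate's actual signatures. Returning NaN for an empty perplexity input is also a choice of this sketch.

```rust
/// Softmax with the max logit subtracted before exponentiating,
/// so large logits do not overflow.
fn softmax_sketch(logits: &[f32]) -> Vec<f32> {
    if logits.is_empty() {
        return Vec::new();
    }
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// log_softmax(x)_i = x_i - max - ln(sum_j exp(x_j - max)).
fn log_softmax_sketch(logits: &[f32]) -> Vec<f32> {
    if logits.is_empty() {
        return Vec::new();
    }
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let log_sum = logits.iter().map(|&x| (x - max).exp()).sum::<f32>().ln();
    logits.iter().map(|&x| x - max - log_sum).collect()
}

/// Shannon entropy in nats: -sum_i p_i * ln(p_i).
/// Zero-probability terms are skipped, following the convention 0 * ln 0 = 0.
fn entropy_sketch(probs: &[f32]) -> f32 {
    -probs
        .iter()
        .filter(|&&p| p > 0.0)
        .map(|&p| p * p.ln())
        .sum::<f32>()
}

/// Perplexity = exp(-(1/N) * sum_i log_prob_i).
/// Returns NaN for an empty slice (an assumption of this sketch).
fn perplexity_sketch(log_probs: &[f32]) -> f32 {
    if log_probs.is_empty() {
        return f32::NAN;
    }
    let mean = log_probs.iter().sum::<f32>() / log_probs.len() as f32;
    (-mean).exp()
}
```

As a sanity check on the definitions, a uniform distribution over four tokens has entropy ln 4, and a sequence in which every token has probability 0.25 has perplexity 4.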