Module sampling

Sampling strategies for text generation.

Supports temperature scaling, top-k filtering, top-p (nucleus) filtering, and repetition penalty. The Sampler converts a logit vector into a single token ID using these strategies in order:

  1. Temperature scaling — divide logits by temperature (0 = greedy argmax)
  2. Top-k — keep only the k highest-scoring candidates (probabilities are not computed until the softmax step)
  3. Softmax — convert scaled logits to probabilities
  4. Top-p — keep the smallest set of tokens whose cumulative probability exceeds p
  5. Weighted random selection — sample from the filtered distribution
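
The five steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the crate's actual `Sampler` implementation: the function name `sample_sketch`, its parameter list, and the convention that `top_k == 0` disables top-k are all hypothetical, and the random draw is passed in as `u` (a uniform value in `[0, 1)`) so the sketch needs no external crates. The repetition penalty is left out because the list above does not say where it is applied.

```rust
/// Hypothetical helper illustrating the documented pipeline.
/// Not part of this module's API; `u` is a caller-supplied uniform draw in [0, 1).
fn sample_sketch(logits: &[f32], temperature: f32, top_k: usize, top_p: f32, u: f32) -> usize {
    assert!(!logits.is_empty(), "logits must not be empty");

    // Temperature 0 means greedy argmax: return the first maximal logit.
    if temperature <= 0.0 {
        let mut best = 0;
        for (i, &l) in logits.iter().enumerate() {
            if l > logits[best] {
                best = i;
            }
        }
        return best;
    }

    // 1. Temperature scaling: divide every logit by the temperature.
    let mut cands: Vec<(usize, f32)> = logits
        .iter()
        .enumerate()
        .map(|(i, &l)| (i, l / temperature))
        .collect();

    // 2. Top-k: sort by scaled logit (descending) and keep the first k.
    //    Assumption: top_k == 0 disables the filter.
    cands.sort_by(|a, b| b.1.total_cmp(&a.1));
    if top_k > 0 && top_k < cands.len() {
        cands.truncate(top_k);
    }

    // 3. Softmax over the survivors, subtracting the max for numerical stability.
    let max = cands[0].1;
    let mut probs: Vec<f32> = cands.iter().map(|&(_, l)| (l - max).exp()).collect();
    let sum: f32 = probs.iter().sum();
    for p in probs.iter_mut() {
        *p /= sum;
    }

    // 4. Top-p: keep the smallest prefix whose cumulative probability exceeds p.
    //    If the total never exceeds p (e.g. p = 1.0), everything is kept.
    let mut cum = 0.0;
    let mut keep = probs.len();
    for (i, &p) in probs.iter().enumerate() {
        cum += p;
        if cum > top_p {
            keep = i + 1;
            break;
        }
    }
    probs.truncate(keep);

    // 5. Weighted random selection: walk the kept probabilities until the
    //    scaled draw falls inside one of them.
    let total: f32 = probs.iter().sum();
    let mut threshold = u * total;
    for (i, &p) in probs.iter().enumerate() {
        if threshold < p {
            return cands[i].0;
        }
        threshold -= p;
    }
    // Fallback for floating-point rounding: return the last kept candidate.
    cands[keep - 1].0
}
```

In this sketch, a temperature of 0 or `top_k = 1` always yields the highest-logit token, while a smaller `top_p` narrows the candidate set before the weighted draw. Taking `u` as a parameter keeps the function deterministic and easy to test; a real sampler would presumably draw it from a random number generator.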

Structs

Sampler
Token sampler.
SamplingParams
Sampling parameters.