Sampling strategies for text generation.
Supports temperature scaling, top-k filtering, top-p (nucleus) filtering,
and repetition penalty. The Sampler converts a logit vector into a
single token ID by applying the following steps in order:
- Temperature scaling — divide logits by temperature (0 = greedy argmax)
- Top-k — keep only the k highest-probability candidates
- Softmax — convert scaled logits to probabilities
- Top-p — keep the smallest set of tokens whose cumulative probability exceeds p
- Weighted random selection — sample from the filtered distribution
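
The steps listed above can be sketched in plain Rust as follows. This is a minimal illustration and not the crate's actual implementation: `SketchParams`, its field names, the free function `sample`, and the convention that `top_k = 0` or `top_p = 1.0` disables a filter are all assumptions made for the example. The random draw `u` is passed in by the caller so the sketch needs no external crate, and repetition penalty is left out because the list above does not say where it is applied.

```rust
/// Hypothetical stand-in for the sampling parameters (names are assumptions).
pub struct SketchParams {
    pub temperature: f32,
    pub top_k: usize, // assumed: 0 disables top-k
    pub top_p: f32,   // assumed: 1.0 disables top-p
}

/// Pick a token ID from `logits`. `u` is a uniform draw in [0, 1).
pub fn sample(logits: &[f32], params: &SketchParams, u: f32) -> usize {
    assert!(!logits.is_empty(), "logits must not be empty");

    // Temperature 0 means greedy argmax: skip sampling entirely.
    if params.temperature <= 0.0 {
        return argmax(logits);
    }

    // 1. Temperature scaling: divide every logit by the temperature.
    let mut cands: Vec<(usize, f32)> = logits
        .iter()
        .enumerate()
        .map(|(id, &logit)| (id, logit / params.temperature))
        .collect();

    // Sort candidates from highest to lowest score (stable, so ties keep ID order).
    cands.sort_by(|a, b| b.1.total_cmp(&a.1));

    // 2. Top-k: keep only the k best candidates.
    if params.top_k > 0 && params.top_k < cands.len() {
        cands.truncate(params.top_k);
    }

    // 3. Softmax over the survivors (subtract the max for numerical stability).
    let max = cands[0].1;
    let mut sum = 0.0;
    for c in cands.iter_mut() {
        c.1 = (c.1 - max).exp();
        sum += c.1;
    }
    for c in cands.iter_mut() {
        c.1 /= sum;
    }

    // 4. Top-p: keep the smallest prefix whose cumulative probability exceeds p.
    if params.top_p < 1.0 {
        let mut cumulative = 0.0;
        let mut keep = cands.len();
        for (n, c) in cands.iter().enumerate() {
            cumulative += c.1;
            if cumulative > params.top_p {
                keep = n + 1;
                break;
            }
        }
        cands.truncate(keep);
    }

    // 5. Weighted random selection over the remaining (unnormalized) mass.
    let total: f32 = cands.iter().map(|c| c.1).sum();
    let mut target = u * total;
    for c in &cands {
        if target < c.1 {
            return c.0;
        }
        target -= c.1;
    }
    // Fallback for floating-point round-off: return the last candidate.
    cands[cands.len() - 1].0
}

/// Index of the largest logit (first one wins on ties).
fn argmax(logits: &[f32]) -> usize {
    let mut best = 0;
    for (i, &l) in logits.iter().enumerate() {
        if l > logits[best] {
            best = i;
        }
    }
    best
}
```

Sorting once before top-k lets both filters work on a simple prefix of the candidate list, at the cost of an O(n log n) sort per token. A production sampler would more likely use a partial selection for large vocabularies.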
Structs
- Sampler
- Token sampler.
- SamplingParams
- Sampling parameters.