Module sampling

Sampling strategies for text generation.

Supports temperature scaling, top-k filtering, top-p (nucleus) filtering, and repetition penalty. The Sampler converts a logit vector into a single token ID by applying the following steps in order:

1. Temperature scaling: divide logits by the temperature (a temperature of 0 selects greedy argmax instead)
2. Top-k: keep only the k highest-probability candidates
3. Softmax: convert scaled logits to probabilities
4. Top-p: keep the smallest set of tokens whose cumulative probability exceeds p
5. Weighted random selection: sample from the filtered distribution
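
The steps above can be sketched in plain Rust as follows. This is an illustrative sketch only, not the Sampler's actual implementation: the function name `sample_token`, its parameters, and the convention that `top_k == 0` disables top-k filtering are all assumptions made for the example. To stay dependency-free, the caller supplies the random number `u` (uniform in [0, 1)) instead of the function owning a random number generator. Repetition penalty is left out because the list does not say where it is applied.

```rust
/// Hypothetical helper (not this crate's API): picks a token ID from `logits`.
/// `u` is a caller-supplied uniform random number in [0, 1).
/// Assumption: `top_k == 0` means "no top-k filtering".
fn sample_token(
    logits: &[f32],
    temperature: f32,
    top_k: usize,
    top_p: f32,
    u: f32,
) -> Option<usize> {
    if logits.is_empty() {
        return None;
    }

    // Pair each logit with its token ID and sort by logit, highest first.
    let mut cands: Vec<(usize, f32)> = logits.iter().copied().enumerate().collect();
    cands.sort_by(|a, b| b.1.total_cmp(&a.1));

    // Step 1: temperature. Zero (or below) short-circuits to greedy argmax.
    if temperature <= 0.0 {
        return Some(cands[0].0);
    }
    for c in cands.iter_mut() {
        c.1 /= temperature;
    }

    // Step 2: top-k. Keep only the k best candidates.
    if top_k > 0 {
        cands.truncate(top_k);
    }

    // Step 3: softmax. Subtract the maximum first for numerical stability.
    let max = cands[0].1;
    let mut probs: Vec<f32> = cands.iter().map(|c| (c.1 - max).exp()).collect();
    let sum: f32 = probs.iter().sum();
    for p in probs.iter_mut() {
        *p /= sum;
    }

    // Step 4: top-p. Keep the smallest prefix whose cumulative probability
    // exceeds `top_p`; if none does, keep every candidate.
    let mut keep = probs.len();
    let mut cumulative = 0.0f32;
    for (i, p) in probs.iter().enumerate() {
        cumulative += *p;
        if cumulative > top_p {
            keep = i + 1;
            break;
        }
    }
    probs.truncate(keep);

    // Step 5: weighted random selection over the surviving candidates.
    let total: f32 = probs.iter().sum();
    let mut threshold = u * total;
    for (i, p) in probs.iter().enumerate() {
        if threshold < *p {
            return Some(cands[i].0);
        }
        threshold -= *p;
    }

    // Fallback for floating-point rounding: return the last kept candidate.
    Some(cands[keep - 1].0)
}
```

With a temperature of 0 the call returns the index of the largest logit regardless of `u`; with `top_k` set to 1 the result is the same, because only one candidate survives filtering. Passing `u` explicitly also makes the selection step deterministic in tests.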

Structs

Sampler: Token sampler.
SamplingParams: Sampling parameters.
