Token sampling strategies for autoregressive generation.
After the model produces logits (raw scores) over the vocabulary, we need to select the next token. This module provides:
- Greedy decoding: Always pick the highest-probability token.
- Temperature scaling: Control randomness by dividing logits by the temperature before the softmax. Values below 1 sharpen the distribution; values above 1 flatten it.
- Top-p (nucleus) sampling: Sample from the smallest set of tokens whose cumulative probability exceeds p, preventing unlikely tokens from being chosen.
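The three strategies above can be sketched in a few standalone functions. This is a minimal illustration under stated assumptions, not this module's actual implementation: the function names `greedy`, `softmax_with_temperature`, and `sample_top_p` are hypothetical, and the random draw is passed in as a parameter `u` in [0, 1) instead of coming from a seeded generator, so the sketch needs no external crates.

```rust
/// Greedy decoding: index of the largest logit (argmax).
/// Returns `None` for an empty slice. (Hypothetical helper, not the crate's API.)
pub fn greedy(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

/// Softmax of `logits / temperature`. Subtracting the maximum logit first
/// keeps `exp` from overflowing without changing the result.
pub fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    assert!(temperature > 0.0, "temperature must be positive");
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits
        .iter()
        .map(|&l| ((l - max) / temperature).exp())
        .collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// Top-p (nucleus) sampling. `u` is assumed to be a uniform draw in [0, 1)
/// supplied by the caller; a real sampler would take it from its own RNG.
pub fn sample_top_p(logits: &[f32], temperature: f32, top_p: f32, u: f32) -> Option<usize> {
    let probs = softmax_with_temperature(logits, temperature);

    // Token indices sorted by descending probability.
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    // Smallest prefix whose cumulative probability exceeds `top_p`.
    // If the threshold is never exceeded, the whole vocabulary is kept.
    let mut cumulative = 0.0f32;
    let mut cutoff = order.len();
    for (rank, &idx) in order.iter().enumerate() {
        cumulative += probs[idx];
        if cumulative > top_p {
            cutoff = rank + 1;
            break;
        }
    }
    let nucleus = &order[..cutoff];

    // Scale the draw by the nucleus mass (equivalent to renormalizing the
    // nucleus), then walk the cumulative distribution until it is used up.
    let mass: f32 = nucleus.iter().map(|&i| probs[i]).sum();
    let mut remaining = u * mass;
    for &idx in nucleus {
        remaining -= probs[idx];
        if remaining < 0.0 {
            return Some(idx);
        }
    }
    // Fallback for rounding error; `None` only when `logits` is empty.
    nucleus.last().copied()
}
```

With logits `[2.0, 1.0, 0.0]`, temperature 1.0, and p = 0.9, the two most likely tokens already carry about 0.91 of the probability mass, so the third token falls outside the nucleus and is never returned, whatever the value of `u`.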
§Example
let mut sampler = Sampler::new(0.7, 0.9, 42);
let next_token = sampler.sample(&mut logits);
Structs§
- Sampler: Configurable token sampler with temperature and top-p support.