pub fn sample_token(
    logits: &[f32],
    temperature: f32,
    top_k: usize,
    _repetition_penalty: f32,
) -> u32
Sample a token from logits with temperature, top-k, and repetition penalty. The sampling steps match the qwen3-tts-rs reference:
- temperature scaling
- top-k filter (keep top_k, rest = -inf)
- top-p filter (keep the smallest set whose cumulative probability > top_p, rest = -inf)
- softmax over filtered logits
- multinomial sample from distribution
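
The steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's actual implementation: `sample_token_sketch` and `softmax` are hypothetical names, `top_p` is passed explicitly because the signature shown here has no such parameter, and the random draw `u` (uniform in [0, 1)) is supplied by the caller so the sketch needs no RNG crate. Treating `top_k == 0` as "no top-k filter" is also an assumption, and the repetition penalty is left out because the list does not describe it.

```rust
/// Numerically stable softmax; entries equal to -inf map to probability 0.
fn softmax(x: &[f32]) -> Vec<f32> {
    let max = x.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = x.iter().map(|&v| (v - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// Hypothetical sketch of the pipeline: temperature, top-k, top-p,
/// softmax, multinomial sample. `u` is a uniform draw in [0, 1).
pub fn sample_token_sketch(
    logits: &[f32],
    temperature: f32,
    top_k: usize,
    top_p: f32,
    u: f32,
) -> u32 {
    assert!(!logits.is_empty(), "logits must not be empty");

    // 1. Temperature scaling (clamped to avoid division by zero).
    let t = temperature.max(1e-6);
    let mut scaled: Vec<f32> = logits.iter().map(|&l| l / t).collect();

    // Token indices sorted by descending scaled logit.
    let mut order: Vec<usize> = (0..scaled.len()).collect();
    order.sort_by(|&a, &b| scaled[b].total_cmp(&scaled[a]));

    // 2. Top-k: everything past the k largest logits becomes -inf.
    //    Assumption: top_k == 0 disables the filter.
    if top_k > 0 && top_k < order.len() {
        for &i in &order[top_k..] {
            scaled[i] = f32::NEG_INFINITY;
        }
    }

    // 3. Top-p: keep the smallest prefix whose cumulative probability
    //    exceeds top_p; the rest becomes -inf.
    if top_p < 1.0 {
        let probs = softmax(&scaled);
        let mut cum = 0.0;
        let mut cutoff = order.len();
        for (rank, &i) in order.iter().enumerate() {
            cum += probs[i];
            if cum > top_p {
                cutoff = rank + 1;
                break;
            }
        }
        for &i in &order[cutoff..] {
            scaled[i] = f32::NEG_INFINITY;
        }
    }

    // 4. Softmax over the filtered logits.
    let probs = softmax(&scaled);

    // 5. Multinomial sample via the inverse CDF, walking the sorted order.
    let mut cum = 0.0;
    let mut last_kept = order[0];
    for &i in &order {
        if probs[i] > 0.0 {
            last_kept = i;
        }
        cum += probs[i];
        if u < cum {
            return i as u32;
        }
    }
    // Rounding can leave the total slightly below 1; fall back to the
    // last token that kept nonzero probability.
    last_kept as u32
}
```

In a real caller, `u` would come from a random number generator. Passing it in keeps the sketch deterministic, so the same logits and the same `u` always yield the same token.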