pub fn sampling_ptx() -> &'static str
Returns the PTX assembly for greedy sampling (argmax reduction).
The kernel uses a parallel reduction to find the index of the maximum logit.
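
For orientation, the reduction pattern can be sketched on the CPU. This is a minimal illustration, not the kernel itself: the function name `argmax_tree_reduction`, the padding with negative infinity, and the lowest-index tie-break are all assumptions made for the example, and the actual PTX may differ on each of these points (for instance in how it handles ties or NaN values).

```rust
/// CPU sketch of a tree-style argmax reduction.
/// Hypothetical helper for illustration; it is not part of the crate's API.
fn argmax_tree_reduction(logits: &[f32]) -> Option<usize> {
    if logits.is_empty() {
        return None;
    }

    // Assumption: pad to the next power of two with negative infinity so
    // that every reduction step halves the active range cleanly.
    let n = logits.len().next_power_of_two();
    let mut slots: Vec<(f32, usize)> = (0..n)
        .map(|i| (logits.get(i).copied().unwrap_or(f32::NEG_INFINITY), i))
        .collect();

    // Each pass plays the role of one synchronized step on the GPU:
    // "thread" i compares its slot with the slot `stride` positions away.
    let mut stride = n / 2;
    while stride > 0 {
        for i in 0..stride {
            let (a, b) = (slots[i], slots[i + stride]);
            // Keep the larger value; on a tie keep the smaller index
            // (assumed tie-break). NaN inputs are not handled specially.
            if b.0 > a.0 || (b.0 == a.0 && b.1 < a.1) {
                slots[i] = b;
            }
        }
        stride /= 2;
    }

    // Slot 0 now holds the winning (value, index) pair.
    Some(slots[0].1)
}
```

Calling `argmax_tree_reduction(&[0.1, 2.5, -1.0, 2.5, 0.3])` yields `Some(1)` under the tie-break assumed here. Carrying the index alongside the value through every step is what lets the reduction return a token index rather than only the maximum logit.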