GPU-accelerated argsort (descending) for MoE top-K routing.
Sorts indices by value in descending order using a bitonic sort kernel. For MoE with N <= 128 experts per row, each row's sort fits in a single threadgroup.
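To make the expected result concrete, here is a minimal CPU sketch of a descending argsort built on a bitonic sorting network. It is a reference for the semantics only, not the embedded MSL kernel: the name `argsort_desc_bitonic`, the padding with a `u32::MAX` sentinel, and the tie-break on lower index are all assumptions made for illustration, and inputs are assumed to contain no NaN values.

```rust
/// Hypothetical CPU reference: returns the indices of `values` ordered so that
/// the referenced values are descending. Not part of this module's API.
fn argsort_desc_bitonic(values: &[f32]) -> Vec<u32> {
    let n = values.len();
    // Bitonic networks need a power-of-two length, so pad with sentinels.
    let padded = n.next_power_of_two();
    let mut idx: Vec<u32> = (0..padded)
        .map(|i| if i < n { i as u32 } else { u32::MAX })
        .collect();

    // Sentinel slots behave like negative infinity and therefore sort last.
    let key = |i: u32| {
        if i == u32::MAX {
            f32::NEG_INFINITY
        } else {
            values[i as usize]
        }
    };
    // True when index `a` must precede index `b` in the final order:
    // larger value first, ties broken by the smaller index.
    let before = |a: u32, b: u32| {
        let (ka, kb) = (key(a), key(b));
        ka > kb || (ka == kb && a < b)
    };

    let mut k = 2usize;
    while k <= padded {
        // `k` is the size of the bitonic sequences being merged.
        let mut j = k / 2;
        while j > 0 {
            // `j` is the distance between compared partners.
            for i in 0..padded {
                let l = i ^ j;
                if l > i {
                    let forward = (i & k) == 0;
                    let out_of_order = if forward {
                        before(idx[l], idx[i])
                    } else {
                        before(idx[i], idx[l])
                    };
                    if out_of_order {
                        idx.swap(i, l);
                    }
                }
            }
            j /= 2;
        }
        k *= 2;
    }

    // Drop the sentinel slots, which have all moved to the tail.
    idx.truncate(n);
    idx
}
```

For top-K routing, the first K entries of the returned vector (for example `&order[..2]`) would be the indices of the K highest-scoring experts in that row. On a GPU, each iteration of the inner `for` loop would typically be handled by one thread, with a barrier between `j` steps.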
Statics
- ARGSORT_SHADER_SOURCE - MSL source for the argsort kernel (embedded at compile time).
Functions
- dispatch_argsort_desc_f32 - Dispatch an argsort (descending) operation on the GPU.
- register - Register argsort shader source with the given kernel registry.