Sorting / order-statistics op family — Category O.
Phase 9 of the baracuda-kernels comprehensive plan. Ships the block-bitonic trailblazer family:
- SortPlan / SortBackwardPlan / ArgsortPlan / MsortPlan / MsortBackwardPlan: block-bitonic sort, one CUDA block per row. Trailblazer cap: row_len ≤ 1024. Larger rows are reserved for a future tile-radix follow-up; the Plan returns Unsupported for row_len > 1024.
- TopkPlan / TopkBackwardPlan / KthvaluePlan / KthvalueBackwardPlan: block-bitonic select; trailblazer cap: k ≤ 64 (LLM-inference range).
- UniquePlan / UniqueConsecutivePlan: set-valued, no BW. unique chains sort + consecutive-dedup at the plan layer.
- HistogramPlan / HistogramddPlan / BincountPlan: atomic-bin accumulation; FW only. histogramdd returns Unsupported for ndim > 1 in the trailblazer.
- SearchsortedPlan: per-query binary search; FW only.
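The trailblazer caps listed above can be pictured as shape checks made before any kernel is selected. The following is only a minimal sketch of that idea: `SketchError`, the `SKETCH_*` constants, and the `check_*` functions are hypothetical names introduced for illustration, not the crate's actual error type or API. Only the numeric limits (row_len ≤ 1024, k ≤ 64, ndim ≤ 1) come from the description above.

```rust
// Hypothetical stand-ins; the crate's real constants and error type may differ.
const SKETCH_SORT_MAX_ROW: usize = 1024; // documented cap: row_len <= 1024
const SKETCH_TOPK_MAX_K: usize = 64; // documented cap: k <= 64

#[derive(Debug, PartialEq)]
enum SketchError {
    /// The shape falls outside what the trailblazer kernels cover.
    Unsupported(&'static str),
}

/// Sort-family check: rows longer than the block-bitonic cap are rejected.
fn check_sort_row_len(row_len: usize) -> Result<(), SketchError> {
    if row_len > SKETCH_SORT_MAX_ROW {
        return Err(SketchError::Unsupported("row_len > 1024"));
    }
    Ok(())
}

/// Select-family check: k above the block-bitonic select cap is rejected.
fn check_topk_k(k: usize) -> Result<(), SketchError> {
    if k > SKETCH_TOPK_MAX_K {
        return Err(SketchError::Unsupported("k > 64"));
    }
    Ok(())
}

/// histogramdd check: only the 1-D case is covered in the trailblazer.
fn check_histogramdd_ndim(ndim: usize) -> Result<(), SketchError> {
    if ndim > 1 {
        return Err(SketchError::Unsupported("ndim > 1"));
    }
    Ok(())
}
```

A caller would typically treat `Unsupported` as a signal to fall back to another implementation rather than as a hard failure, though how the real Plans surface this is not specified here.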
Saved-indices contract for sort / msort / topk / kthvalue BW.
The FW emits both sorted values AND sorted indices in a single
launch (FW Args carry values and indices as required outputs).
BW Args receive the saved indices verbatim — no recomputation
at BW time. The Plan’s selector pegs the indices dtype to i32
across every kernel SKU in this family.
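The contract can be illustrated with a CPU reference for a single row. This is a sketch under stated assumptions, not the kernel code: `sort_forward_row` and `sort_backward_row` are hypothetical names, ascending order is assumed, and a stable host-side sort stands in for the block-bitonic kernel. What it does mirror from the text is that the forward pass returns values and i32 indices together, and the backward pass consumes those indices as-is.

```rust
/// Forward: return the sorted values and the i32 source index of each one.
/// Ascending order is an assumption of this sketch.
fn sort_forward_row(x: &[f32]) -> (Vec<f32>, Vec<i32>) {
    let mut indices: Vec<i32> = (0..x.len() as i32).collect();
    // Stable sort of the index list, keyed by the value each index points at.
    indices.sort_by(|&a, &b| x[a as usize].total_cmp(&x[b as usize]));
    let values: Vec<f32> = indices.iter().map(|&i| x[i as usize]).collect();
    (values, indices)
}

/// Backward: scatter dy through the saved indices. Nothing is re-sorted here;
/// sorted position j received x[indices[j]], so dx[indices[j]] = dy[j].
fn sort_backward_row(dy: &[f32], indices: &[i32]) -> Vec<f32> {
    let mut dx = vec![0.0f32; dy.len()];
    for (j, &src) in indices.iter().enumerate() {
        dx[src as usize] = dy[j];
    }
    dx
}
```

For the row `[3.0, 1.0, 2.0]` the forward pass yields values `[1.0, 2.0, 3.0]` with indices `[1, 2, 0]`, and an upstream gradient `[10.0, 20.0, 30.0]` scatters back to `[30.0, 10.0, 20.0]`.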
Re-exports§
pub use argsort::ArgsortArgs;
pub use argsort::ArgsortDescriptor;
pub use argsort::ArgsortPlan;
pub use bincount::BincountArgs;
pub use bincount::BincountDescriptor;
pub use bincount::BincountPlan;
pub use histogram::HistogramArgs;
pub use histogram::HistogramDescriptor;
pub use histogram::HistogramPlan;
pub use histogramdd::HistogramddArgs;
pub use histogramdd::HistogramddDescriptor;
pub use histogramdd::HistogramddPlan;
pub use kthvalue::KthvalueArgs;
pub use kthvalue::KthvalueDescriptor;
pub use kthvalue::KthvaluePlan;
pub use kthvalue_backward::KthvalueBackwardArgs;
pub use kthvalue_backward::KthvalueBackwardDescriptor;
pub use kthvalue_backward::KthvalueBackwardPlan;
pub use msort::MsortArgs;
pub use msort::MsortBackwardArgs;
pub use msort::MsortBackwardDescriptor;
pub use msort::MsortBackwardPlan;
pub use msort::MsortDescriptor;
pub use msort::MsortPlan;
pub use searchsorted::SearchsortedArgs;
pub use searchsorted::SearchsortedDescriptor;
pub use searchsorted::SearchsortedPlan;
pub use sort::SortArgs;
pub use sort::SortDescriptor;
pub use sort::SortPlan;
pub use sort_backward::SortBackwardArgs;
pub use sort_backward::SortBackwardDescriptor;
pub use sort_backward::SortBackwardPlan;
pub use topk::TopkArgs;
pub use topk::TopkDescriptor;
pub use topk::TopkPlan;
pub use topk_backward::TopkBackwardArgs;
pub use topk_backward::TopkBackwardDescriptor;
pub use topk_backward::TopkBackwardPlan;
pub use unique::UniqueArgs;
pub use unique::UniqueDescriptor;
pub use unique::UniquePlan;
pub use unique_consecutive::UniqueConsecutiveArgs;
pub use unique_consecutive::UniqueConsecutiveDescriptor;
pub use unique_consecutive::UniqueConsecutivePlan;
Modules§
- argsort
  argsort plan: sorted indices only (no values output).
- bincount
  bincount plan: count occurrences of each integer in x.
- histogram
  histogram plan: 1-D uniform-bin atomic-accumulating histogram.
- histogramdd
  histogramdd plan: N-D histogram. Reserved for follow-up.
- kthvalue
  kthvalue plan: returns the k-th smallest value + its index along the last dimension.
- kthvalue_backward
  kthvalue_backward plan: scatter the scalar dy[batch] back to the saved-index position in dx[batch, row_len].
- msort
  msort (stable sort) plan + BW.
- searchsorted
  searchsorted plan: per-query binary search in a 1-D sorted array.
- sort
  sort plan: Category O trailblazer.
- sort_backward
  sort_backward plan: scatter dy via the saved indices.
- topk
  topk plan: block-bitonic top-k select.
- topk_backward
  topk_backward plan: scatter k-wide dy into row_len-wide dx via the saved indices. Launcher zeros dx first.
- unique
  unique plan: sort + consecutive-dedup composition.
- unique_consecutive
  unique_consecutive plan: emit one cell per run-start in each row.
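The relationship between unique_consecutive and unique described in the list above can be sketched on the host for one row. This is an illustrative reference only: the function names are hypothetical, i32 elements are assumed for simplicity, and a standard-library sort stands in for the GPU sort stage. It shows "one cell per run-start" and the sort + consecutive-dedup composition, not the crate's plan-layer code.

```rust
/// Keep one element per run: x[i] is a run-start when i == 0 or x[i] != x[i - 1].
fn unique_consecutive_row(x: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for (i, &v) in x.iter().enumerate() {
        if i == 0 || v != x[i - 1] {
            out.push(v);
        }
    }
    out
}

/// unique as a composition: sort first so equal values become adjacent,
/// then apply the consecutive dedup to the sorted row.
fn unique_row(x: &[i32]) -> Vec<i32> {
    let mut sorted = x.to_vec();
    sorted.sort();
    unique_consecutive_row(&sorted)
}
```

On `[1, 1, 2, 2, 1]`, the consecutive variant keeps the trailing `1` because it starts a new run, giving `[1, 2, 1]`, while the composed unique yields `[1, 2]`.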
Constants§
- SORT_MAX_ROW
  Maximum supported row_len in the block-bitonic trailblazer. Must match MAX_ROW in baracuda_sort.cuh.
- TOPK_MAX_K
  Maximum supported k in the block-bitonic topk trailblazer. Must match MAX_K in baracuda_topk.cuh.
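To show how the k cap interacts with the topk forward and backward passes, here is a hedged CPU reference for one row. Everything named here is hypothetical: `SKETCH_TOPK_MAX_K` merely mirrors the documented k ≤ 64 limit, the selection of the largest k values in descending order is an assumption, and returning `None` stands in for whatever the real Plan reports for unsupported shapes.

```rust
// Hypothetical mirror of the documented cap (k <= 64); not the crate's constant.
const SKETCH_TOPK_MAX_K: usize = 64;

/// Forward: the k largest values (descending, by assumption) plus their
/// i32 source indices. Returns None when k is outside the supported range.
fn topk_forward_row(x: &[f32], k: usize) -> Option<(Vec<f32>, Vec<i32>)> {
    if k > SKETCH_TOPK_MAX_K || k > x.len() {
        return None;
    }
    let mut order: Vec<i32> = (0..x.len() as i32).collect();
    // Sort indices by descending value, then keep the first k.
    order.sort_by(|&a, &b| x[b as usize].total_cmp(&x[a as usize]));
    order.truncate(k);
    let values: Vec<f32> = order.iter().map(|&i| x[i as usize]).collect();
    Some((values, order))
}

/// Backward: zero the row_len-wide dx first, then scatter the k-wide dy
/// through the saved indices.
fn topk_backward_row(dy: &[f32], indices: &[i32], row_len: usize) -> Vec<f32> {
    let mut dx = vec![0.0f32; row_len];
    for (&g, &i) in dy.iter().zip(indices) {
        dx[i as usize] = g;
    }
    dx
}
```

Zeroing dx before the scatter matters because only k of the row_len positions receive a gradient; the rest must read as zero.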