Rayon-backed parallel for: par_for(total, grain, |off, cnt| …).
Replaces the old per-worker Condvar pool with Rayon’s work-stealing
scheduler. The (offset, count) chunk API is unchanged, so all existing
call sites (BLAS tiling, SDPA, LayerNorm, …) pick up Rayon without
changes.
Functions
- num_threads: Total Rayon worker count (configured from RuntimeConfig::pool_workers).
- par_for: Parallel for: split total items across threads. f(off, cnt) is called once per chunk with disjoint regions.
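
To make the (offset, count) contract concrete, here is a minimal std-only sketch. It is not this module's implementation: par_for_sketch, collect_chunks, and sum_of_squares are hypothetical names, it spawns one scoped thread per chunk instead of using Rayon's pool, and it assumes grain is the maximum number of items per chunk, which the page above does not state.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for par_for: split 0..total into chunks of at most
/// `grain` items (assumed meaning of `grain`) and call f(off, cnt) once per chunk.
fn par_for_sketch<F>(total: usize, grain: usize, f: F)
where
    F: Fn(usize, usize) + Sync,
{
    let grain = grain.max(1); // guard against a zero grain
    let f = &f; // share one closure across all chunk threads
    thread::scope(|s| {
        let mut off = 0;
        while off < total {
            let cnt = grain.min(total - off);
            // One scoped thread per chunk; a real pool would reuse workers.
            s.spawn(move || f(off, cnt));
            off += cnt;
        }
    }); // all chunk threads are joined here
}

/// Record every (off, cnt) pair the callback receives, sorted by offset.
fn collect_chunks(total: usize, grain: usize) -> Vec<(usize, usize)> {
    let seen = Mutex::new(Vec::new());
    par_for_sketch(total, grain, |off, cnt| {
        seen.lock().unwrap().push((off, cnt));
    });
    let mut chunks = seen.into_inner().unwrap();
    chunks.sort();
    chunks
}

/// Sum i*i for i in 0..n, with each chunk reducing its own disjoint range.
fn sum_of_squares(n: usize) -> u64 {
    let acc = AtomicU64::new(0);
    par_for_sketch(n, 256, |off, cnt| {
        let partial: u64 = (off..off + cnt).map(|i| (i * i) as u64).sum();
        acc.fetch_add(partial, Ordering::Relaxed);
    });
    acc.into_inner()
}
```

Under these assumptions, collect_chunks(10, 4) sees the chunks (0, 4), (4, 4), and (8, 2): the regions are disjoint and together cover 0..10, which is the property call sites rely on when each chunk touches only its own range.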