Global Reduction Primitives
This module provides traits and types for GPU-accelerated reduction operations. Reductions aggregate values across all GPU threads using operations such as sum, min, and max.
§Use Cases
- PageRank: Sum dangling node contributions across all nodes
- Graph algorithms: Compute convergence metrics, global norms
- Scientific computing: Vector norms, dot products, energy calculations
§Architecture
Reductions use a hierarchical approach for efficiency:
- Warp-level: Use shuffle instructions for fast intra-warp reduction
- Block-level: Tree reduction in shared memory with __syncthreads()
- Grid-level: Atomic accumulation from block leaders, then broadcast (see the sketch below)
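The following is a minimal CUDA sketch of these three levels for a sum reduction. It is illustrative only and is not this crate's generated kernel code; the function and variable names are hypothetical, and the final broadcast of the accumulated total back to all threads (for example, in a follow-up launch) is omitted.

// Hypothetical, simplified sum-reduction kernel (not this crate's code).
__device__ float warp_reduce_sum(float v) {
    // Warp-level: shuffle values down until lane 0 holds the warp's sum.
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffff, v, offset);
    return v;
}

__global__ void grid_sum(const float* input, float* global_acc, int n) {
    __shared__ float warp_sums[32];                  // one slot per warp in the block
    int idx  = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = threadIdx.x % 32;
    int warp = threadIdx.x / 32;

    float v = (idx < n) ? input[idx] : 0.0f;
    v = warp_reduce_sum(v);                          // warp level
    if (lane == 0) warp_sums[warp] = v;              // stage per-warp results
    __syncthreads();                                 // block level

    if (warp == 0) {                                 // first warp reduces the staged sums
        int num_warps = (blockDim.x + 31) / 32;
        v = (lane < num_warps) ? warp_sums[lane] : 0.0f;
        v = warp_reduce_sum(v);
        if (lane == 0) atomicAdd(global_acc, v);     // grid level: block leader accumulates
    }
}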
§Example
use ringkernel_core::reduction::{ReductionOp, GlobalReduction};
// In kernel code (DSL):
// Each thread contributes its rank only if its node is dangling (no out-edges).
let my_contrib = if out_degree[idx] == 0 { rank } else { 0.0 };
// Sum contributions across the whole grid and broadcast the total to every thread.
let dangling_sum = reduce_and_broadcast(my_contrib, &accumulator);
let new_rank = base + damping * (incoming + dangling_sum / n);
Structs§
- ReductionConfig - Configuration for reduction operations.
Enums§
- ReductionOp - Reduction operation types.
Traits§
- GlobalReduction - Trait for GPU runtimes that support global reduction operations.
- ReductionHandle - Handle to a reduction buffer for streaming operations.
- ReductionScalar - Trait for scalar types that support reduction operations.