pub fn generate_inline_grid_reduce(value_expr: &str, shared_array: &str, accumulator: &str, ty: &str, block_size: u32, op: &ReductionOp) -> String
Generates the source text for an inline grid reduction with atomic accumulation and returns it as a String.
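The source does not show the function body or the exact text it emits, so the following is only a minimal sketch of what a generator with this signature might produce. It assumes the target is CUDA-style kernel code, that `block_size` is a power of two, that `accumulator` is a pointer expression, and that each block reduces in shared memory before a single thread applies one atomic update. The `ReductionOp` enum, its `Sum` and `Max` variants, and the name `sketch_inline_grid_reduce` are hypothetical stand-ins, not the crate's actual definitions.

```rust
/// Hypothetical stand-in for the crate's `ReductionOp`; the real variants
/// are not shown in the source.
#[derive(Debug, Clone, Copy)]
pub enum ReductionOp {
    Sum,
    Max,
}

/// Illustrative sketch only (not the real implementation): emits CUDA-style
/// text for a shared-memory tree reduction followed by one atomic update
/// per block.
pub fn sketch_inline_grid_reduce(
    value_expr: &str,
    shared_array: &str,
    accumulator: &str,
    ty: &str,
    block_size: u32,
    op: &ReductionOp,
) -> String {
    let sa = shared_array;

    // Assumed mapping from the operation to the pairwise combine expression.
    let combine = match op {
        ReductionOp::Sum => format!("{sa}[threadIdx.x] + {sa}[threadIdx.x + s]"),
        ReductionOp::Max => format!("max({sa}[threadIdx.x], {sa}[threadIdx.x + s])"),
    };

    // Assumed mapping to a CUDA atomic intrinsic. Note that CUDA's
    // `atomicMax` is defined for integer types only.
    let atomic = match op {
        ReductionOp::Sum => "atomicAdd",
        ReductionOp::Max => "atomicMax",
    };

    let mut out = String::new();
    // Each thread stores its own value into shared memory.
    out.push_str(&format!("{sa}[threadIdx.x] = ({ty})({value_expr});\n"));
    out.push_str("__syncthreads();\n");
    // Tree reduction within the block, halving the stride each step.
    out.push_str(&format!(
        "for (unsigned int s = {}; s > 0; s >>= 1) {{\n",
        block_size / 2
    ));
    out.push_str(&format!(
        "    if (threadIdx.x < s) {{ {sa}[threadIdx.x] = {combine}; }}\n"
    ));
    out.push_str("    __syncthreads();\n");
    out.push_str("}\n");
    // Thread 0 folds the block's partial result into the global accumulator.
    out.push_str(&format!(
        "if (threadIdx.x == 0) {{ {atomic}({accumulator}, {sa}[0]); }}\n"
    ));
    out
}
```

Under these assumptions, a call such as `sketch_inline_grid_reduce("x[i] * x[i]", "sdata", "result", "float", 256, &ReductionOp::Sum)` returns a snippet that could be spliced into a kernel body. The atomic step is what lets partial results from many blocks combine without a second kernel launch.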