Loop fusion optimization pass.
This module provides optimization passes that fuse multiple loops/reductions over the same axes to improve cache locality and reduce memory traffic.
§Overview
Loop fusion combines multiple consecutive operations that iterate over the same axis into a single fused operation. This optimization:
- Reduces memory traffic (fewer intermediate tensors)
- Improves cache locality (better temporal locality)
- Reduces loop overhead (the shared axis is traversed once instead of once per operation)
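The effect described above can be illustrated with plain Rust slices. This is a minimal sketch of the general idea, not the code this pass generates; the function names `scale_then_sum_unfused` and `scale_then_sum_fused` are hypothetical and exist only for this illustration.

```rust
/// Unfused: two loops over the same axis with an intermediate buffer between them.
/// (Hypothetical helper for illustration; not part of this crate.)
fn scale_then_sum_unfused(xs: &[f64], factor: f64) -> f64 {
    // First loop: materialize the intermediate tensor.
    let scaled: Vec<f64> = xs.iter().map(|x| x * factor).collect();
    // Second loop: reduce over the same axis.
    let mut total = 0.0;
    for v in &scaled {
        total += v;
    }
    total
}

/// Fused: a single loop that scales and accumulates in one pass,
/// so no intermediate buffer is allocated.
/// (Hypothetical helper for illustration; not part of this crate.)
fn scale_then_sum_fused(xs: &[f64], factor: f64) -> f64 {
    let mut total = 0.0;
    for x in xs {
        total += x * factor;
    }
    total
}
```

Both functions compute the same value; the fused version reads each element once and never writes the `scaled` buffer, which is where the memory-traffic and locality gains come from.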
§Fusion Criteria
Two loops can be fused if:
- They iterate over the same axis/axes
- They have compatible domains
- There are no dependencies that prevent fusion
- The fused operation doesn’t exceed memory constraints
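The four criteria can be read as a conjunction of independent checks. The sketch below is one possible way to express them and makes several assumptions: `LoopSummary`, its fields, and `can_fuse` are hypothetical names invented for this illustration, "compatible domains" is modeled as equal extents, and dependencies are reduced to a single flag. The pass itself may use a different model.

```rust
/// Hypothetical summary of one loop nest (illustrative only, not a crate type).
struct LoopSummary {
    /// (axis label, extent) pairs in iteration order.
    axes: Vec<(char, usize)>,
    /// True when this loop needs the complete output of the preceding loop
    /// (for example a finished reduction) before it can start.
    needs_complete_predecessor: bool,
    /// Estimated number of bytes live while the loop body runs.
    working_set_bytes: usize,
}

/// Returns true if `second` may be fused into `first` under the simplified
/// model above. `memory_budget_bytes` is an assumed, caller-supplied limit.
fn can_fuse(first: &LoopSummary, second: &LoopSummary, memory_budget_bytes: usize) -> bool {
    // Criterion 1: both loops iterate over the same axes, in the same order.
    let same_axes = first.axes.len() == second.axes.len()
        && first.axes.iter().zip(&second.axes).all(|(a, b)| a.0 == b.0);

    // Criterion 2: compatible domains, modeled here as equal extents per axis.
    let same_extents = same_axes
        && first.axes.iter().zip(&second.axes).all(|(a, b)| a.1 == b.1);

    // Criterion 3: no dependency that forces the first loop to finish first.
    let no_blocking_dependency = !second.needs_complete_predecessor;

    // Criterion 4: the combined working set stays within the memory budget.
    let fits_in_memory = first
        .working_set_bytes
        .saturating_add(second.working_set_bytes)
        <= memory_budget_bytes;

    same_axes && same_extents && no_blocking_dependency && fits_in_memory
}
```

For example, two loops over axis `i` with extent 128 and 1 KiB working sets each would pass with a 4 KiB budget and fail with a 1 KiB budget under this model.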
§Examples
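use tensorlogic_compiler::passes::fuse_loops;
use tensorlogic_ir::EinsumGraph;

// Start from a new einsum graph.
let graph = EinsumGraph::new();
// Run the fusion pass; it returns the fused graph together with statistics.
let (fused_graph, stats) = fuse_loops(&graph);
Structs§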
- LoopFusionConfig - Configuration for loop fusion optimization.
- LoopFusionStats - Statistics from loop fusion optimization.
Functions§
- estimate_fusion_benefit - Estimate the benefit of fusing a group of nodes.
- fuse_loops - Fuse loops in an einsum graph.
- fuse_loops_with_config - Fuse loops with custom configuration.