Module kernel_fusion

Kernel Fusion Engine

Automatically detects and fuses element-wise / pointwise kernel sequences to eliminate intermediate memory allocations and round-trips. This mirrors the kernel fusion passes in TensorRT, XLA, and TVM.
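As a rough illustration of what fusing an element-wise sequence saves, the sketch below compares an unfused pipeline (one pass and one intermediate buffer per op) with a fused single pass. `ElemOp`, `run_unfused`, and `run_fused` are hypothetical stand-ins written for this example; they are not this module's `FusableOp` or `FusedKernel` types, whose definitions are not shown here.

```rust
// Hypothetical stand-in for an element-wise op (NOT the crate's `FusableOp`).
#[derive(Clone, Copy)]
enum ElemOp {
    AddScalar(f32),
    MulScalar(f32),
    Relu,
}

impl ElemOp {
    // Apply the op to a single element.
    fn apply(self, x: f32) -> f32 {
        match self {
            ElemOp::AddScalar(c) => x + c,
            ElemOp::MulScalar(c) => x * c,
            ElemOp::Relu => x.max(0.0),
        }
    }
}

// Unfused: every op makes a full pass over the data and allocates a new
// buffer for its result. Returns the output and the number of buffers used.
fn run_unfused(input: &[f32], ops: &[ElemOp]) -> (Vec<f32>, usize) {
    let mut buf = input.to_vec();
    let mut buffers = 1; // the initial copy of the input
    for op in ops {
        buf = buf.iter().map(|&x| op.apply(x)).collect();
        buffers += 1;
    }
    (buf, buffers)
}

// Fused: a single pass applies the whole op chain to each element, so only
// the output buffer is allocated.
fn run_fused(input: &[f32], ops: &[ElemOp]) -> Vec<f32> {
    input
        .iter()
        .map(|&x| ops.iter().fold(x, |acc, op| op.apply(acc)))
        .collect()
}

fn main() {
    let input = [-2.0_f32, -0.5, 1.0, 3.0];
    let ops = [ElemOp::AddScalar(1.0), ElemOp::MulScalar(2.0), ElemOp::Relu];

    let (unfused, buffers) = run_unfused(&input, &ops);
    let fused = run_fused(&input, &ops);

    // Both strategies compute the same values; only memory traffic differs.
    assert_eq!(unfused, fused);
    println!("{:?} ({} buffers unfused, 1 fused)", fused, buffers);
}
```

On a GPU the same idea would mean launching one kernel instead of three, with intermediates held in registers rather than written to global memory.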

Fusion rules:

  1. Element-wise ops (add, mul, relu, etc.) can always fuse.
  2. Reduction followed by broadcast can fuse (vertical fusion).
  3. Producer-consumer pairs with matching shapes can fuse (horizontal fusion).
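One plausible reading of rule 2 is sketched below, assuming a sum reduction whose result is broadcast back over the input and combined element-wise (here, a division). The function names are hypothetical and are not part of this module's API. The fused version keeps the reduced scalar in a local variable instead of materializing a broadcast buffer.

```rust
// Unfused: reduce, materialize the broadcast result, then combine.
fn normalize_unfused(x: &[f32]) -> Vec<f32> {
    let sum: f32 = x.iter().sum();
    // Intermediate buffer holding the broadcast of the reduction result.
    let broadcast = vec![sum; x.len()];
    x.iter().zip(&broadcast).map(|(a, b)| a / b).collect()
}

// Fused: the reduced value is consumed directly by the element-wise step,
// so the broadcast buffer is never allocated.
fn normalize_fused(x: &[f32]) -> Vec<f32> {
    let sum: f32 = x.iter().sum();
    x.iter().map(|a| a / sum).collect()
}

fn main() {
    let x = [1.0_f32, 2.0, 1.0, 4.0];
    // Same values either way; the fused form skips one allocation.
    assert_eq!(normalize_unfused(&x), normalize_fused(&x));
    println!("{:?}", normalize_fused(&x));
}
```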

Structs

FusedKernel
A fused kernel — a sequence of operations executed as one kernel.
FusionAnalyzer
Fusion analysis engine that detects fusable patterns.
FusionNode
A node in the fusion graph.
FusionResult
Result of fusion analysis.

Enums

BinaryOp
Binary element-wise operations.
FusableOp
An operation that can be part of a fused kernel.
MemOp
Memory operations.
PrecisionType
Precision types for cast operations.
ReduceOp
Reduction operations.
UnaryOp
Unary element-wise operations.