morok-schedule
Pattern matching engine and RANGEIFY transformation for kernel generation.
Example
use ;
let matcher = patterns! ;
let optimized = graph_rewrite;
Features
Supported:
- Declarative
patterns!DSL for rewrite rules - Fixed-point graph rewriting
- RANGEIFY: movement ops -> BUFFERIZE+INDEX -> kernels
- Symbolic simplification (17+ pattern categories)
- Heuristic-based kernel optimization
Planned:
- BEAM search optimization
- Multi-device kernel splitting
Optimizations
| Optimization | Status | Description |
|---|---|---|
| Symbolic Simplification | ||
| Constant folding | ✅ | Fold unary/binary/ternary ops on constants |
| Identity elimination | ✅ | x+0, x*1, x/1, x |
| Zero propagation | ✅ | x*0, x&0 → 0 |
| Self-folding | ✅ | x//x→1, x&x→x, x |
| ALU folding | ✅ | (x+c1)+c2 → x+(c1+c2) |
| Term combining | ✅ | x+x→2x, (c1x)+(c2*x)→(c1+c2)*x |
| Division distribution | ✅ | (a+b)//c → a//c + b//c when exact |
| Dead code elimination | ✅ | WHERE(true,t,f)→t, dead loops |
| Kernel Optimization | ||
| Tensor cores (TC) | ✅ | WMMA for matmul patterns |
| Upcasting | ✅ | Vectorization (float4, etc.) |
| Loop unrolling | ✅ | Unroll reductions ≤32 |
| Local memory | ✅ | GPU shared memory allocation |
| Grouped reduction | ✅ | 2-stage GROUP/GROUPTOP |
| Matvec optimization | ✅ | Specialized MV pattern |
| CPU threading | ✅ | Multi-threaded execution |
| Axis reordering | ✅ | SWAP for memory access |
| RANGEIFY | ||
| Movement op removal | ✅ | RESHAPE, PERMUTE, EXPAND, etc. |
| Buffer folding | ✅ | Remove noop BUFFERIZE |
| Dead axis removal | ✅ | Eliminate unused dimensions |
| Range flattening | ✅ | Flatten nested RANGE ops |
| Reduce simplification | ✅ | Optimize reduction patterns |
| Kernel splitting | ✅ | Split by STORE operations |
| Reduce splitting | ✅ | 2-stage large reductions |
| Buffer cost analysis | ✅ | PContig cost model |
| Planned | ||
| BEAM search | ❌ | Exhaustive optimization search |
| Image float4 | ❌ | Image type vectorization |
| Multi-device | ❌ | Ring allreduce, multi-rank |
| Kernel caching | ❌ | Compile cache |
| PADTO | ❌ | Axis padding for alignment |
Testing
Property-Based Testing
Uses proptest to verify algebraic properties across randomly generated expression trees (1000+ cases per property):
| Category | Properties |
|---|---|
| Identity | x+0→x, x*1→x, x-0→x, x/1→x, x |
| Zero | x*0→0, x&0→0, 0/x→0 |
| Self-fold | x/x→1, x&x→x, x |
| Structural | Idempotence, dtype preservation, cost bounds |
Z3 Verification
Uses Z3 SMT solver to formally prove semantic equivalence of rewrites. Z3 is a theorem prover from Microsoft Research that can verify mathematical properties by constraint solving.
How it works:
- Convert original and simplified UOps to Z3 expressions
- Assert
NOT(original == simplified) - If UNSAT → rewrite is proven correct
- If SAT → found counterexample (bug)
Resources: