morok-schedule 0.1.0-alpha.2

Optimization passes and pattern engine for the Morok ML compiler
Documentation

morok-schedule

Pattern matching engine and RANGEIFY transformation for kernel generation.

Example

use morok_schedule::{patterns, graph_rewrite};

let matcher = patterns! {
    // Identity folding
    Add[x, @zero] ~> x,
    Mul[x, @one] ~> x,

    // Constant folding (fallible)
    Add(a @const(a_val), b @const(b_val))
      => eval_add(a_val, b_val).map(|r| UOp::const_(a.dtype(), r)),

    // Self-folding
    Neg(Neg(x)) ~> Rc::clone(x),
};
let optimized = graph_rewrite(&matcher, graph, &mut ());

Features

Supported:

  • Declarative patterns! DSL for rewrite rules
  • Fixed-point graph rewriting
  • RANGEIFY: movement ops -> BUFFERIZE+INDEX -> kernels
  • Symbolic simplification (17+ pattern categories)
  • Heuristic-based kernel optimization

Planned:

  • BEAM search optimization
  • Multi-device kernel splitting

Optimizations

Optimization Status Description
Symbolic Simplification
Constant folding Fold unary/binary/ternary ops on constants
Identity elimination x+0, x*1, x/1, x
Zero propagation x*0, x&0 → 0
Self-folding x//x→1, x&x→x, x
ALU folding (x+c1)+c2 → x+(c1+c2)
Term combining x+x→2x, (c1x)+(c2*x)→(c1+c2)*x
Division distribution (a+b)//c → a//c + b//c when exact
Dead code elimination WHERE(true,t,f)→t, dead loops
Kernel Optimization
Tensor cores (TC) WMMA for matmul patterns
Upcasting Vectorization (float4, etc.)
Loop unrolling Unroll reductions ≤32
Local memory GPU shared memory allocation
Grouped reduction 2-stage GROUP/GROUPTOP
Matvec optimization Specialized MV pattern
CPU threading Multi-threaded execution
Axis reordering SWAP for memory access
RANGEIFY
Movement op removal RESHAPE, PERMUTE, EXPAND, etc.
Buffer folding Remove noop BUFFERIZE
Dead axis removal Eliminate unused dimensions
Range flattening Flatten nested RANGE ops
Reduce simplification Optimize reduction patterns
Kernel splitting Split by STORE operations
Reduce splitting 2-stage large reductions
Buffer cost analysis PContig cost model
Planned
BEAM search Exhaustive optimization search
Image float4 Image type vectorization
Multi-device Ring allreduce, multi-rank
Kernel caching Compile cache
PADTO Axis padding for alignment

Testing

cargo test -p morok-schedule
cargo test -p morok-schedule --features z3  # with Z3 verification

Property-Based Testing

Uses proptest to verify algebraic properties across randomly generated expression trees (1000+ cases per property):

Category Properties
Identity x+0→x, x*1→x, x-0→x, x/1→x, x
Zero x*0→0, x&0→0, 0/x→0
Self-fold x/x→1, x&x→x, x
Structural Idempotence, dtype preservation, cost bounds

Z3 Verification

Uses Z3 SMT solver to formally prove semantic equivalence of rewrites. Z3 is a theorem prover from Microsoft Research that can verify mathematical properties by constraint solving.

How it works:

  1. Convert original and simplified UOps to Z3 expressions
  2. Assert NOT(original == simplified)
  3. If UNSAT → rewrite is proven correct
  4. If SAT → found counterexample (bug)
cargo test -p morok-schedule --features z3

Resources: