ringkernel-ir
Unified Intermediate Representation for RingKernel GPU code generation.
Overview
ringkernel-ir provides a unified IR that serves as the foundation for multi-backend GPU code generation. The IR is SSA-based and captures GPU-specific operations, enabling optimization passes before lowering to target backends (CUDA, WGSL, MSL).
Architecture
Rust DSL (syn::ItemFn)
│
▼
┌─────────┐
│ IrBuilder │ ← Construct IR from Rust AST
└────┬────┘
│
▼
┌─────────┐
│ IrModule │ ← SSA-based representation
└────┬────┘
│
▼
┌─────────────┐
│ PassManager │ ← Optimization passes
└────┬────────┘
│
├──────────────┬──────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ CUDA │ │ WGSL │ │ MSL │
│ Lowering│ │ Lowering│ │ Lowering│
└─────────┘ └─────────┘ └─────────┘
Installation
Add to your Cargo.toml:
[]
= "0.2"
Usage
Building IR
use ;
let mut builder = new;
// Define parameters
let x = builder.parameter;
let y = builder.parameter;
let a = builder.parameter;
let n = builder.parameter;
// Get thread index
let idx = builder.thread_id;
// Bounds check
let in_bounds = builder.lt;
builder.if_then;
let module = builder.build;
Optimization Passes
use ;
// Apply default optimizations
let optimized = optimize;
// Or use PassManager for fine-grained control
let mut pm = new;
pm.add_pass;
pm.add_pass;
let result = pm.run;
Available passes:
- ConstantFolding - Evaluate compile-time constant expressions
- DeadCodeElimination - Remove unused values
- DeadBlockElimination - Remove unreachable basic blocks
- AlgebraicSimplification - Simplify arithmetic expressions (x * 1 → x, x + 0 → x)
Lowering to Backends
use ;
// Lower to CUDA C
let cuda_code = lower_to_cuda?;
// Lower to WGSL
let wgsl_code = lower_to_wgsl?;
// Lower to Metal Shading Language
let msl_code = lower_to_msl?;
Backend Capabilities
Check what features each backend supports:
use ;
let caps = cuda;
if caps.has else
Validation
use ;
let validator = new;
let result = validator.validate;
if !result.is_valid
Pretty Printing
use IrPrinter;
let ir_text = new.print;
println!;
Output:
define kernel saxpy(x: ptr<f32>, y: ptr<f32>, a: f32, n: i32) {
bb0:
%0 = thread_id.x
%1 = lt %0, n
br_if %1, bb1, bb2
bb1:
%2 = load x[%0]
%3 = load y[%0]
%4 = mul a, %2
%5 = add %4, %3
store y[%0], %5
br bb2
bb2:
return
}
IR Node Types
Values
- Constant - Literal values (integers, floats, booleans)
- Parameter - Function parameters
- BinaryOp - Arithmetic and comparison operations
- UnaryOp - Negation, bitwise complement
- Load/Store - Memory access operations
- Cast - Type conversions
GPU-Specific
- ThreadId - Thread index (X, Y, Z dimensions)
- BlockId - Block index
- BlockDim - Block size
- GridDim - Grid size
- SyncThreads - Block-level barrier
- Atomic - Atomic memory operations
- SharedMemory - Shared memory allocation
Control Flow
- Branch - Unconditional branch
- BranchIf - Conditional branch
- Return - Function return
Features
Enable optional features in Cargo.toml:
[]
= { = "0.2", = ["validation", "optimization"] }
- validation - Enable detailed IR validation
- optimization - Enable IR optimization passes
License
Licensed under Apache-2.0 OR MIT.