rlx-flow
Block assembly-line API for RLX model builders — no HIR/Graph in model code unless you opt into tier 2.
Three tiers
| Tier | You write | Example |
|---|---|---|
| 0 | `ModelFlow` + small blocks | `.token_embed().layer("L0", \|s\| s.rms_norm(...))` |
| 1 | `CompileProfile` / `*.rlx.toml` | fusion, precision, passes |
| 2 | `.custom()` + `rlx_flow::escape::Emit` | novel subgraphs; promote to blocks when stable |
Model author quick start
use *;
use ;
// Generic causal LM skeleton
let flow = new
.profile_prefill
.input
.rope_tables
.zero_beta
.token_embed
.repeat_layers
.final_norm
.lm_head;
let built = flow.build?;
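
To make the assembly-line idea concrete, here is a minimal, self-contained sketch of a chained builder that records blocks in call order and composes layers through a closure. `ToyFlow`, `ToyLayer`, their methods, and the `build` checks are hypothetical stand-ins written for this illustration; they are not the rlx-flow API and say nothing about how the real builder validates or lowers a flow.

```rust
// Toy stand-ins for illustration only; these are NOT the rlx-flow types.

/// Collects the blocks of one layer (hypothetical analogue of a layer stack).
#[derive(Default)]
struct ToyLayer {
    blocks: Vec<String>,
}

impl ToyLayer {
    fn rms_norm(mut self, key: &str) -> Self {
        self.blocks.push(format!("rms_norm({key})"));
        self
    }

    fn self_attn(mut self) -> Self {
        self.blocks.push("self_attn".to_string());
        self
    }
}

/// Records blocks in call order (hypothetical analogue of a model flow).
#[derive(Default)]
struct ToyFlow {
    blocks: Vec<String>,
}

impl ToyFlow {
    fn new() -> Self {
        Self::default()
    }

    fn push(mut self, name: &str) -> Self {
        self.blocks.push(name.to_string());
        self
    }

    fn token_embed(self) -> Self {
        self.push("token_embed")
    }

    fn final_norm(self) -> Self {
        self.push("final_norm")
    }

    fn lm_head(self) -> Self {
        self.push("lm_head")
    }

    /// Compose one named layer from small blocks via a closure.
    fn layer(mut self, name: &str, f: impl FnOnce(ToyLayer) -> ToyLayer) -> Self {
        for block in f(ToyLayer::default()).blocks {
            self.blocks.push(format!("{name}.{block}"));
        }
        self
    }

    /// Apply the same layer recipe `n` times, naming the layers L0, L1, ...
    fn repeat_layers(mut self, n: usize, f: impl Fn(ToyLayer) -> ToyLayer) -> Self {
        for i in 0..n {
            self = self.layer(&format!("L{i}"), &f);
        }
        self
    }

    /// Assumed validation for this sketch: embed first, LM head last.
    fn build(self) -> Result<Vec<String>, String> {
        if self.blocks.first().map(String::as_str) != Some("token_embed") {
            return Err("flow must start with token_embed".to_string());
        }
        if self.blocks.last().map(String::as_str) != Some("lm_head") {
            return Err("flow must end with lm_head".to_string());
        }
        Ok(self.blocks)
    }
}

fn main() -> Result<(), String> {
    let built = ToyFlow::new()
        .token_embed()
        .repeat_layers(2, |s| s.rms_norm("attn_norm").self_attn())
        .final_norm()
        .lm_head()
        .build()?;
    for block in &built {
        println!("{block}");
    }
    Ok(())
}
```

Each method takes `self` by value and returns it, so the chain reads top to bottom in execution order, and the per-layer closure keeps graph types out of model code.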
Small composable blocks
| Block | DSL | Purpose |
|---|---|---|
| `token_embed` | `.token_embed()` | HF embedding table |
| `rms_norm` | `.rms_norm(key, eps)` | pre-norm |
| `layer_norm` | `.layer_norm(gamma, beta, eps)` | BERT-style LayerNorm |
| `gelu_ffn` | `.gelu_ffn(layer_prefix)` | BERT GELU FFN |
| `bert_encoder_layer` | `.bert_encoder_layer(spec)` / `.repeat_bert_layers(...)` | full BERT encoder layer |
| `nomic_encoder_layer` | `.repeat_nomic_layers(...)` | NomicBERT RoPE + SwiGLU layer |
| `gather_add` | `.gather_add(input, weight)` | add position/type embeddings |
| `linear` | `.linear(key, transpose)` | matmul |
| `residual_save` / `residual_add` | `.residual_save()` … `.residual_add()` | skip connections |
| `self_attn_prefill` | `.self_attn_prefill(spec)` | QKV + RoPE + GQA + causal |
| `swiglu` | `.swiglu_hf_mlp(prefix)` | SwiGLU FFN |
| `LayerStack` | `.layer("L0", \|s\| s....)` | named sub-layer composer |
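
For readers unfamiliar with the math behind the `rms_norm` and `residual_save` / `residual_add` rows, the following sketch shows the standard RMSNorm formula and a pre-norm skip connection on plain `f32` slices. The function names and signatures are illustrative assumptions; the real blocks operate on graph values rather than slices, and their implementation is not shown in this README.

```rust
/// Standard RMSNorm: y_i = x_i / sqrt(mean(x^2) + eps) * gamma_i.
fn rms_norm(x: &[f32], gamma: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(x.len(), gamma.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(gamma).map(|(v, g)| v * inv_rms * g).collect()
}

/// Pre-norm skip connection: out = x + f(rms_norm(x)).
/// `f` stands in for whatever sub-block runs between save and add.
fn pre_norm_residual(
    x: &[f32],
    gamma: &[f32],
    eps: f32,
    f: impl Fn(&[f32]) -> Vec<f32>,
) -> Vec<f32> {
    let saved = x.to_vec(); // plays the role of residual_save
    let h = f(&rms_norm(x, gamma, eps));
    assert_eq!(h.len(), saved.len());
    saved.iter().zip(&h).map(|(a, b)| a + b).collect() // plays the role of residual_add
}

fn main() {
    let x = [3.0_f32, 4.0];
    let gamma = [1.0_f32, 1.0];
    let normed = rms_norm(&x, &gamma, 1e-6);
    // Identity sub-block: the output is x plus its normalized copy.
    let out = pre_norm_residual(&x, &gamma, 1e-6, |h| h.to_vec());
    println!("normed = {normed:?}");
    println!("out    = {out:?}");
}
```

With unit `gamma`, the normalized vector has a mean square of approximately 1, which is the property the pre-norm placement relies on before attention or the FFN.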
Fused fast path: llama_prefill_layer_fused(i, spec) → one HIR composite.
Composable path: llama_prefill_layer_composed(i, spec) → small blocks above.
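
The difference between the two paths can be pictured with a toy graph: the composed path emits one node per small block, while the fused path emits a single composite that wraps the same operations. The `Node` enum, the `layer_composed` / `layer_fused` helpers, and the op sequence in `LAYER_OPS` are assumptions made for this sketch (the op list is a typical pre-norm decoder layer built from block names in the table); the actual HIR types and the contents of the fused composite may differ.

```rust
// Toy graph nodes for illustration; the real HIR types are not shown here.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Block(String),
    Composite { name: String, ops: Vec<String> },
}

// Assumed op sequence for one pre-norm decoder layer.
const LAYER_OPS: [&str; 8] = [
    "residual_save",
    "rms_norm",
    "self_attn_prefill",
    "residual_add",
    "residual_save",
    "rms_norm",
    "swiglu",
    "residual_add",
];

/// Composed path: one graph node per small block.
fn layer_composed(i: usize) -> Vec<Node> {
    LAYER_OPS
        .iter()
        .map(|op| Node::Block(format!("L{i}.{op}")))
        .collect()
}

/// Fused path: a single composite node wrapping the same ops.
fn layer_fused(i: usize) -> Vec<Node> {
    vec![Node::Composite {
        name: format!("L{i}.fused"),
        ops: LAYER_OPS.iter().map(|op| format!("L{i}.{op}")).collect(),
    }]
}

/// Expand composites so both paths can be compared op by op.
fn flatten(nodes: &[Node]) -> Vec<String> {
    nodes
        .iter()
        .flat_map(|n| match n {
            Node::Block(op) => vec![op.clone()],
            Node::Composite { ops, .. } => ops.clone(),
        })
        .collect()
}

fn main() {
    let composed = layer_composed(0);
    let fused = layer_fused(0);
    println!("composed nodes: {}", composed.len());
    println!("fused nodes:    {}", fused.len());
    println!("fused node:     {:?}", fused[0]);
    println!("same ops:       {}", flatten(&composed) == flatten(&fused));
}
```

In this picture the composed form is the one to reach for when a single block needs to be swapped out, since every block is an addressable node.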
LLaMA 3.2 (recommended)
// Llama32Flow lives in the model-builders repo (see root README).
for_prefill
.last_token_logits
.profile_near
.build?;
Customize one layer without IR:
for_prefill
.layer
.build?;
See DESIGN.md.