LLVM-backed AOT evaluator for Relon. Phase B production envelope.
This crate is the second slice of the dual-backend strategy:
Cranelift covers the default native AOT route and the LLVM AOT
pipeline here chases Rust-native peak performance for the #main
entry path.
§Scope (Phase B)
- Two entry shapes accepted:
  - Legacy-i64 ((I64...) -> I64) for from_ir_direct callers (tests, bench fixtures): the Phase A bootstrap envelope, retained for cross-backend comparison.
  - Buffer-protocol ((*state, i32, i32, i32, i32, i64) -> i32) for from_source callers. Matches the cranelift backend’s EntryShape::BufferProtocol so the runtime envelopes line up.
- Source-driven pipeline (from_source): parse + analyze + lower (relon_ir::lower_workspace_single) + LLVM emit + JIT compile + per-call arena dispatch. The cmp_lua W1 / W2 workloads (list.sum(range(n)) / list.sum(range(n).map(…))) go end-to-end through this path.
- The op set started with what lower_workspace_single synthesised for the W1 / W2 shape after the IR’s range_pipeline peephole collapsed range.map.sum into a single accumulator loop: LocalGet, ConstI64 / ConstI32 / ConstBool, LetGet / LetSet, LoadField / StoreField (scalar slots), Add / Sub / Mul / Div / Mod / BitAnd (I32 + I64), Eq / Ne / Lt / Le / Gt / Ge, structured control flow (Block / Loop / Br / BrIf / If), and Return. The emitter has since widened into strings, lists, pointer-indirect fields, host calls, closures, object emission, wasm32 object emission, and several stdlib surfaces; the tests are the authoritative coverage map for those later slices.
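As a rough illustration of the range_pipeline collapse mentioned above, the sketch below contrasts the W2 shape written as an iterator pipeline with the single accumulator loop it reduces to. The function names and the mapped closure are hypothetical, and range(n) is assumed to mean 0..n; the real peephole rewrites Relon IR ops, not Rust source.

```rust
/// W2 shape as written: sum(range(n).map(f)).
/// Assumption: range(n) yields 0, 1, ..., n - 1.
fn sum_mapped_pipeline(n: i64, f: impl Fn(i64) -> i64) -> i64 {
    (0..n).map(f).sum()
}

/// The same computation after the collapse: one loop and one accumulator,
/// with no intermediate list materialised.
fn sum_mapped_collapsed(n: i64, f: impl Fn(i64) -> i64) -> i64 {
    let mut acc: i64 = 0;
    let mut i: i64 = 0;
    while i < n {
        acc += f(i);
        i += 1;
    }
    acc
}
```

Both functions return the same value for any n whose partial sums stay within i64; the collapsed form is the one that maps directly onto the Loop / BrIf / Add ops listed above.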
§Safety contract
The source-driven buffer-protocol path is the production entry and
carries the backend sandbox contract: capability gates, div/mod
guards, checked signed Int arithmetic, arena bounds checks before
host-pointer formation, dynamic host-call trap lifting, and
deterministic step-budget fuel all report through typed
RuntimeErrors. The legacy / typed-fast i64 entries remain for
focused tests and benchmark kernels; they have no ArenaState error
lane, and public run_main routes trap-capable bodies through the
buffer entry.
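The guards named in the contract can be pictured with a small host-side sketch. Everything here is an assumption made for illustration: the DemoTrap enum, the function names, and the choice to report i64::MIN / -1 as an overflow are not taken from the crate, which emits these checks as LLVM IR and reports them through its own typed errors.

```rust
/// Hypothetical trap causes; the crate's real variants and numbering may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DemoTrap {
    DivByZero,
    IntOverflow,
    OutOfFuel,
}

/// Checked signed addition: overflow becomes a typed error instead of wrapping.
fn checked_add_i64(a: i64, b: i64) -> Result<i64, DemoTrap> {
    a.checked_add(b).ok_or(DemoTrap::IntOverflow)
}

/// Division guard: a zero divisor is rejected first, then i64::MIN / -1
/// (assumed here to be reported as an overflow).
fn guarded_div_i64(a: i64, b: i64) -> Result<i64, DemoTrap> {
    if b == 0 {
        return Err(DemoTrap::DivByZero);
    }
    a.checked_div(b).ok_or(DemoTrap::IntOverflow)
}

/// Remainder guard with the same two failure cases as division.
fn guarded_mod_i64(a: i64, b: i64) -> Result<i64, DemoTrap> {
    if b == 0 {
        return Err(DemoTrap::DivByZero);
    }
    a.checked_rem(b).ok_or(DemoTrap::IntOverflow)
}

/// Deterministic step budget: each charged step consumes one unit of fuel.
struct Fuel {
    remaining: u64,
}

impl Fuel {
    fn charge(&mut self) -> Result<(), DemoTrap> {
        match self.remaining.checked_sub(1) {
            Some(left) => {
                self.remaining = left;
                Ok(())
            }
            None => Err(DemoTrap::OutOfFuel),
        }
    }
}
```

Because the fuel counter depends only on the number of charged steps, the same program with the same budget traps at the same point on every run.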
§Remaining limits
- Full language surface — tree-walk remains the complete reference implementation. LLVM AOT covers the explicitly tested compiled-backend surface and rejects shapes outside that envelope loudly rather than fabricating partial results.
- .o / .so emit + dlopen: Phase B still uses the in-process MCJIT engine. The single-knob OptimizationLevel API hides the engine choice so Phase C / ORC migration is a localised diff.
§Decision log (Phase A.1)
Picked inkwell over llvm-sys and external clang/llc:
- inkwell 0.9.0 with the llvm18-1 feature pins llvm-sys 181.3.0 against the system LLVM 18.1.3 install at /usr/lib/llvm-18. Safe Rust wrappers eliminate the per-op unsafe block the raw FFI path would impose.
- llvm-sys would force every IR-builder call through unsafe raw pointer arithmetic: the maintainability cost on the AOT widening in Phase B/C is too high for the same target set.
- clang / llc shell-out drops in-memory JIT verification (we want a smoke test to round-trip without writing a file) and bloats cold-start with subprocess fork/exec latency. opt piping also forces stringly-typed IR generation that’s awkward to debug.
Modules§
- cocompile
- Stage 1.B: LTO co-compile backbone (closed-world CallNative).
- str_helpers
- Phase F.1: host shims backing the LLVM AOT string fast-path.
- wasm_link
- S3.X wasm32 link step: turn an LLVM-emitted relocatable wasm object (\0asm with a linking custom section, undefined symbols, no exports / no memory) into an instantiable wasm module.
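To make the input shape of the wasm32 link step concrete, here is a minimal sketch that recognises a relocatable wasm object by its \0asm header and a custom section named linking. It follows the public wasm binary layout (section id, LEB128 size, payload); the function names are hypothetical and this is not the wasm_link implementation.

```rust
/// Read an unsigned LEB128 u32 at `*pos`, advancing `*pos` past it.
fn read_u32_leb(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift: u32 = 0;
    loop {
        if shift >= 32 {
            return None; // encoding longer than a u32 allows
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// True when `bytes` starts with the wasm header and contains a custom
/// section (id 0) whose name is "linking".
fn has_linking_section(bytes: &[u8]) -> bool {
    const HEADER: [u8; 8] = [0x00, b'a', b's', b'm', 0x01, 0x00, 0x00, 0x00];
    if !bytes.starts_with(&HEADER) {
        return false;
    }
    let mut pos = HEADER.len();
    while pos < bytes.len() {
        let id = bytes[pos];
        pos += 1;
        let Some(size) = read_u32_leb(bytes, &mut pos) else {
            return false;
        };
        let Some(end) = pos.checked_add(size as usize) else {
            return false;
        };
        if end > bytes.len() {
            return false; // truncated section
        }
        if id == 0 {
            // Custom section payload begins with a length-prefixed name.
            let section = &bytes[..end];
            let mut name_pos = pos;
            if let Some(name_len) = read_u32_leb(section, &mut name_pos) {
                let name_end = name_pos.saturating_add(name_len as usize);
                if section.get(name_pos..name_end) == Some(&b"linking"[..]) {
                    return true;
                }
            }
        }
        pos = end;
    }
    false
}
```

A check like this only classifies the object; resolving the undefined symbols and adding exports and memory is the part the link step itself performs.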
Structs§
- ArenaRegions
- Arena region boundaries the in-place decode selects between. The arena layout (shared by both AOT backends) is [const_data | pad | in_buf | pad | out_buf | pad | scratch]; the returned root may live in any region (S1 only ever sees the param-sourced in_buf list, but the selection is generic so S2+ can return out_buf / scratch roots through the same gate).
- CapabilityVtable
- The LLVM backend’s capability grant surface.
- EmitObjectInfo
- Metadata returned by LlvmAotEvaluator::emit_object so the build.rs caller can stamp matching extern "C" declarations and marshalling code into the generated Rust shim.
- EmittedField
- One declared #main parameter (or value field on the return schema), in declaration order. Tells the build.rs binding generator what Rust type to expose for each slot and at what byte offset the buffer-protocol arena writer / reader should access it.
- HostFnRegistry
- Phase 0b host-fn registry: import_idx -> Arc<dyn RelonFunction>.
- LlvmAotEvaluator
- Phase B LLVM AOT evaluator. Either constructed from a pre-lowered IR module via Self::from_ir_direct (legacy-i64 envelope) or from a .relon source via Self::from_source (buffer-protocol envelope).
- SandboxConfig
- Compile-time sandbox configuration. Mirrors the cranelift backend’s SandboxConfig field-for-field so a side-by-side comparison of the two AOT backends shares the same knob surface.
- WasmBufferDispatch
- A planned wasm buffer-protocol dispatch produced by LlvmAotEvaluator::wasm_buffer_plan: the const-data prefix, the packed input record, and the full arena region layout. The wasm host lays const_data at arena offset 0 and in_bytes at regions.in_ptr, invokes the entry symbol it emitted, then decodes via LlvmAotEvaluator::wasm_buffer_decode.
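The [const_data | pad | in_buf | pad | out_buf | pad | scratch] layout described for ArenaRegions can be sketched as a small offset planner plus a region lookup. Only the region order and the in_ptr name come from the text above; the DemoRegions type, its other field names, and the 8-byte alignment are assumptions made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Region {
    ConstData,
    InBuf,
    OutBuf,
    Scratch,
}

/// Hypothetical region table; const_data always starts at arena offset 0.
#[derive(Debug)]
struct DemoRegions {
    const_len: u32,
    in_ptr: u32,
    in_len: u32,
    out_ptr: u32,
    out_len: u32,
    scratch_ptr: u32,
    scratch_len: u32,
}

/// Round `offset` up to the next multiple of `align`.
fn align_up(offset: u32, align: u32) -> u32 {
    (offset + align - 1) / align * align
}

impl DemoRegions {
    /// Lay the regions out back to back, padding each start to 8 bytes
    /// (the alignment is an assumption, not taken from the crate).
    fn plan(const_len: u32, in_len: u32, out_len: u32, scratch_len: u32) -> Self {
        const ALIGN: u32 = 8;
        let in_ptr = align_up(const_len, ALIGN);
        let out_ptr = align_up(in_ptr + in_len, ALIGN);
        let scratch_ptr = align_up(out_ptr + out_len, ALIGN);
        DemoRegions { const_len, in_ptr, in_len, out_ptr, out_len, scratch_ptr, scratch_len }
    }

    /// Classify a root offset; offsets in padding or past the arena yield None.
    fn region_of(&self, offset: u32) -> Option<Region> {
        let within = |start: u32, len: u32| offset >= start && offset - start < len;
        if within(0, self.const_len) {
            Some(Region::ConstData)
        } else if within(self.in_ptr, self.in_len) {
            Some(Region::InBuf)
        } else if within(self.out_ptr, self.out_len) {
            Some(Region::OutBuf)
        } else if within(self.scratch_ptr, self.scratch_len) {
            Some(Region::Scratch)
        } else {
            None
        }
    }
}
```

Keeping the lookup generic over all four regions mirrors the point made above: a decoder written this way does not need to change when later stages start returning roots from out_buf or scratch.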
Enums§
- CodegenTarget
- Codegen target for the object-emit path (S3.X).
- EmittedEntryShape
- Run LLVM’s -O3 middle-end pipeline on module. The host-side JIT engine handles backend codegen-time optimisation; this function fills in the IR-level passes (mem2reg, instcombine, gvn, licm, loop-unroll, SLP-vectorize, …) that MCJIT does not invoke on its own.
- EmittedFieldType
- Erased canonical type tag the build.rs binding generator uses to pick the Rust type for each #main parameter / return slot.
- LlvmError
- Top-level error type for the LLVM AOT backend. Construction sites fall into four buckets:
- SandboxTrapKind
- Trap kind raised by a guard inside LLVM-emitted native code. The numeric values match the cranelift backend’s TrapKind and the [crate::state::NativeTrap] subset the JIT-side dynamic dispatch helper already records, so the host decodes the same cause numbering across backends. Encoded as u64 so it fits the ArenaState::trap_code slot the emitted object writes through relon_llvm_call_native / the Op::CheckCap trap arm.
- VtableSlot
- One slot per host helper the LLVM codegen indirects through, in the same order cranelift pins them in its data-vtable. Adding a new helper appends a variant (NEVER reorder existing variants).
- WorldMode
- Stage 1.B: whether Op::CallNative lowers to open-world dynamic dispatch (the relon_llvm_call_native helper resolved at runtime via add_global_mapping) or closed-world static dispatch (a direct call @<host_symbol> to an extern declaration the LTO co-compile step later links + inlines).
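The idea of a trap kind that travels through a u64 slot with stable numbering, as described for SandboxTrapKind, can be sketched as follows. The variant names and numeric values here are invented for the example (including the convention that 0 means no trap); the crate's actual numbering is defined by its own enum and the cranelift backend's TrapKind.

```rust
/// Hypothetical trap kinds with explicit, stable discriminants.
/// New kinds would be appended with fresh numbers, never renumbered.
#[repr(u64)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DemoTrapKind {
    CapabilityDenied = 1,
    DivByZero = 2,
    IntOverflow = 3,
}

impl DemoTrapKind {
    /// Value the emitted code would store into a u64 trap slot.
    fn to_code(self) -> u64 {
        self as u64
    }

    /// Host-side decode; unknown codes (and 0, assumed to mean "no trap")
    /// map to None instead of being guessed at.
    fn from_code(code: u64) -> Option<Self> {
        match code {
            1 => Some(DemoTrapKind::CapabilityDenied),
            2 => Some(DemoTrapKind::DivByZero),
            3 => Some(DemoTrapKind::IntOverflow),
            _ => None,
        }
    }
}
```

Pinning the discriminants explicitly is what lets two backends share one decoder: the host reads a number and does not need to know which backend wrote it.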
Constants§
- GENERATOR_VERSION
- Generator stamp for the LLVM-AOT codegen, the mirror of relon_codegen_cranelift::GENERATOR_VERSION.
Functions§
- populate_global_mappings
- Resolve every slot to its (symbol, host_addr) binding. The evaluator iterates this to register add_global_mappings for the helpers the emitted module actually references (it only declares a helper’s extern on first use, so the caller filters by module.get_function(symbol).is_some() before binding). The LLVM analogue of cranelift’s populate_vtable, which writes every active slot’s fn pointer into the data section unconditionally.
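The filter-then-bind step described for populate_global_mappings can be modelled without LLVM. In this sketch a HashSet of symbol names stands in for the module.get_function(symbol).is_some() check, and the helper functions and all symbol names except relon_llvm_call_native are hypothetical.

```rust
use std::collections::HashSet;

// Hypothetical host helpers; only their addresses matter here.
fn demo_str_concat() {}
fn demo_list_len() {}
fn demo_call_native() {}

/// Every slot's (symbol, host_addr) pair, in a fixed order.
fn populate_demo_mappings() -> Vec<(&'static str, usize)> {
    vec![
        ("demo_str_concat", demo_str_concat as usize),
        ("demo_list_len", demo_list_len as usize),
        ("relon_llvm_call_native", demo_call_native as usize),
    ]
}

/// Keep only the helpers the emitted module declared. `declared` stands in
/// for querying the LLVM module for each symbol before binding it.
fn bind_referenced(declared: &HashSet<&str>) -> Vec<(&'static str, usize)> {
    populate_demo_mappings()
        .into_iter()
        .filter(|(symbol, _)| declared.contains(*symbol))
        .collect()
}
```

With a real execution engine, each surviving pair would be handed to the engine's global-mapping call; helpers the module never referenced are skipped because there is no declaration to bind them to.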