Crate relon_codegen_llvm

LLVM-backed AOT evaluator for Relon. Phase B production envelope.

This crate is the second slice of the dual-backend strategy: Cranelift covers the default native AOT route and the LLVM AOT pipeline here chases Rust-native peak performance for the #main entry path.

§Scope (Phase B)

  • Two entry shapes accepted:
    • Legacy-i64 ((I64...) -> I64) for from_ir_direct callers (tests, bench fixtures) — the Phase A bootstrap envelope, retained for cross-backend comparison.
    • Buffer-protocol ((*state, i32, i32, i32, i32, i64) -> i32) for from_source callers. Matches the cranelift backend’s EntryShape::BufferProtocol so the runtime envelopes line up.
  • Source-driven pipeline (from_source): parse + analyze + lower (relon_ir::lower_workspace_single) + LLVM emit + JIT compile + per-call arena dispatch. The cmp_lua W1 / W2 workloads (list.sum(range(n)) / list.sum(range(n).map(…))) go end-to-end through this path.
  • The op set started with what lower_workspace_single synthesised for the W1 / W2 shape after the IR’s range_pipeline peephole collapsed range.map.sum into a single accumulator loop: LocalGet, ConstI64 / ConstI32 / ConstBool, LetGet / LetSet, LoadField / StoreField (scalar slots), Add / Sub / Mul / Div / Mod / BitAnd (I32 + I64), Eq / Ne / Lt / Le / Gt / Ge, structured control flow (Block / Loop / Br / BrIf / If), and Return. The emitter has since widened into strings, lists, pointer-indirect fields, host calls, closures, object emission, wasm32 object emission, and several stdlib surfaces; the tests are the authoritative coverage map for those later slices.
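The effect of the range_pipeline collapse on the W2 shape can be pictured in plain Rust. This is only an illustration of the loop shape, not the IR peephole itself; both function names are hypothetical, and the sketch assumes range(n) yields 0, 1, ..., n - 1.

```rust
/// Naive shape of `list.sum(range(n).map(f))`: build the range, build the
/// mapped list, then sum it. Two intermediate lists are allocated.
fn sum_mapped_naive(n: i64, f: impl Fn(i64) -> i64) -> i64 {
    let items: Vec<i64> = (0..n).collect();
    let mapped: Vec<i64> = items.into_iter().map(f).collect();
    mapped.into_iter().sum()
}

/// Collapsed shape: one counter, one accumulator, no intermediate list.
/// This is the kind of loop the scalar op set (LocalGet, Add, Lt, Loop,
/// BrIf, Return, ...) is enough to express.
fn sum_mapped_collapsed(n: i64, f: impl Fn(i64) -> i64) -> i64 {
    let mut acc: i64 = 0;
    let mut i: i64 = 0;
    while i < n {
        acc += f(i);
        i += 1;
    }
    acc
}
```

Both functions return the same value for any n and any pure f; the collapsed form simply never materialises the list.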

§Safety contract

The source-driven buffer-protocol path is the production entry and carries the backend sandbox contract: capability gates, div/mod guards, checked signed Int arithmetic, arena bounds checks before host-pointer formation, dynamic host-call trap lifting, and deterministic step-budget fuel all report through typed RuntimeErrors. The legacy / typed-fast i64 entries remain for focused tests and benchmark kernels; they have no ArenaState error lane, and public run_main routes trap-capable bodies through the buffer entry.
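A few of the guards listed above (div/mod guards, checked signed Int arithmetic, step-budget fuel) can be sketched in ordinary Rust. The GuardError enum, the Fuel struct, and the function names are hypothetical stand-ins for the crate's typed RuntimeErrors, and treating i64::MIN / -1 and i64::MIN % -1 as overflow is an assumption of this sketch rather than documented behaviour.

```rust
/// Hypothetical error type standing in for the typed runtime errors.
#[derive(Debug, PartialEq, Eq)]
enum GuardError {
    IntOverflow,
    DivideByZero,
    FuelExhausted,
}

/// Checked signed addition: overflow becomes an error instead of wrapping.
fn guarded_add(a: i64, b: i64) -> Result<i64, GuardError> {
    a.checked_add(b).ok_or(GuardError::IntOverflow)
}

/// Division guard: reject a zero divisor first, then catch i64::MIN / -1.
fn guarded_div(a: i64, b: i64) -> Result<i64, GuardError> {
    if b == 0 {
        return Err(GuardError::DivideByZero);
    }
    a.checked_div(b).ok_or(GuardError::IntOverflow)
}

/// Remainder guard with the same two checks as division.
fn guarded_mod(a: i64, b: i64) -> Result<i64, GuardError> {
    if b == 0 {
        return Err(GuardError::DivideByZero);
    }
    a.checked_rem(b).ok_or(GuardError::IntOverflow)
}

/// Deterministic step budget: each step consumes one unit of fuel.
struct Fuel {
    remaining: u64,
}

impl Fuel {
    fn step(&mut self) -> Result<(), GuardError> {
        if self.remaining == 0 {
            return Err(GuardError::FuelExhausted);
        }
        self.remaining -= 1;
        Ok(())
    }
}
```

The point of the sketch is that every failure is a value the caller can inspect; no guard aborts the process or returns a silently wrapped result.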

§Remaining limits

  • Full language surface — tree-walk remains the complete reference implementation. LLVM AOT covers the explicitly tested compiled-backend surface and rejects shapes outside that envelope loudly rather than fabricating partial results.
  • .o / .so emit + dlopen — Phase B still uses the in-process MCJIT engine. The single-knob OptimizationLevel API hides the engine choice so Phase C / ORC migration is a localised diff.

§Decision log (Phase A.1)

Picked inkwell over llvm-sys and external clang/llc:

  • inkwell 0.9.0 with the llvm18-1 feature pins llvm-sys 181.3.0 against the system LLVM 18.1.3 install at /usr/lib/llvm-18. Safe Rust wrappers eliminate the per-op unsafe block the raw FFI path would impose.
  • llvm-sys would force every IR-builder call through unsafe raw-pointer FFI. That maintainability cost is too high as Phase B/C widens the AOT surface, and it would cover only the same target set.
  • clang/llc shell-out drops in-memory JIT verification (we want a smoke test to round-trip without writing a file) and bloats cold-start with subprocess fork/exec latency. opt piping also forces stringly-typed IR generation that’s awkward to debug.

Modules§

cocompile
Stage 1.B — LTO co-compile backbone (closed-world CallNative).
str_helpers
Phase F.1: host shims backing the LLVM AOT string fast-path.
wasm_link
S3.X wasm32 link step: turn an LLVM-emitted relocatable wasm object (\0asm with a linking custom section, undefined symbols, no exports / no memory) into an instantiable wasm module.

Structs§

ArenaRegions
Arena region boundaries the in-place decode selects between. The arena layout (shared by both AOT backends) is [const_data | pad | in_buf | pad | out_buf | pad | scratch]; the returned root may live in any region (S1 only ever sees the param-sourced in_buf list, but the selection is generic so S2+ can return out_buf / scratch roots through the same gate).
CapabilityVtable
The LLVM backend’s capability grant surface.
EmitObjectInfo
Metadata returned by LlvmAotEvaluator::emit_object so the build.rs caller can stamp matching extern "C" declarations and marshalling code into the generated Rust shim.
EmittedField
One declared #main parameter (or value field on the return schema), in declaration order. Tells the build.rs binding generator what Rust type to expose for each slot and at what byte offset the buffer-protocol arena writer / reader should access it.
HostFnRegistry
Phase 0b host-fn registry: import_idx -> Arc<dyn RelonFunction>.
LlvmAotEvaluator
Phase B LLVM AOT evaluator. Either constructed from a pre-lowered IR module via Self::from_ir_direct (legacy-i64 envelope) or from a .relon source via Self::from_source (buffer-protocol envelope).
SandboxConfig
Compile-time sandbox configuration. Mirrors the cranelift backend’s SandboxConfig field-for-field so a side-by-side comparison of the two AOT backends shares the same knob surface.
WasmBufferDispatch
A planned wasm buffer-protocol dispatch produced by LlvmAotEvaluator::wasm_buffer_plan: the const-data prefix, the packed input record, and the full arena region layout. The wasm host lays const_data at arena offset 0 and in_bytes at regions.in_ptr, invokes the entry symbol it emitted, then decodes via LlvmAotEvaluator::wasm_buffer_decode.
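The arena layout named under ArenaRegions, [const_data | pad | in_buf | pad | out_buf | pad | scratch], can be pictured with a small offset planner. This is a sketch under stated assumptions: the Regions struct and every field other than in_ptr are hypothetical names, and the 8-byte alignment used for the padding is assumed, not taken from the crate.

```rust
/// Assumed region alignment; the real value is not stated in the docs.
const ALIGN: usize = 8;

/// Round `n` up to the next multiple of `align`.
fn align_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Which region an arena offset falls in.
#[derive(Debug, PartialEq, Eq)]
enum Region {
    ConstData,
    InBuf,
    OutBuf,
    Scratch,
}

/// Hypothetical stand-in for the region boundaries.
#[derive(Debug, PartialEq, Eq)]
struct Regions {
    const_len: usize,
    in_ptr: usize,
    in_len: usize,
    out_ptr: usize,
    out_len: usize,
    scratch_ptr: usize,
    scratch_len: usize,
}

impl Regions {
    /// const_data starts at offset 0; each later region starts at the
    /// next aligned offset after the previous one ends.
    fn plan(const_len: usize, in_len: usize, out_len: usize, scratch_len: usize) -> Regions {
        let in_ptr = align_up(const_len, ALIGN);
        let out_ptr = align_up(in_ptr + in_len, ALIGN);
        let scratch_ptr = align_up(out_ptr + out_len, ALIGN);
        Regions { const_len, in_ptr, in_len, out_ptr, out_len, scratch_ptr, scratch_len }
    }

    /// Select the region for a root offset; padding and out-of-range
    /// offsets select nothing.
    fn region_of(&self, offset: usize) -> Option<Region> {
        if offset < self.const_len {
            Some(Region::ConstData)
        } else if (self.in_ptr..self.in_ptr + self.in_len).contains(&offset) {
            Some(Region::InBuf)
        } else if (self.out_ptr..self.out_ptr + self.out_len).contains(&offset) {
            Some(Region::OutBuf)
        } else if (self.scratch_ptr..self.scratch_ptr + self.scratch_len).contains(&offset) {
            Some(Region::Scratch)
        } else {
            None
        }
    }
}
```

Because the selection is written against all four regions, a root in out_buf or scratch goes through the same check as one in in_buf.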

Enums§

CodegenTarget
Codegen target for the object-emit path (S3.X).
EmittedEntryShape
Run LLVM’s -O3 middle-end pipeline on module. The host-side JIT engine handles backend codegen-time optimisation; this function fills in the IR-level passes (mem2reg, instcombine, gvn, licm, loop-unroll, SLP-vectorize, …) that MCJIT does not invoke on its own.
EmittedFieldType
Erased canonical type tag the build.rs binding generator uses to pick the Rust type for each #main parameter / return slot.
LlvmError
Top-level error type for the LLVM AOT backend. Construction sites fall into four buckets:
SandboxTrapKind
Trap kind raised by a guard inside LLVM-emitted native code. The numeric values match the cranelift backend’s TrapKind and the [crate::state::NativeTrap] subset the JIT-side dynamic dispatch helper already records, so the host decodes the same cause numbering across backends. Encoded as u64 so it fits the ArenaState::trap_code slot the emitted object writes through relon_llvm_call_native / the Op::CheckCap trap arm.
VtableSlot
One slot per host helper the LLVM codegen indirects through, in the same order cranelift pins them in its data-vtable. Adding a new helper appends a variant (NEVER reorder existing variants).
WorldMode
Stage 1.B: whether Op::CallNative lowers to open-world dynamic dispatch (the relon_llvm_call_native helper resolved at runtime via add_global_mapping) or closed-world static dispatch (a direct call @<host_symbol> to an extern declaration the LTO co-compile step later links + inlines).
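The append-only rule stated for VtableSlot can be illustrated with a small enum whose discriminants are written out explicitly. The Slot type and its variant names below are hypothetical; they are not the crate's actual slots, only a sketch of why reordering would break anything that stores a slot index.

```rust
/// Hypothetical append-only slot enum. Explicit discriminants make the
/// order part of the contract between emitted code and the host.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Slot {
    AllocHelper = 0,
    CallNativeHelper = 1,
    // New helpers are appended here; existing numbers never change.
    StringHelper = 2,
}

impl Slot {
    /// Every slot, listed in discriminant order.
    const ALL: [Slot; 3] = [Slot::AllocHelper, Slot::CallNativeHelper, Slot::StringHelper];

    /// Table index of this slot.
    fn index(self) -> usize {
        self as u32 as usize
    }

    /// Look a slot up by index; unknown indices yield None.
    fn from_index(i: usize) -> Option<Slot> {
        Slot::ALL.get(i).copied()
    }
}

/// True when the table position of every slot equals its discriminant,
/// which is what an append-only discipline preserves.
fn order_is_stable() -> bool {
    Slot::ALL.iter().enumerate().all(|(i, s)| s.index() == i)
}
```

A check like order_is_stable is cheap to run in a unit test and fails as soon as a variant is inserted in the middle or renumbered.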

Constants§

GENERATOR_VERSION
Generator stamp for the LLVM-AOT codegen, the mirror of relon_codegen_cranelift::GENERATOR_VERSION.

Functions§

populate_global_mappings
Resolve every slot to its (symbol, host_addr) binding. The evaluator iterates this to register add_global_mapping entries for the helpers the emitted module actually references: the module only declares a helper’s extern on first use, so the caller filters by module.get_function(symbol).is_some() before binding. This is the LLVM analogue of cranelift’s populate_vtable, which writes every active slot’s fn pointer into the data section unconditionally.
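The filter-before-bind step can be sketched without any LLVM dependency. In the sketch below a HashSet of declared symbol names stands in for the module.get_function(symbol).is_some() check; the Binding struct, the bindings_to_register function, and the symbol names in the usage note are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical (symbol, host_addr) pair for one helper.
#[derive(Debug, PartialEq, Eq)]
struct Binding {
    symbol: &'static str,
    host_addr: usize,
}

/// Keep only the bindings whose symbol the emitted module declared.
/// `declared` plays the role of the module's function table.
fn bindings_to_register<'a>(
    all: &'a [Binding],
    declared: &HashSet<&str>,
) -> Vec<&'a Binding> {
    all.iter()
        .filter(|b| declared.contains(b.symbol))
        .collect()
}
```

For example, given bindings for helper_a and helper_b and a module that declared only helper_b, the function returns the single helper_b binding, so no mapping is registered for a symbol the module never referenced.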