wasm-pvm-cli 0.5.2

CLI for WASM to PVM recompiler

WASM-PVM: WebAssembly to PolkaVM Recompiler

WARNING: This project is largely vibe-coded. It was built iteratively with heavy AI assistance (Claude). While it has 412 passing integration tests and produces working PVM bytecode, the internals may contain unconventional patterns, over-engineering in some places, and under-engineering in others. Use at your own risk. Contributions and proper engineering reviews are very welcome!

A Rust compiler that translates WebAssembly (WASM) bytecode into PolkaVM (PVM) bytecode for execution on the JAM (Join-Accumulate Machine) protocol. Write your JAM programs in AssemblyScript (TypeScript-like), hand-written WAT, or any language that compiles to WASM — and run them on PVM.

WASM  ──►  LLVM IR  ──►  PVM bytecode  ──►  JAM program (.jam)
      inkwell    mem2reg       Rust backend

Getting Started

Prerequisites

  • Rust (stable, edition 2024)
  • LLVM 18 — the compiler uses inkwell (LLVM 18 bindings)
    • macOS: brew install llvm@18 then export LLVM_SYS_181_PREFIX=/opt/homebrew/opt/llvm@18
    • Ubuntu: apt install llvm-18-dev
  • Bun (for running integration tests and the JAM runner) — bun.sh

Build

git clone https://github.com/tomusdrw/wasm-pvm.git
cd wasm-pvm
cargo build --release

Hello World: Compile & Run

Create a simple WAT program that adds two numbers:

;; add.wat
(module
  (memory 1)
  (func (export "main") (param $args_ptr i32) (param $args_len i32) (result i64)
    ;; Read two i32 args, add them, write result to memory
    (i32.store (i32.const 0)
      (i32.add
        (i32.load (local.get $args_ptr))
        (i32.load (i32.add (local.get $args_ptr) (i32.const 4)))))
    (i64.const 17179869184)))  ;; packed ptr=0, len=4

Compile it to a JAM blob and run it:

# Compile WAT → JAM
cargo run -p wasm-pvm-cli -- compile add.wat -o add.jam

# Run with two u32 arguments: 5 and 7 (little-endian hex)
npx @fluffylabs/anan-as run add.jam 0500000007000000
# Output: 0c000000  (12 in little-endian)
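
The argument string passed to the runner is just the two u32 values written as little-endian bytes in hex, and the output is read back the same way. A minimal Rust sketch of that encoding follows; encode_args_hex and decode_u32_hex are illustrative names and are not part of wasm-pvm or anan-as:

```rust
/// Encode u32 arguments as one little-endian hex string.
fn encode_args_hex(args: &[u32]) -> String {
    args.iter()
        .flat_map(|a| a.to_le_bytes())
        .map(|b| format!("{b:02x}"))
        .collect()
}

/// Decode the first four bytes of a hex string as a little-endian u32.
/// Returns None if the string is too short or contains non-hex characters.
fn decode_u32_hex(hex: &str) -> Option<u32> {
    let mut bytes = [0u8; 4];
    for (i, byte) in bytes.iter_mut().enumerate() {
        let pair = hex.get(2 * i..2 * i + 2)?;
        *byte = u8::from_str_radix(pair, 16).ok()?;
    }
    Some(u32::from_le_bytes(bytes))
}
```

With these helpers, encode_args_hex(&[5, 7]) yields 0500000007000000 and decode_u32_hex("0c000000") yields Some(12), matching the run above.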

Inspect the Output

Upload the resulting .jam file to the PVM Debugger for step-by-step execution, disassembly, register inspection, and gas metering visualization.

AssemblyScript Example

You can also write programs in AssemblyScript:

// fibonacci.ts
export function main(args_ptr: i32, args_len: i32): i64 {
  const buf = heap.alloc(256);
  let n = load<i32>(args_ptr);
  let a: i32 = 0;
  let b: i32 = 1;

  while (n > 0) {
    b = a + b;
    a = b - a;
    n = n - 1;
  }

  store<i32>(buf, a);
  return (buf as i64) | ((4 as i64) << 32);  // packed ptr + len
}

Compile via the AssemblyScript compiler to WASM, then use wasm-pvm-cli to produce a JAM blob. See the tests/fixtures/assembly/ directory for more examples.
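
When checking a compiled blob against known-good values, it can help to mirror the loop on the host. The sketch below is a hypothetical Rust reference model of the main function above (fib_result_bytes is an illustrative name). It assumes the 4 result bytes are read in little-endian order, as in the WAT example:

```rust
/// Host-side mirror of the AssemblyScript loop.
/// Returns the 4 little-endian bytes that main would store in its result buffer.
fn fib_result_bytes(n: i32) -> [u8; 4] {
    let (mut a, mut b): (i32, i32) = (0, 1);
    let mut n = n;
    while n > 0 {
        // WASM i32 arithmetic wraps on overflow, so the model does too.
        b = a.wrapping_add(b);
        a = b.wrapping_sub(a);
        n -= 1;
    }
    a.to_le_bytes()
}
```

For n = 10 this gives 55, i.e. the bytes 37 00 00 00.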

How It Works

Entry functions use a unified ABI: main(args_ptr: i32, args_len: i32) -> i64, where the return value packs the result pointer in the lower 32 bits and the result length in the upper 32 bits. The compiler unpacks this into PVM's SPI convention (r7 = start address, r8 = end address).

The compiler pipeline:

  1. Adapter merge (optional) — merges a WAT adapter module into the WASM binary, replacing matching imports with adapter function bodies
  2. WASM → LLVM IR — translates WASM opcodes to LLVM IR using inkwell (LLVM 18 bindings), with PVM-specific intrinsics for memory operations
  3. LLVM optimization passes: mem2reg (SSA promotion), instcombine, simplifycfg, gvn, dce, and optional function inlining
  4. LLVM IR → PVM bytecode — a custom Rust backend reads LLVM IR and emits PVM instructions with per-block register caching (store-load forwarding)
  5. SPI assembly — packages the bytecode into a JAM/SPI program blob with entry headers, jump tables, and data sections

Key Design Decisions

  • Stack-slot approach with register allocation: every SSA value gets a dedicated 8-byte stack slot at a fixed offset from SP. A linear-scan register allocator then assigns high-use values to the callee-saved registers r9-r12 that are not holding this function's incoming parameters. In non-leaf functions it also reserves whichever of r9+ are needed for outgoing call arguments. This eliminates redundant memory traffic across block boundaries and loops
  • Per-block register cache: eliminates redundant loads when a value is reused shortly after being computed (~50% gas reduction)
  • No unsafe code: deny(unsafe_code) enforced at workspace level
  • No floating point: PVM lacks FP support; WASM floats are rejected at compile time
  • All optimizations are toggleable: --no-llvm-passes, --no-peephole, --no-register-cache, --no-icmp-fusion, --no-shrink-wrap, --no-dead-store-elim, --no-const-prop, --no-inline, --no-cross-block-cache, --no-register-alloc, --no-fallthrough-jumps
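
The per-block register cache can be pictured as store-load forwarding over a single basic block. The following is a toy model only, built on a made-up three-instruction IR (Op and forward_loads are hypothetical names). The real backend works on PVM instructions and is more involved:

```rust
use std::collections::HashMap;

/// Toy instruction set, invented for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Store { slot: u32, reg: u8 }, // write `reg` to a stack slot
    Load { slot: u32, reg: u8 },  // read a stack slot into `reg`
    Clobber { reg: u8 },          // any other instruction that overwrites `reg`
}

/// Drop loads whose destination register already holds the slot's value.
/// The cache is local to one block, so it starts empty on every call.
fn forward_loads(block: &[Op]) -> Vec<Op> {
    let mut cache: HashMap<u32, u8> = HashMap::new(); // slot -> register holding it
    let mut out = Vec::new();
    for &op in block {
        match op {
            Op::Store { slot, reg } => {
                // After the store, `reg` and the slot hold the same value.
                cache.insert(slot, reg);
                out.push(op);
            }
            Op::Load { slot, reg } => {
                if cache.get(&slot) == Some(&reg) {
                    continue; // redundant load: value is already in `reg`
                }
                // `reg` is overwritten, so forget what it held before.
                cache.retain(|_, r| *r != reg);
                cache.insert(slot, reg);
                out.push(op);
            }
            Op::Clobber { reg } => {
                cache.retain(|_, r| *r != reg);
                out.push(op);
            }
        }
    }
    out
}
```

This version only elides loads into the same register that already holds the value. Forwarding into a different register would need a register-to-register move instead of a skip.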

Benchmark: Optimizations Impact

All PVM-level optimizations enabled (default):

Benchmark                WASM Size  JAM Size  Code Size  Gas Used
add(5,7)                 68 B       201 B     130 B      39
fib(20)                  110 B      270 B     186 B      612
factorial(10)            102 B      242 B     161 B      269
is_prime(25)             162 B      328 B     239 B      80
AS fib(10)               234 B      708 B     572 B      324
AS factorial(7)          233 B      697 B     562 B      281
AS gcd(2017,200)         228 B      686 B     558 B      190
AS decoder               1.5 KB     20.8 KB   6.8 KB     721
AS array                 1.4 KB     19.9 KB   6.0 KB     623
aslan-fib accumulate     7.8 KB     37.1 KB   17.6 KB    15,968
anan-as PVM interpreter  57.7 KB    180.2 KB  127.8 KB   -

PVM-in-PVM: programs executed inside the anan-as PVM interpreter (outer gas cost):

Benchmark                    JAM Size  Code Size  Outer Gas   Direct Gas  Overhead
TRAP (interpreter overhead)  21 B      1 B        80,577      -           -
add(5,7)                     201 B     130 B      1,238,302   39          31,751x
AS fib(10)                   708 B     572 B      1,753,546   324         5,412x
JAM-SDK fib(10)*             25.4 KB   16.2 KB    7,230,603   42          172,157x
Jambrains fib(10)*           61.1 KB   -          6,373,683   1           6,373,683x
JADE fib(10)*                67.3 KB   45.7 KB    19,555,955  504         38,801x
aslan-fib accumulate*        37.1 KB   17.6 KB    10,511,413  15,968      658x

*JAM-SDK fib(10), Jambrains fib(10), JADE fib(10), and aslan-fib accumulate exit on unhandled host calls (ecalli). The gas cost reflects program parsing/loading plus partial execution up to the first unhandled ecalli.
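
The Overhead column is consistent with outer gas divided by direct gas, truncated to a whole number. A small Rust sketch of that calculation (overhead is an illustrative name, and the truncation rule is inferred from the published figures rather than stated by the benchmark tooling):

```rust
/// Outer gas divided by direct gas, truncated to a whole multiple.
/// Returns None when there is no direct measurement (direct gas of 0).
fn overhead(outer_gas: u64, direct_gas: u64) -> Option<u64> {
    outer_gas.checked_div(direct_gas)
}
```

For example, overhead(1_238_302, 39) gives Some(31_751), the value shown for add(5,7).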

Memory Layout Summary

The JAM blob reserves separate ranges for RO data, a guard gap, globals/overflow metadata, and the WASM heap; see the Architecture docs for the full breakdown, including GLOBAL_MEMORY_BASE, PARAM_OVERFLOW_BASE, SPILLED_LOCALS_BASE, and how wasm_memory_base is computed.

The SPI rw_data section is a contiguous copy of every byte from GLOBAL_MEMORY_BASE up to the highest initialized heap address. The encoder must preserve the absolute addresses of the data segments, so the zero stretch between the globals and the first heap byte is encoded verbatim. This is why stub AssemblyScript fixtures such as decoder-test/array-test emit ~13 KB of RW data even though only a handful of bytes are non-zero. Short of redesigning SPI, the only ways to shrink those blobs are to keep globals/data near the heap base or to introduce sparse RW descriptors (future work).
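
The size effect is easy to reproduce with a small model. The sketch below is not the crate's encoder: rw_image is a hypothetical function, and the base address passed to it stands in for GLOBAL_MEMORY_BASE with a made-up value. It assumes every segment starts at or above the base:

```rust
/// Build a contiguous RW image from data segments given as (absolute address, bytes).
/// `base` stands in for GLOBAL_MEMORY_BASE; every segment must start at or above it.
fn rw_image(base: u32, segments: &[(u32, &[u8])]) -> Vec<u8> {
    // The image must reach the highest initialized byte, so gaps stay as zeros.
    let end = segments
        .iter()
        .map(|(addr, bytes)| *addr as usize + bytes.len())
        .max()
        .unwrap_or(base as usize);
    let mut image = vec![0u8; end.saturating_sub(base as usize)];
    for (addr, bytes) in segments {
        let off = (*addr - base) as usize;
        image[off..off + bytes.len()].copy_from_slice(bytes);
    }
    image
}
```

Two bytes of globals at the base plus a single heap byte 13 KB further up produce an image of 13 KB + 1 bytes, of which only three are non-zero.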

Supported WASM Features

Category                Operations
Arithmetic (i32 & i64)  add, sub, mul, div_u/s, rem_u/s, all comparisons, clz, ctz, popcnt, rotl, rotr, bitwise ops
Control flow            block, loop, if/else, br, br_if, br_table, return, unreachable, block results
Memory                  load/store (all widths), memory.size, memory.grow, memory.fill, memory.copy, globals, data sections
Functions               call, call_indirect (with signature validation), recursion, stack overflow detection
Type conversions        wrap, extend_s/u, sign extensions (i32/i64 extend8/16/32_s)
Imports                 Text-based import maps (--imports) and WAT adapter files (--adapter)

Not supported: floating point (by design — PVM has no FP instructions).

CLI Usage

# Compile WAT or WASM to JAM
wasm-pvm compile input.wat -o output.jam
wasm-pvm compile input.wasm -o output.jam

# With import resolution
wasm-pvm compile input.wasm -o output.jam \
  --imports imports.txt \
  --adapter adapter.wat

# Disable specific optimizations
wasm-pvm compile input.wasm -o output.jam --no-inline --no-peephole

# Disable all optimizations
wasm-pvm compile input.wasm -o output.jam \
  --no-llvm-passes --no-peephole --no-register-cache \
  --no-icmp-fusion --no-shrink-wrap --no-dead-store-elim \
  --no-const-prop --no-inline --no-cross-block-cache \
  --no-register-alloc --no-fallthrough-jumps

See the Import Handling section for details on resolving WASM imports.

Using as a Library

The wasm-pvm crate can be used as a Rust dependency. It supports two modes:

# Full compiler (default) — requires LLVM 18
wasm-pvm = "0.5.2"

# PVM types only — no LLVM dependency, compiles to wasm32-unknown-unknown
wasm-pvm = { version = "0.5.2", default-features = false }

With default-features = false, only the PVM type definitions are available: Instruction, Opcode, ProgramBlob, SpiProgram, abi::*, memory_layout::*, and Error. This is useful for downstream tools that need to work with PVM bytecode (interpreters, debuggers, analyzers) without requiring the full LLVM compiler toolchain.

Feature       Default  Description
compiler      Yes      Full WASM-to-PVM compiler (inkwell, wasmparser, wasm-encoder)
test-harness  Yes      Test utilities for unit testing (implies compiler)

Project Structure

crates/
  wasm-pvm/              # Core library
    src/
      pvm/               # PVM instruction definitions (always available)
      memory_layout.rs   # PVM memory address constants (always available)
      spi.rs             # JAM/SPI format encoder (always available)
      abi.rs             # Register & frame layout constants (always available)
      llvm_frontend/     # WASM → LLVM IR translation (feature = "compiler")
      llvm_backend/      # LLVM IR → PVM bytecode lowering (feature = "compiler")
      translate/         # Compilation orchestration & SPI assembly (feature = "compiler")
  wasm-pvm-cli/          # Command-line interface
tests/                   # 412 integration tests (TypeScript/Bun)
  fixtures/
    wat/                 # WAT test programs
    assembly/            # AssemblyScript examples
    imports/             # Import maps & adapter files
vendor/
  anan-as/               # PVM interpreter (submodule)

Testing

# Rust unit tests
cargo test

# Lint
cargo clippy -- -D warnings

# Integration tests (builds artifacts, then runs all layers)
cd tests && bun run test

# Quick validation (Layer 1 smoke tests only)
cd tests && bun test layer1/

The test suite is organized into layers:

  • Layer 1: Core/smoke tests (~50 tests) — fast, run during development
  • Layer 2: Feature tests (~140 tests)
  • Layer 3: Regression/edge cases (~220 tests)
  • Layer 4-5: PVM-in-PVM tests — the PVM interpreter itself compiled to PVM, running the test suite inside PVM

Import Handling

WASM modules that import external functions need those imports resolved before compilation. Two mechanisms are available:

Import Map (--imports)

A text file mapping import names to simple actions:

# my-imports.txt
abort = trap        # emit unreachable (panic)
console.log = nop   # do nothing, return zero
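
The file format shown above is simple enough to model in a few lines. The following Rust sketch is a hypothetical parser, not the one used by wasm-pvm: ImportAction and parse_import_map are illustrative names, and it assumes that trap and nop are the only actions and that # starts a comment anywhere on a line:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImportAction {
    Trap, // emit unreachable
    Nop,  // do nothing, return zero
}

/// Parse `name = action` lines into a map. Blank lines and comments are skipped.
fn parse_import_map(text: &str) -> Result<HashMap<String, ImportAction>, String> {
    let mut map = HashMap::new();
    for (idx, raw) in text.lines().enumerate() {
        // Drop everything after '#', then surrounding whitespace.
        let line = raw.split('#').next().unwrap_or("").trim();
        if line.is_empty() {
            continue;
        }
        let (name, action) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `name = action`", idx + 1))?;
        let action = match action.trim() {
            "trap" => ImportAction::Trap,
            "nop" => ImportAction::Nop,
            other => return Err(format!("line {}: unknown action `{other}`", idx + 1)),
        };
        map.insert(name.trim().to_string(), action);
    }
    Ok(map)
}
```

Parsing the example file yields two entries, abort mapped to Trap and console.log mapped to Nop. An unknown action is reported as an error with its line number.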

Adapter WAT (--adapter)

A WAT module whose exports replace matching imports, enabling arbitrary logic for import resolution (pointer conversion, memory reads, host calls):

(module
  (import "env" "host_call_5" (func $host_call_5 (param i64 i64 i64 i64 i64 i64) (result i64)))
  (import "env" "pvm_ptr" (func $pvm_ptr (param i64) (result i64)))

  (func (export "console.log") (param i32)
    (drop (call $host_call_5
      (i64.const 100)                                    ;; ecalli index
      (i64.const 3)                                      ;; log level
      (i64.const 0) (i64.const 0)                        ;; target ptr/len
      (call $pvm_ptr (i64.extend_i32_u (local.get 0)))   ;; message ptr
      (i64.extend_i32_u (i32.load offset=0
        (i32.sub (local.get 0) (i32.const 4)))))))       ;; message len
)

When both --imports and --adapter are provided, the adapter runs first, then the import map handles remaining unresolved imports. All imports must be resolved or compilation fails.

Resources

  • PVM Debugger — upload .jam files for disassembly, step-by-step execution, and register/gas inspection
  • PVM Decompiler — decompile PVM bytecode back to human-readable form
  • ananas (anan-as) — PVM interpreter written in AssemblyScript, compiled to PVM itself for PVM-in-PVM execution
  • as-lan — example AssemblyScript project compiled from WASM to PVM using this tool
  • JAM Gray Paper — the JAM protocol specification (PVM is defined in Appendix A)
  • AssemblyScript — TypeScript-like language that compiles to WASM
  • Documentation Book — full compiler docs (run mdbook serve docs to browse locally)

License

MIT

Contributing

Contributions are welcome! See AGENTS.md for coding guidelines, project conventions, and a map of the codebase.