awkrs 0.4.14

Awk implementation in Rust with broad CLI compatibility, parallel records, and experimental Cranelift JIT
  █████╗ ██╗    ██╗██╗  ██╗██████╗ ███████╗
 ██╔══██╗██║    ██║██║ ██╔╝██╔══██╗██╔════╝
 ███████║██║ █╗ ██║█████╔╝ ██████╔╝███████╗
 ██╔══██║██║███╗██║██╔═██╗ ██╔══██╗╚════██║
 ██║  ██║╚███╔███╔╝██║  ██╗██║  ██║███████║
 ╚═╝  ╚═╝ ╚══╝╚══╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝

License: MIT

[WORLD'S FASTEST AWK BYTECODE ENGINE // PARALLEL RECORD PROCESSOR // RUST CORE]

"Pattern. Action. Domination."

awkrs runs pattern → action programs over input records like POSIX awk / GNU gawk / mawk, with a fused-superinstruction bytecode VM (plus a default-on fusevm/Cranelift offload path for eligible numeric chunks), parallel record processing, and a CLI that accepts the union of POSIX, gawk, and mawk options.

┌──────────────────────────────────────────────────────────────┐
│ STATUS: ONLINE    THREAT LEVEL: NEON    SIGNAL: ████████░░ │
└──────────────────────────────────────────────────────────────┘



[0x00] SYSTEM SCAN

Positioning: POSIX awk + the gawk extensions that show up in real scripts (BEGINFILE/ENDFILE, coprocess |&, CSV mode, PROCINFO/SYMTAB/FUNCTAB, @include/@load/@namespace, /inet/tcp|udp, MPFR via -M). Performance goal: beat awk/mawk/gawk on supported workloads — see §0x06.

Bytecode cache: -f script.awk runs memoize compiled bytecode to ~/.awkrs/scripts.rkyv — repeat runs skip lex/parse/compile entirely. Source-file mtime and the awkrs-binary mtime invalidate entries silently. AWKRS_CACHE=0 disables it. Details in §0x05.

Implemented gawk-style CLI flags (where they differ from gawk, the gap is documented):

Flag                 Behavior
-d/--dump-variables  Dump globals after run (stdout, -, or file)
-D/--debug           Static rule/function listing; not gawk's interactive debugger
-p/--profile         Wall-clock summary + per-record-rule hit counts (-j 1 only); not gawk's per-line profiler
-o/--pretty-print    AST pretty-print; not gawk's canonical reformatter
-g/--gen-pot         Print and exit before execution
-L/-t/LINT           Static lint (extension rules, uninit-var hints, printf format checks); when LINT is truthy at runtime, also emit awkrs: warning: on stderr for sqrt/log domain issues (negative / zero args)
-S/--sandbox         Block system(), file redirects, pipes, coprocesses, inet I/O
-l name              Load name.awk from AWKPATH (default .)
-b                   Byte length for length/substr/index
-n                   strtonum-style hex/octal coercion
-s/--no-optimize     Disable peephole/JIT optimization (forces the plain bytecode interpreter)
-c/-P                Stored on runtime; minimal effect today
-r/--re-interval     Parsed; no runtime effect (regex crate already supports {m,n})
-N/--use-lc-numeric  Locale decimal radix and %' grouping in sprintf/printf/print. Does not affect string→number parsing

Gawk parity gaps to know:

  • RS — newline by default; one (UTF-8) char = literal delimiter; RS="" = paragraph mode; multi-char = gawk regex (RT is the matched text). FIELDWIDTHS selects fixed-width when non-empty.
  • PROCINFO: refreshed before and after BEGIN. Includes gawk-style platform (posix/mingw/vms, not Rust's macos/linux), version, ids, errno, api_major/api_minor, argv, identifiers, FS (active split mode), strftime, pgrpid, groupN, mb_cur_max (Linux sysconf), per-input READ_TIMEOUT/RETRY composite keys with fallback chain → global PROCINFO["READ_TIMEOUT"] → GAWK_READ_TIMEOUT env. Unix primary record reads poll when a timeout applies. With -M: gmp_version, mpfr_version, prec_min, prec_max. User-set keys persist across the post-BEGIN refresh.
  • PROCINFO["sorted_in"]: @ind_*/@val_* modes, plus a user comparator function (2-arg = index sort, 4-arg (i1, v1, i2, v2) = value sort) that returns negative/zero/positive like a qsort comparator.
  • SYMTAB — assignment, for-in, length(SYMTAB) like gawk's global introspection (not GNU's variable-object references).
  • @load — non-.awk paths only accepted for gawk's bundled extension names (filefuncs, readdir, time, …) as no-ops; the builtins are native. Arbitrary .so/gawkapi modules error at parse time.
  • -M/--bignum — MPFR via rug (default 256 bits, PROCINFO["prec"]/["roundmode"] apply). Arithmetic, sprintf/printf integer formats (no f64/i64 clamp), int/intdiv/strtonum/++/--, bit ops, transcendentals, srand (low 32 bits of previous seed), CONVFMT/OFMT/%s/concat/regex coercion all use MPFR. Default CONVFMT-style number→string for scalars uses each Float's own precision for the MPFR sprintf path (so raising PROCINFO["prec"] is not undermined by a hardcoded bit count at display time). JIT is disabled in -M mode.
  • Unicode vs bytes: -b honored for length/substr/index. Full multibyte field-splitting parity is not audited.
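The READ_TIMEOUT fallback chain in the PROCINFO bullet can be pictured as a three-step lookup. The following is a minimal sketch, not awkrs's code: the function name `resolve_read_timeout`, the plain `HashMap` standing in for PROCINFO, and the use of awk's default SUBSEP (`\x1c`) to form the per-input composite key are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// awk's default SUBSEP, assumed here to join the composite subscript.
const SUBSEP: &str = "\x1c";

/// Resolve the read timeout for one input (hypothetical helper).
/// Lookup order: per-input key, then the global PROCINFO key,
/// then the value of the GAWK_READ_TIMEOUT environment variable.
fn resolve_read_timeout(
    procinfo: &HashMap<String, String>,
    input: &str,
    env_value: Option<&str>, // GAWK_READ_TIMEOUT, if set
) -> Option<u64> {
    let per_input = format!("{}{}READ_TIMEOUT", input, SUBSEP);
    procinfo
        .get(&per_input)
        .or_else(|| procinfo.get("READ_TIMEOUT"))
        .map(String::as_str)
        .or(env_value)
        .and_then(|s| s.trim().parse::<u64>().ok())
}

fn main() {
    let mut procinfo = HashMap::new();
    procinfo.insert("READ_TIMEOUT".to_string(), "500".to_string());
    procinfo.insert(format!("/dev/stdin{}READ_TIMEOUT", SUBSEP), "50".to_string());

    // Per-input key wins, then the global key, then the environment.
    println!("{:?}", resolve_read_timeout(&procinfo, "/dev/stdin", Some("9000")));
    println!("{:?}", resolve_read_timeout(&procinfo, "data.txt", Some("9000")));
    println!("{:?}", resolve_read_timeout(&HashMap::new(), "data.txt", Some("9000")));
}
```

A value that does not parse as an unsigned integer is treated as "no timeout" in this sketch; how awkrs handles malformed values is not stated above.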


[0x01] SYSTEM REQUIREMENTS

  • Rust toolchain (rustc + cargo)
  • A C compiler and make for gmp-mpfr-sys (pulled in by rug for -M); typical macOS/Linux setups already satisfy this.

[0x02] INSTALLATION

brew tap MenkeTechnologies/menketech                   # one-time
brew install awkrs                                     # via Homebrew tap

cargo install awkrs                                    # from crates.io

git clone https://github.com/MenkeTechnologies/awkrs   # from source
cd awkrs && cargo build --release

Zsh completion:

fpath=(/path/to/awkrs/completions $fpath)
autoload -Uz compinit && compinit

[0x03] LANGUAGE COVERAGE

Compatibility matrix: BSD awk, mawk, and gawk vs awkrs.

┌──────────────────────────────────────────────────────────────┐
│ SUBSYSTEM: LEXER ████ PARSER ████ COMPILER ████ VM ████ │
└──────────────────────────────────────────────────────────────┘

  • Rules: BEGIN, END, BEGINFILE/ENDFILE, empty pattern, /regex/, expression patterns, range patterns (/a/,/b/ or NR==1,NR==5). Like gawk, the four special patterns must use { … }; record rules may omit braces for the default { print $0 }.
  • Statements: if/while/do…while/for (C-style and for (i in arr)), switch/case/default (gawk-style: no fall-through, regex case /re/), print/printf (with >, >>, |, |& redirection), break, continue, next, nextfile, exit, delete, return, getline (primary, < file, <& cmd, expr | getline [var]).
  • getline as expression: value 1 (read), 0 (EOF), -1 (error), -2 (gawk retryable I/O when PROCINFO[input,"RETRY"] is set).
  • Operators: arithmetic, comparison, string concat, ternary, in, ~/!~, ++/-- (prefix/postfix on vars, $n, a[k]), ^/** (right-associative; unary +/-/! bind looser, so -2^2 = -(2^2)). Division by zero (/, compound /=) is a fatal error (gawk-style: division by zero attempted), not infinity.
  • Primary /: The lexer may emit / as division when regex_mode is false (e.g. after =). At a primary position / cannot start division (division is a binary operator), so the parser re-reads it as /regex/. In expression context a bare /re/ means $0 ~ /re/ (POSIX), so e.g. !/foo/, if (/foo/), and x = /foo/ use a match against the current record, not a string literal. The RHS of ~ / !~ and regexp arguments to gsub/sub/gensub/match/split/patsplit still treat /re/ as the pattern only (so b/c in a replacement string stays division).
  • gawk regexp constants: @/pattern/ yields a regexp value (typeof reports regexp); ~ uses the pattern as a regex.
  • Data: fields, scalars, associative arrays (a[k], a[i,j] with SUBSEP), ARGC/ARGV (set before BEGIN; ARGV[0] is the executable, ARGV[1..] are file paths). FS (regex when multi-char), FPAT (gawk-style: non-empty splits by regex match), split/patsplit (3rd arg accepts regex; patsplit 4-arg form populates seps). POSIX record model: NF = n truncates or extends fields and rebuilds $0 with OFS; $0 = "…" re-splits and updates NF. FS/FPAT from literals: bytecode may store source "…" as an internal literal string, but cached_fs sync on read still tracks those assignments. Whole array in a scalar context (print a, string concat, printf args, ~ operands, etc.) is a fatal runtime error (gawk-style), not a silent empty string. Scalars: uninitialized variables compare like numeric 0 where POSIX expects dual 0/"". String constants vs input: program string literals are not numeric strings for </<=/>/>= the way $n can be; arithmetic still uses longest-prefix string→number ("3.14abc"+0 → 3.14). split("", arr, fs) returns 0 (no empty pseudo-field).
  • Records & env: RS/RT as documented above. ENVIRON, CONVFMT, OFMT, FIELDWIDTHS, IGNORECASE (case-insensitive regex + ==/!=/ordering via strcoll), ARGIND, ERRNO, LINT, TEXTDOMAIN, BINMODE. PROCINFO/FUNCTAB/SYMTAB as in §0x00.
  • CLI extensions: -k/--csv enables CSV mode (RFC-style quoting, "" escape) — sets FS/FPAT and uses a dedicated parser aligned with gawk --csv.
  • Builtins: length, index (empty needle → 1, matching gawk), substr (gawk rule, not POSIX: if start < 1, clamp to 1 and leave length unchanged; POSIX shortens length by 1 - start), intdiv, mkbool, split, sprintf/printf (flags, * and %n$ positional, gawk %', conversions %s %d %i %u %o %x %X %f %e %E %g %G %c %%; %e/%E use signed two-digit exponents; %c uses a string’s first character), gsub/sub/match, gensub, isarray, tolower/toupper, int, math (sin cos atan2 exp log sqrt), rand/srand, systime, strftime (0–3 args), mktime, system, close, fflush, bit ops (and or xor lshift rshift compl), strtonum, asort/asorti. User-defined functions with parameter locals.
  • Static checks: Before bytecode emission, the compiler rejects parenthesized comma lists except in print/printf arguments and (… ) in arr keys (e.g. (1,2) alone as a statement is an error). gsub/sub require at least two arguments; split/match at least two; patsplit two to four; gensub three or four — otherwise a clear runtime-style error is returned at compile load time. User-defined recursion is capped (256 nested calls in release builds; lower in unit tests) so pathological self-calls fail with an error instead of overflowing the host stack.
  • Expressions: integer literals use gawk rules in source: 0x/0X hex; leading 0 octal when all digits are 0–7 (otherwise decimal, e.g. 01238 → 1238); floats with a . use a decimal integer part (077.5 → 77.5). Multidimensional membership (i,j) in arr uses a parenthesized comma list (gawk); it may appear alone as a print argument to emit several fields.
  • I/O model: main record loop and unredirected getline share one BufReader so line order matches POSIX. exit from BEGIN or a pattern action still runs END rules, then exits with the requested code.
  • Locale & pipes: Unix string compare/order uses strcoll (LC_COLLATE/LC_ALL). |& and <& run under sh -c (mixing | and |& on the same command is an error). With -N, LC_NUMERIC applies to sprintf/printf floats and %' grouping; without -N, %' still uses localeconv()'s thousands separator (fallback ,). -N does not affect parsing of numeric strings from input.
  • Gawk extras: @include, @load "*.awk", @namespace "…" (default identifier prefixing; built-ins exempt), indirect calls (@name(…) / @(expr)(…)), /inet/tcp/… and /inet/udp/… client sockets, gettext builtins (bindtextdomain, dcgettext, dcngettext with .mo catalogs via the gettext crate), -M/--bignum MPFR.
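The source-literal rules in the Expressions bullet (hex prefix, octal only when every digit is 0–7, decimal otherwise) can be sketched as a small classifier. This is an illustrative reading of those rules, not the awkrs lexer; the function name `parse_source_number` is hypothetical, and signs and exponents are left to Rust's float parser.

```rust
/// Interpret a numeric literal as written in awk *source* text
/// (hypothetical helper following the rules listed above).
fn parse_source_number(lit: &str) -> Option<f64> {
    // 0x / 0X prefix: hexadecimal integer.
    if let Some(hex) = lit.strip_prefix("0x").or_else(|| lit.strip_prefix("0X")) {
        return u64::from_str_radix(hex, 16).ok().map(|v| v as f64);
    }
    // Leading 0 and every character a digit in 0..=7: octal integer.
    let all_octal = lit.len() > 1
        && lit.starts_with('0')
        && lit.bytes().all(|b| (b'0'..=b'7').contains(&b));
    if all_octal {
        return u64::from_str_radix(lit, 8).ok().map(|v| v as f64);
    }
    // Everything else, including 01238 and 077.5, is read as decimal.
    lit.parse::<f64>().ok()
}

fn main() {
    for lit in ["0x1F", "017", "01238", "077.5", "42"] {
        println!("{} -> {:?}", lit, parse_source_number(lit));
    }
}
```

The `.` in `077.5` fails the all-octal test, so the literal falls through to the decimal branch, which is the behaviour the bullet describes.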

[0x04] MULTITHREADING // PARALLEL EXECUTION GRID

 ┌─────────────────────────────────────────────┐
 │  WORKER 0  ▓▓  CHUNK 0   ██ REORDER QUEUE  │
 │  WORKER 1  ▓▓  CHUNK 1   ██ ──────────────>│
 │  WORKER 2  ▓▓  CHUNK 2   ██  DETERMINISTIC │
 │  WORKER N  ▓▓  CHUNK N   ██  OUTPUT STREAM  │
 └─────────────────────────────────────────────┘

Default -j/--threads is 1. Pass a higher value when the program is parallel-safe (static check: no range patterns, no exit/nextfile/delete, no primary getline, no pipe/coproc getline, no asort/asorti, no indirect calls, no print/printf redirection, no cross-record assignments). Records are processed in parallel via rayon and output is reordered to input order within each batch so pipelines stay deterministic.

Regular files are memory-mapped (memmap2) and scanned with the same RS rules as the sequential path, with no read() copy of the whole file. In parallel mode, stdin is read in batches of up to --read-ahead lines (default 1024): each batch is dispatched to workers and emitted in input order before the next batch is read.

Workers run the same bytecode VM as the sequential path. The compiled program is shared via Arc<CompiledProgram> (one compile, cheap refcount per worker) with per-worker runtime state.

Fallback: non-parallel-safe programs run sequentially with a warning when -j > 1. Programs that use primary getline (including in BEGIN) also run sequentially for file input. END only sees post-BEGIN global state — record-rule mutations from parallel workers are not merged.
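The "process a batch in parallel, emit in input order" idea can be sketched with the standard library alone. This is a simplified stand-in under stated assumptions: awkrs uses rayon, whereas the sketch uses `std::thread::scope`; the function name `process_batch_in_order` and the contiguous-chunk split are hypothetical.

```rust
use std::thread;

/// Run `f` over one batch of records on up to `workers` threads and
/// return the outputs in input order, whichever worker finishes first.
fn process_batch_in_order<F>(records: &[String], workers: usize, f: F) -> Vec<String>
where
    F: Fn(&str) -> String + Sync,
{
    let workers = workers.max(1);
    let chunk_len = ((records.len() + workers - 1) / workers).max(1);
    let f = &f;

    thread::scope(|scope| {
        // One handle per contiguous chunk, kept in chunk (= input) order.
        let handles: Vec<_> = records
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|r| f(r.as_str())).collect::<Vec<String>>())
            })
            .collect();

        // Joining in spawn order is what restores the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}

fn main() {
    let records: Vec<String> = (1..=8).map(|i| format!("line {}", i)).collect();
    let out = process_batch_in_order(&records, 3, |r| r.to_uppercase());
    for line in out {
        println!("{}", line);
    }
}
```

Each worker only sees its own records and returns owned output, which mirrors why programs with cross-record assignments are excluded from the parallel path.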


[0x05] BYTECODE VM // EXECUTION CORE

┌──────────────────────────────────────────────────────────────┐
│ ARCHITECTURE: STACK VM    OPTIMIZATION: PEEPHOLE FUSED │
└──────────────────────────────────────────────────────────────┘

awkrs compiles AWK programs into a flat bytecode instruction stream and runs them on a stack VM. Short-circuit &&/||, control flow, and range patterns resolve to jump-patched offsets at compile time. The string pool interns variable names and string constants for cheap u32 indexing.
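To make "jump-patched offsets" concrete, here is a toy compiler and stack VM for short-circuit `&&`. It is a sketch of the general technique only: the `Expr` and `Op` types, the opcode names, and the exact instruction sequence are invented for this example and are not awkrs's bytecode.

```rust
/// Toy expression tree (hypothetical).
enum Expr {
    Num(f64),
    And(Box<Expr>, Box<Expr>),
}

/// Toy opcodes (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    PushNum(f64),
    Pop,
    /// If the top of stack is zero, jump to the target and keep the value.
    JumpIfFalseKeep(usize),
    /// Replace the top of stack with 1 or 0.
    ToBool,
}

fn compile(e: &Expr, code: &mut Vec<Op>) {
    match e {
        Expr::Num(n) => code.push(Op::PushNum(*n)),
        Expr::And(lhs, rhs) => {
            compile(lhs, code);
            // Emit the jump with a placeholder target, remember where it is.
            let patch_at = code.len();
            code.push(Op::JumpIfFalseKeep(usize::MAX));
            code.push(Op::Pop);
            compile(rhs, code);
            // The target is now known: patch the placeholder in place.
            let end = code.len();
            code[patch_at] = Op::JumpIfFalseKeep(end);
            code.push(Op::ToBool);
        }
    }
}

fn run(code: &[Op]) -> f64 {
    let mut stack: Vec<f64> = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        match code[pc] {
            Op::PushNum(n) => stack.push(n),
            Op::Pop => {
                stack.pop();
            }
            Op::ToBool => {
                let v = stack.pop().expect("stack underflow");
                stack.push(if v != 0.0 { 1.0 } else { 0.0 });
            }
            Op::JumpIfFalseKeep(target) => {
                if *stack.last().expect("stack underflow") == 0.0 {
                    pc = target;
                    continue;
                }
            }
        }
        pc += 1;
    }
    stack.pop().expect("empty stack")
}

fn main() {
    let expr = Expr::And(Box::new(Expr::Num(2.0)), Box::new(Expr::Num(0.0)));
    let mut code = Vec::new();
    compile(&expr, &mut code);
    println!("{:?}", code);
    println!("{}", run(&code));
}
```

Because every target is resolved while compiling, the interpreter loop never has to search for labels at run time.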

fusevm offload (on by default): eligible numeric bytecode chunks are lowered to fusevm (the shared bytecode VM also used by zshrs and strykelang) and run on fusevm::VM. Set AWKRS_FUSEVM=0 to force the bytecode interpreter for every chunk. src/fusevm_bridge.rs translates an eligible chunk (is_fusevm_eligible) into a fusevm::Chunk (build_numeric_chunk), runs it, and writes modified slots back into the awkrs runtime. Each per-record chunk is a stable chunk (no baked-in seed preamble); the accumulator's prior value is seeded as data into the VM's base frame before run(), so the chunk, and therefore its op-hash, is identical across every record, which is what lets the JIT-compiled native code be reused across records and across processes.

int(x) is admitted into the numeric chunk and lowers to the native fusevm::Op::AwkInt (Cranelift trunc), the first AWK builtin reachable end-to-end from awkrs into fusevm's native AWK-op set and JIT. The transcendental math builtins sin/cos/exp/atan2 are likewise admitted, lowering to native fusevm::Op::AwkSin/AwkCos/AwkExp/AwkAtan2: Cranelift libcalls to small Rust helpers that canonicalize a NaN result to +nan, matching awkrs/gawk's NaN-sign normalization. The gawk bitwise builtins and/or/xor (variadic, ≥2 args) are admitted too, lowering to native fusevm::Op::AwkAnd/AwkOr/AwkXor (Cranelift band/bor/bxor over a saturating f64→i64 conversion that matches awkrs's num_to_u64 for huge/NaN operands). sqrt/log stay interpreter-side because they emit a host stderr warning on a negative argument that a pure native op cannot reproduce; lshift/rshift/compl stay interpreter-side because they raise a fatal on a negative argument (a value-dependent trap), unlike the trap-free and/or/xor.

Eager block-JIT compilation for offloaded loops: a BEGIN/END loop chunk offloaded via run_fusevm_region calls fusevm::VM::run() exactly once, so under fusevm's normal warm-up threshold (compile after N invocations) the block JIT would never compile it and the whole loop would run on fusevm's slower interpreter. Because eligible_loop_prefix only selects a region that contains a backward jump (a genuinely hot loop), the bridge forces the block-JIT threshold to 0 (compile on the first invocation) around that single run(), so the loop compiles to native code immediately. Measured: a 20M-iteration s += sin(i) loop dropped from ~12.8s (interpreter) to ~0.15s, and an x = and(x, i) + 1 loop from ~14.0s to ~0.13s (~85–110×), both bit-for-bit identical to AWKRS_FUSEVM=0.

Division and modulo are offloaded and block-JIT-compiled: a chunk containing /, %, or compound /= or %= lowers to the native trapping ops fusevm::Op::AwkDivJit/AwkModJit, which the block JIT compiles with a guarded zero-divisor early-exit (fcmp eq divisor, 0.0 → a trap libcall that raises the fatal division by zero attempted / …in %' for modulo, else fdiv/fmod). So a hot division loop runs as native code: a 20M-iteration division loop dropped from ~8.0s (interpreter) to ~0.12s, ~68×. A zero divisor reached inside the compiled loop still raises the POSIX fatal instead of producing inf/NaN or hanging. The trap libcall is not a registered host-helper id, so AwkDivJit/AwkModJit chunks JIT in-process only and skip the on-disk cache (no schema impact). fusevm also still defines the interpreter-only trapping Op::AwkDiv/AwkMod (distinct from its shell-arithmetic Op::Div/Op::Mod used by zshrs/strykelang); awkrs emits the *Jit variants.
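Two of the value-level guarantees described above, the guarded zero-divisor check and NaN canonicalization, are easy to state in plain Rust. The sketch below shows the semantics only; it is not the Cranelift lowering, and the helper names `awk_div` and `canonicalize_nan` are hypothetical.

```rust
/// Divide with the zero-divisor guard: a zero divisor is reported as an
/// error instead of producing inf or NaN (hypothetical helper).
fn awk_div(dividend: f64, divisor: f64) -> Result<f64, String> {
    // `-0.0 == 0.0` is true for IEEE floats, so both zeros are caught.
    if divisor == 0.0 {
        return Err("division by zero attempted".to_string());
    }
    Ok(dividend / divisor)
}

/// Replace a NaN of either sign with the positive quiet-NaN bit pattern,
/// so the result formats consistently (hypothetical helper).
fn canonicalize_nan(x: f64) -> f64 {
    if x.is_nan() {
        f64::from_bits(0x7ff8_0000_0000_0000)
    } else {
        x
    }
}

fn main() {
    println!("{:?}", awk_div(6.0, 3.0));
    println!("{:?}", awk_div(1.0, 0.0));
    // sin(inf) is NaN; after canonicalization its sign bit is clear.
    let s = canonicalize_nan(f64::INFINITY.sin());
    println!("{} {}", s.is_nan(), s.is_sign_positive());
}
```

Returning a `Result` here plays the role of the trap: the caller decides to abort, which is the "fatal" behaviour described in the text.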

fusevm persistent JIT cache (~/.cache/fusevm-jit): awkrs builds fusevm with the jit-disk-cache feature (see Cargo.toml), so when the fusevm path is active fusevm's Cranelift tiers persist native-compiled code to ~/.cache/fusevm-jit and reload it across processes — the same on-disk JIT cache zshrs uses. Two tiers write there: the block JIT compiles a fully-eligible per-record numeric chunk to native code (*.blk.fjit), and the tracing JIT compiles hot in-chunk loop traces (*.trc.fjit). This is distinct from awkrs's own ~/.awkrs/scripts.rkyv bytecode cache (below): one caches fusevm-emitted machine code keyed by chunk op-hash (and slot-kind hash), the other caches awkrs bytecode keyed by source. Override the directory with FUSEVM_JIT_CACHE_DIR (off disables); cap it with FUSEVM_JIT_CACHE_MAX_BYTES. fusevm's block JIT round-trips awk's f64 slots through its SlotKind::Float bit-pattern model (a slot is an i64 holding the raw f64 bits; GetSlot/SetSlot bitcast through f64), so an x = int(x + c) accumulator now block-JIT-compiles natively and caches a *.blk.fjit artifact reused on the next run — verified producing the artifact and matching the bytecode interpreter bit-for-bit. Coverage is still partial — the tracing JIT does not yet hot-trace every top-tested for/while shape, and only the numeric chunk set below is eligible. So the cache engages for the chunks fusevm can compile and is inert for the rest. The offload is on by default (AWKRS_FUSEVM=0 disables it); with the block JIT's f64-slot support (below) and eager compilation of offloaded loop regions (above) it is a large win on JIT-compilable numeric loops — measured 14–110× over the bytecode interpreter for pure-arithmetic and builtin (sin/and/…) accumulator loops — while staying bit-for-bit faithful to the bytecode interpreter, and ineligible chunks run on the interpreter exactly as before. 
The default execution path for ineligible chunks is the bytecode interpreter with peephole-fused superinstructions; AWKRS_JIT=0 (or -s/--no-optimize) disables peephole optimization too.
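The bit-pattern slot model mentioned above (an i64 slot holding the raw f64 bits) amounts to a lossless bitcast in each direction. A minimal sketch, assuming a plain `[i64]` slot array and hypothetical `set_slot`/`get_slot` helpers rather than fusevm's actual API:

```rust
/// Store an f64 in an integer slot by reinterpreting its bits (no rounding).
fn set_slot(slots: &mut [i64], idx: usize, value: f64) {
    slots[idx] = value.to_bits() as i64;
}

/// Load the f64 back out of the slot with the inverse bitcast.
fn get_slot(slots: &[i64], idx: usize) -> f64 {
    f64::from_bits(slots[idx] as u64)
}

fn main() {
    let mut slots = [0i64; 1];
    set_slot(&mut slots, 0, 0.0);
    // x = int(x + c), repeated: the accumulator lives in the slot as raw bits.
    for _ in 0..3 {
        let x = get_slot(&slots, 0);
        set_slot(&mut slots, 0, (x + 2.75).trunc());
    }
    println!("{}", get_slot(&slots, 0));
}
```

Since the casts between `u64` and `i64` preserve every bit, values such as -0.0 and fractional numbers survive the round trip unchanged.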

fusevm bridge per-record caches (in-process): the persistent on-disk cache above caches compiled native code keyed by fusevm-op-hash; the bridge ALSO maintains in-process caches on Runtime that catch the upstream work the disk cache can't touch. (1) fuse_chunk_cache: HashMap<(chunk_ptr, bignum), Option<Arc<(fusevm::Chunk, Vec<u16>)>>> caches the built fusevm::Chunk per awkrs Chunk so build_numeric_chunk's eligibility check + 2-pass op→fusevm translation only runs once per (chunk, bignum), not per record. (2) fuse_last_chunk_key/fuse_last_chunk_value form a single-slot side-table that hoists the HashMap lookup out of the per-record path entirely for the common single-rule case (one tuple compare + Arc::clone, no HashMap touch). (3) fuse_vm_pool: fusevm::VMPool recycles fusevm::VM instances across records — VM::reset(chunk) preserves Vec capacities (stack, frames, slot_buf, globals) so subsequent records reuse the underlying allocations. (4) The cache value's Vec<u16> is the precomputed write-slot set — per-record writeback only walks Op::SetSlot targets instead of all N runtime slots. (5) Slot seeding goes direct from ctx.rt.slots into vm.set_slot with no intermediate Vec<f64> allocation. Cumulative win on the awkrs JIT path: 11% on count_gt_5m over 10M records, 6% on compound_pred (release best-of-3). The JIT path is still net-slower than the awkrs interpreter on tight one-op-body micro-benches — the remaining gap is fusevm's per-call VM setup (chunk move into the VM, JIT entry overhead) which would need fusevm-side API changes (Arc-aware VM::reset) to cut further.
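Caches (1) and (2) above combine a keyed map with a single-slot "last key" shortcut. The sketch below illustrates that pattern in isolation; the `ChunkCache` type, its field names, and the `builds` counter are hypothetical, and a `usize` stands in for the chunk pointer.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// (chunk identity, bignum flag); a usize stands in for the chunk pointer.
type Key = (usize, bool);

struct ChunkCache<T> {
    map: HashMap<Key, Option<Arc<T>>>,
    /// Single-slot side table: the most recently used key and its value.
    last: Option<(Key, Option<Arc<T>>)>,
    /// How many times the expensive build step actually ran.
    builds: usize,
}

impl<T> ChunkCache<T> {
    fn new() -> Self {
        ChunkCache { map: HashMap::new(), last: None, builds: 0 }
    }

    /// `build` returns None for an ineligible chunk; that answer is cached too.
    fn get_or_build(&mut self, key: Key, build: impl FnOnce() -> Option<T>) -> Option<Arc<T>> {
        // Fast path: same key as the previous call, no HashMap access.
        if let Some((k, v)) = &self.last {
            if *k == key {
                return v.clone();
            }
        }
        let builds = &mut self.builds;
        let value = self
            .map
            .entry(key)
            .or_insert_with(|| {
                *builds += 1;
                build().map(Arc::new)
            })
            .clone();
        self.last = Some((key, value.clone()));
        value
    }
}

fn main() {
    let mut cache: ChunkCache<String> = ChunkCache::new();
    for _record in 0..1000 {
        let chunk = cache.get_or_build((1, false), || Some("translated chunk".to_string()));
        assert!(chunk.is_some());
    }
    println!("builds: {}", cache.builds);
}
```

For a single-rule program every record after the first hits the fast path, which costs one tuple comparison and one `Arc` clone.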

Eligible chunks (fusevm offload): pure-numeric bodies: constants, arithmetic/comparisons, ++/--, compound assignments (+=, -=, *=, /=, %=, ^=), division/modulo (/ and %, native trapping Op::AwkDivJit/AwkModJit with a guarded zero-divisor early-exit), jumps and fused loop tests, scalar slot reads/writes whose values stay numeric, int(x) (native Op::AwkInt), the transcendentals sin/cos/exp/atan2 (native Op::AwkSin/AwkCos/AwkExp/AwkAtan2, NaN→+nan), and the bitwise builtins and/or/xor (native Op::AwkAnd/AwkOr/AwkXor, saturating f64→i64). When such a chunk is a hot loop offloaded as a region, the block JIT is eager-compiled on the first run() (see above) so the native code is used immediately rather than after a warm-up. Chunks that touch strings, fields, arrays, regexes, getline, print, user calls, or other builtins (including sqrt/log, which warn on negative args, and lshift/rshift/compl, which raise a fatal on negative args) are ineligible. -M/--bignum disables the path entirely (MPFR values can't be represented as f64 slots). Consulted by default (AWKRS_FUSEVM=0 forces everything onto the bytecode interpreter).

Peephole fusion combines common sequences into single opcodes:

  • print $N → PrintFieldStdout (zero-alloc field write)
  • s += $N → AddFieldToSlot (in-place numeric parse)
  • i = i + 1 / i++ / ++i → IncrSlot (one numeric add, no stack traffic)
  • s += i between slots → AddSlotToSlot
  • $1 "," $2 literal concat → ConcatPoolStr
  • NR++ HashMap-path → IncrVar
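The fusions listed above are instances of a peephole pass: scan the instruction stream with a small window and replace known sequences with one opcode. The sketch below shows the idea for two of them. Only IncrSlot and AddSlotToSlot are names from this document; the unfused opcodes (`GetSlot`, `PushNum`, `Add`, `SetSlot`) and their stack behaviour are assumptions for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Op {
    GetSlot(u16),
    PushNum(f64),
    Add,
    SetSlot(u16),
    /// Fused: slot += 1.
    IncrSlot(u16),
    /// Fused: dst += src.
    AddSlotToSlot { src: u16, dst: u16 },
}

/// Rewrite known four-op sequences into single fused opcodes.
fn fuse(code: &[Op]) -> Vec<Op> {
    let mut out = Vec::with_capacity(code.len());
    let mut i = 0;
    while i < code.len() {
        match &code[i..] {
            // i = i + 1
            [Op::GetSlot(a), Op::PushNum(n), Op::Add, Op::SetSlot(b), ..]
                if a == b && *n == 1.0 =>
            {
                out.push(Op::IncrSlot(*a));
                i += 4;
            }
            // s = s + i, both operands in slots
            [Op::GetSlot(dst), Op::GetSlot(src), Op::Add, Op::SetSlot(d2), ..] if dst == d2 => {
                out.push(Op::AddSlotToSlot { src: *src, dst: *dst });
                i += 4;
            }
            // No pattern matched: copy the instruction through unchanged.
            [op, ..] => {
                out.push(op.clone());
                i += 1;
            }
            [] => unreachable!("loop condition guarantees a non-empty tail"),
        }
    }
    out
}

fn main() {
    let code = vec![
        Op::GetSlot(1),
        Op::GetSlot(0),
        Op::Add,
        Op::SetSlot(1), // s = s + i
        Op::GetSlot(0),
        Op::PushNum(1.0),
        Op::Add,
        Op::SetSlot(0), // i = i + 1
    ];
    println!("{:?}", fuse(&code));
}
```

A production pass also has to leave sequences alone when a jump lands in the middle of them and re-patch jump offsets after shrinking the stream; the sketch omits both.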

Inline fast paths bypass VmCtx entirely for single-rule programs with one fused opcode ({ print $1 }, { s += $1 }). Memory-mapped files also recognize { gsub("lit", "repl"); print } with literal pattern: when the needle is absent, the loop writes each line from the mapped buffer with ORS and skips the VM.

Bytecode cache: -f script.awk invocations memoize the compiled CompiledProgram to ~/.awkrs/scripts.rkyv — an rkyv-archived shard with mmap + zero-copy ArchivedHashMap lookup on the read path (check_archived_root validation) and flock-serialized atomic-rename writes. Each entry's inner CompiledProgram blob is bincode (rkyv outer, bincode inner — same architecture as zshrs/stryke). Repeat runs skip lex/parse/compile entirely — only the matched entry's blob is decoded. Entries are invalidated on source-file mtime change or when the running awkrs binary is newer than the cached entry (any rebuild silently rebuilds the cache). Disable with AWKRS_CACHE=0. The cache only engages for the simple -f script.awk form — inline -e/--source, -E, -i/--include, -l/--load, --debug, --lint, --pretty-print, and --gen-pot skip the cache because they need the AST.
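The invalidation rule described above (source mtime changed, or binary newer than the entry) reduces to a small predicate. This is a sketch under assumptions: the `CacheEntry` struct, its fields, and both function names are hypothetical, and timestamps are passed in rather than read from the filesystem so the logic stays testable.

```rust
use std::time::{Duration, SystemTime};

/// What a cached entry would need to remember (hypothetical layout).
struct CacheEntry {
    /// mtime of the script when the entry was written.
    source_mtime: SystemTime,
    /// When the entry itself was written.
    written_at: SystemTime,
}

/// AWKRS_CACHE=0 disables the cache; any other value or no value leaves it on.
fn cache_enabled(env_value: Option<&str>) -> bool {
    env_value != Some("0")
}

/// An entry is usable only if the script is unchanged and the running
/// binary is not newer than the entry.
fn entry_is_fresh(entry: &CacheEntry, source_mtime_now: SystemTime, binary_mtime: SystemTime) -> bool {
    entry.source_mtime == source_mtime_now && binary_mtime <= entry.written_at
}

fn main() {
    let t = |secs: u64| SystemTime::UNIX_EPOCH + Duration::from_secs(secs);
    let entry = CacheEntry { source_mtime: t(100), written_at: t(200) };

    println!("unchanged:      {}", entry_is_fresh(&entry, t(100), t(150)));
    println!("script edited:  {}", entry_is_fresh(&entry, t(300), t(150)));
    println!("binary rebuilt: {}", entry_is_fresh(&entry, t(100), t(250)));
    println!("enabled:        {}", cache_enabled(std::env::var("AWKRS_CACHE").ok().as_deref()));
}
```

Comparing mtimes for equality rather than ordering means a script restored from an older backup also invalidates its entry.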

Based on a survey of the major public awk implementations (BWK awk, gawk, mawk, goawk, frawk, zawk), awkrs appears to be the first awk implementation to pair a bytecode VM with a persistent on-disk bytecode cache. frawk is the closest prior art on JIT — it has VM + Cranelift/LLVM JIT — but re-compiles on every invocation; its overview and README contain no mention of disk-persisted compiled artifacts. gawk's pm-gawk persists script-defined variables and functions across runs, not compiled bytecode — different feature. (awkrs's own fusevm/Cranelift offload is on by default for eligible numeric chunks and, with eager block-JIT compilation of offloaded loop regions, is now a large win on those loops — 14–110× over the bytecode interpreter — though it does not engage for the string/field-dominated programs most awk workloads run, so it is not claimed as the headline feature here; note the fusevm offload additionally carries its own separate persistent machine-code cache at ~/.cache/fusevm-jit, distinct from the bytecode cache claimed here.)

Implementation Bytecode VM JIT Persistent bytecode cache
BWK awk (one-true-awk) ✗ tree-walker
gawk ✗ (pm-gawk is for vars)
mawk
goawk
frawk ✓ Cranelift + LLVM
zawk (frawk fork) ✓ Cranelift + LLVM
awkrs ◐ fusevm/Cranelift (on by default; block + tracing tiers, ~/.cache/fusevm-jit)

Raw byte field extraction: print $N with default FS scans raw bytes in the mapped file buffer to find the Nth whitespace field, writes it to the output buffer, and appends Runtime::ors_bytes — no record copy, no UTF-8 validation.
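A byte-level field scan of this kind can be sketched as follows. It assumes default-FS semantics (fields separated by runs of space, tab, or newline, with leading and trailing blanks ignored); the helper names `nth_field` and `print_field` are hypothetical, and the returned slice borrows from the input so no record copy is made.

```rust
/// Return the n-th (1-based) whitespace-separated field of a record as a
/// sub-slice of the input, without copying or UTF-8 validation.
fn nth_field(line: &[u8], n: usize) -> Option<&[u8]> {
    let index = n.checked_sub(1)?; // field 0 ($0) is not handled here
    line.split(|&b| b == b' ' || b == b'\t' || b == b'\n')
        .filter(|field| !field.is_empty()) // collapse runs of blanks
        .nth(index)
}

/// Append the field (or nothing, if it is missing) and then ORS to `out`.
fn print_field(out: &mut Vec<u8>, line: &[u8], n: usize, ors: &[u8]) {
    if let Some(field) = nth_field(line, n) {
        out.extend_from_slice(field);
    }
    out.extend_from_slice(ors);
}

fn main() {
    let mut out = Vec::new();
    for line in [&b"  alpha\tbeta  gamma"[..], &b"one"[..]] {
        print_field(&mut out, line, 2, b"\n");
    }
    print!("{}", String::from_utf8_lossy(&out));
}
```

A record with fewer than n fields still emits ORS, matching awk's behaviour of printing an empty string for a missing field.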

Other optimizations:

  • Indexed slots: scalars get u16 slot indices; reads/writes are flat-array indexing instead of HashMap lookups (specials like NR/FS/OFS and array names stay on the HashMap path).
  • Zero-copy fields: fields stored as (u32, u32) byte ranges into the record string; owned Strings only on set_field.
  • Direct-to-buffer print: stdout writes go straight into a 64 KB Vec<u8> (flushed at file boundaries) — no per-record String, format!(), or stdout locking.
  • Cached separators: OFS/ORS bytes cached on the runtime, updated only on assignment. The direct-to-buffer stdout print path uses the full ofs_bytes/ors_bytes slices (arbitrary length; not capped at 64 bytes).
  • Byte-level input: read_until(b'\n') into a reusable Vec<u8> skips per-line UTF-8 validation.
  • Regex cache: compiled Regex objects cached in a HashMap<String, Regex>.
  • sub/gsub: when target is $0, applies the new record in one step. Literal needles reuse a cached memmem::Finder. Constant string operands pass via Cow (no per-call alloc).
  • parse_number: fast-paths plain decimal integer field text before falling back to str::parse::<f64>().
  • Slurped input: newline scanning uses memchr.
  • Parallel: compiled program shared via Arc across rayon workers (zero-copy).
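The parse_number bullet describes a two-tier strategy: handle plain decimal integers by hand, and fall back to the general float parser otherwise. A minimal sketch of that shape follows. It is not awkrs's function: the 15-digit cutoff is an assumption chosen so the integer is exactly representable in an f64, and the fallback does not reproduce awk's longest-prefix conversion (for example, Rust's parser also accepts words like `inf`).

```rust
/// Convert field bytes to a number (simplified illustration).
fn parse_number(field: &[u8]) -> f64 {
    // Fast path: 1 to 15 ASCII digits. Any such value is below 2^53,
    // so accumulating in a u64 and converting once is exact.
    if !field.is_empty() && field.len() <= 15 && field.iter().all(u8::is_ascii_digit) {
        let mut n: u64 = 0;
        for &b in field {
            n = n * 10 + u64::from(b - b'0');
        }
        return n as f64;
    }
    // Slow path: general float syntax; non-numeric text counts as 0.
    std::str::from_utf8(field)
        .ok()
        .and_then(|s| s.trim().parse::<f64>().ok())
        .unwrap_or(0.0)
}

fn main() {
    for field in [&b"12345"[..], &b"3.5"[..], &b"-7e2"[..], &b"abc"[..]] {
        println!("{:?} -> {}", String::from_utf8_lossy(field), parse_number(field));
    }
}
```

The fast path avoids both UTF-8 validation and the general-purpose float parser for the common case of integer columns.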

[0x06] BENCHMARKS // COMBAT METRICS (vs awk / gawk / mawk)

┌──────────────────────────────────────────────────────────────┐
│ HARDWARE: APPLE M5 MAX    OS: macOS    ARCH: arm64 │
└──────────────────────────────────────────────────────────────┘

Measured with hyperfine. BSD awk (/usr/bin/awk), GNU gawk 5.4.0, mawk 1.3.4, awkrs (see Cargo.toml for current version). Relative = mean ÷ fastest mean in that table. awkrs has two rows: default (JIT attempted) vs AWKRS_JIT=0 (bytecode only). Each table is one hyperfine invocation across all five commands on the same 1 M-line input, generated 2026-04-10 UTC by ./scripts/benchmark-vs-awk.sh and copied verbatim from benchmarks/benchmark-results.md. For the awkrs-only JIT-vs-bytecode A/B see benchmarks/benchmark-readme-jit.md.

Caveat (2026-06-01): the awkrs (JIT) rows below were generated against awkrs's former in-tree Cranelift module (src/jit.rs), which has since been removed; JIT now means the fusevm/Cranelift offload (on by default; AWKRS_FUSEVM=0 disables it), which does not engage for these string/field programs anyway (they are ineligible). The default path for such programs is the fused-superinstruction bytecode interpreter — effectively the awkrs (bytecode) row. Re-run ./scripts/benchmark-vs-awk.sh to regenerate.

1. Throughput: { print $1 } over 1 M lines

Command Mean Min Max Relative
BSD awk 195.0 ms 179.8 ms 221.6 ms 12.43×
gawk 100.8 ms 92.8 ms 115.8 ms 6.42×
mawk 66.2 ms 61.9 ms 78.4 ms 4.22×
awkrs (JIT) 15.7 ms 13.3 ms 19.6 ms 1.00×
awkrs (bytecode) 16.1 ms 13.1 ms 20.2 ms 1.03×

2. CPU-bound BEGIN (no input)

BEGIN { s = 0; for (i = 1; i < 400001; i = i + 1) s += i; print s }

Command Mean Min Max Relative
BSD awk 15.8 ms 14.0 ms 18.6 ms 1.71×
gawk 20.7 ms 18.8 ms 22.9 ms 2.24×
mawk 9.7 ms 8.3 ms 11.4 ms 1.06×
awkrs (JIT) 9.2 ms 8.4 ms 12.0 ms 1.00×
awkrs (bytecode) 9.6 ms 8.2 ms 12.0 ms 1.04×

3. Sum first column ({ s += $1 } END { print s }, 1 M lines)

Cross-record state is not parallel-safe, so awkrs stays single-threaded here. On regular-file input, awkrs uses a raw byte path: parses the Nth whitespace field directly from the mmap'd buffer.

Command Mean Min Max Relative
BSD awk 158.5 ms 147.0 ms 172.7 ms 12.27×
gawk 62.9 ms 58.4 ms 68.9 ms 4.87×
mawk 37.5 ms 33.7 ms 39.9 ms 2.90×
awkrs (JIT) 13.0 ms 11.9 ms 15.4 ms 1.01×
awkrs (bytecode) 12.9 ms 11.5 ms 16.1 ms 1.00×

4. Multi-field print ({ print $1, $3, $5 }, 1 M lines, 5 fields/line)

Command Mean Min Max Relative
BSD awk 647.6 ms 623.5 ms 686.3 ms 11.60×
gawk 266.1 ms 257.4 ms 301.8 ms 4.77×
mawk 156.6 ms 149.8 ms 170.7 ms 2.81×
awkrs (JIT) 56.4 ms 53.1 ms 61.8 ms 1.01×
awkrs (bytecode) 55.8 ms 53.4 ms 61.6 ms 1.00×

5. Regex filter (/alpha/ { c += 1 } END { print c }, 1 M lines, no matches)

Command Mean Min Max Relative
BSD awk 191.8 ms 180.1 ms 208.9 ms 17.31×
gawk 351.4 ms 342.7 ms 363.3 ms 31.72×
mawk 19.3 ms 17.5 ms 21.8 ms 1.74×
awkrs (JIT) 11.1 ms 9.5 ms 13.5 ms 1.00×
awkrs (bytecode) 11.1 ms 9.5 ms 14.6 ms 1.00×

6. Associative array ({ a[$5] += 1 } END { for (k in a) print k, a[k] }, 1 M lines)

Command Mean Min Max Relative
BSD awk 826.2 ms 792.2 ms 896.0 ms 2.43×
gawk 342.4 ms 330.6 ms 362.5 ms 1.01×
mawk 610.0 ms 588.9 ms 648.7 ms 1.79×
awkrs (JIT) 340.0 ms 324.2 ms 377.7 ms 1.00×
awkrs (bytecode) 343.7 ms 323.5 ms 356.7 ms 1.01×

7. Conditional field (NR % 2 == 0 { print $2 }, 1 M lines, 2 fields/line)

Command Mean Min Max Relative
BSD awk 289.1 ms 263.1 ms 321.1 ms 9.58×
gawk 116.1 ms 111.0 ms 124.4 ms 3.85×
mawk 71.1 ms 66.9 ms 83.6 ms 2.36×
awkrs (JIT) 30.2 ms 28.1 ms 34.0 ms 1.00×
awkrs (bytecode) 30.7 ms 28.0 ms 35.5 ms 1.02×

8. Field computation ({ sum += $1 * $2 } END { print sum }, 1 M lines, 2 fields/line)

On regular-file input with default FS, awkrs extracts both fields in a single byte scan and parses them as numbers directly from the mmap'd buffer.

Command Mean Min Max Relative
BSD awk 261.8 ms 251.4 ms 280.8 ms 13.96×
gawk 100.5 ms 95.3 ms 109.5 ms 5.36×
mawk 57.7 ms 54.5 ms 61.1 ms 3.08×
awkrs (JIT) 19.0 ms 17.6 ms 23.0 ms 1.01×
awkrs (bytecode) 18.8 ms 17.5 ms 22.8 ms 1.00×

9. String concat print ({ print $3 "-" $5 }, 1 M lines, 5 fields/line)

Command Mean Min Max Relative
BSD awk 640.8 ms 611.9 ms 689.3 ms 12.68×
gawk 182.2 ms 168.1 ms 197.2 ms 3.61×
mawk 121.0 ms 113.6 ms 128.1 ms 2.39×
awkrs (JIT) 51.0 ms 49.2 ms 53.8 ms 1.01×
awkrs (bytecode) 50.5 ms 48.8 ms 54.8 ms 1.00×

10. gsub ({ gsub("alpha", "ALPHA"); print }, 1 M lines, no matches)

Lines do not contain alpha, so this measures no-match gsub plus print. On regular-file input, awkrs uses a slurp inline path: byte memmem scan + print with no VM or per-line set_field_sep_split when the literal is absent.

Command Mean Min Max Relative
BSD awk 291.5 ms 282.3 ms 300.4 ms 21.15×
gawk 436.3 ms 425.7 ms 459.3 ms 31.66×
mawk 74.3 ms 68.8 ms 84.2 ms 5.39×
awkrs (JIT) 13.8 ms 12.8 ms 16.2 ms 1.00×
awkrs (bytecode) 13.9 ms 12.7 ms 17.6 ms 1.01×

./scripts/benchmark-vs-awk.sh                              # cross-engine §1–§10 (1 M lines)
AWKRS_BENCH_LINES=5000000 ./scripts/benchmark-vs-awk.sh    # 5 M line sweep
./scripts/benchmark-readme-jit-vs-vm.sh                    # awkrs-only JIT vs bytecode A/B

Demo scripts

Quick tours over the feature surface — runnable against the debug build (cargo build); each script auto-builds if the binary is missing.

./scripts/demo-quickstart.sh        # 16-section feature tour (CSV, FIELDWIDTHS, FPAT, RS regex,
                                    # gensub, match() capture, pipes, getline, parallel, JIT toggle)
./scripts/demo-log-parsing.sh       # applied: access log → status dist, CSV → per-host stats,
                                    # ps → top RSS, /etc/passwd → shell histogram
LINES=200000 ./scripts/demo-parallel-jit.sh   # /usr/bin/time -p sweep of -j 1 vs -j N and
                                              # JIT default vs AWKRS_JIT=0 bytecode

NO_COLOR=1 strips ANSI from the demo output. The applied demo writes its inputs to $TMPDIR and cleans up on exit.

Deep examples (examples/*.awk)

Substantive standalone awk programs that exercise recursion, multidim arrays via SUBSEP, PROCINFO["sorted_in"], three-arg match(), asort, and bit ops. Each <name>.awk ships with <name>.in (stdin input). Every example is byte-for-byte verified against gawk by the parity job in CI (bash parity/run_parity.sh gawk), so they double as gawk-extension regression tests.

| File | What it shows |
|---|---|
| bst.awk | BST insert + inorder / preorder / postorder traversals via recursion |
| heap_sort.awk | Min-heap push / pop → heapsort |
| trie.awk | Trie membership + prefix-count using SUBSEP two-level keys |
| levenshtein.awk | O(la·lb) edit-distance DP table held in a SUBSEP 2D array |
| calc_rd.awk | Recursive-descent arithmetic parser (+ - * / % ^ unary ( ), right-assoc ^) |
| topo_sort.awk | Kahn's algorithm topological sort + cycle detection |
| brainfuck.awk | Brainfuck interpreter (precomputed bracket map, modulo-256 cells) |
| rpn.awk | Postfix calculator (dup / swap / drop / neg + arith) on an explicit stack |
| json_pretty.awk | JSON tokeniser → 2-space indented pretty-printer |
| hexdump.awk | xxd-style hex + ASCII dump (16-byte rows with offset, hex, gutter, ascii) |
| csv_pivot.awk | CSV → per-group min / max / mean / total aggregation |
| graph_bfs.awk | Undirected BFS from a source + path reconstruction |
| markov.awk | Bigram model → top-3 continuations + deterministic 12-step walk |
| sql_like.awk | Mini-SQL on CSV: SELECT … WHERE … GROUP BY … SUM/AVG/COUNT … ORDER BY … |
| sudoku.awk | 9×9 Sudoku solver via recursive backtracking (row/col/box witness sets) |
| regex_engine.awk | Recursive regex matcher (`. * + ? ^ $ [...] \`) written in awk |
| dijkstra.awk | Single-source shortest paths with a binary min-heap PQ |
| kruskal.awk | MST via union-find (path halving + union-by-rank) |
| maze_bfs.awk | BFS shortest path through an ASCII maze; path overlaid with * |
| diff_lcs.awk | LCS-based unified diff with recursive traceback |
| n_queens.awk | N-queens backtracking with col / diag1 / diag2 constant-time witnesses |
| kmp.awk | Knuth-Morris-Pratt substring search (failure function + scan) |
| intervals.awk | Merge overlapping closed intervals; report total covered length |
| roman.awk | Roman numerals ↔ integers, subtractive form, range 1..3999 |
| knapsack.awk | 0/1 knapsack DP table + traceback to recover chosen items |
| prime_sieve.awk | Sieve of Eratosthenes up to N + ten-per-row pretty printing |
| floyd_warshall.awk | All-pairs shortest paths (negative weights allowed) |
| bellman_ford.awk | Single-source shortest paths + negative-cycle detection |
| scc_tarjan.awk | Tarjan's strongly connected components (recursion + lowlink) |
| conway.awk | Conway's Game of Life — fixed-grid evolution for N generations |
| a_star.awk | A* on an ASCII grid (Manhattan heuristic, tuple-keyed min-heap PQ) |
| base64.awk | RFC-4648 base64 encode + decode (no external tools) |
| base_conv.awk | Integer base conversion 2..36 in either direction |
| rle.awk | Run-length encode + decode (whitespace preserved) |
| vigenere.awk | Vigenère cipher (encrypt + decrypt; case preserved) |
| bigint_mul.awk | Arbitrary-precision multiplication via schoolbook on digit arrays |
| lru_cache.awk | LRU cache: O(1) get + put via doubly-linked list + hash map |
| segment_tree.awk | Iterative segment tree (point update + range-sum query) |
| fenwick.awk | Fenwick / Binary Indexed Tree (prefix + range sums) |
| convex_hull.awk | Convex hull via Andrew's monotone chain + shoelace area |
| lis.awk | Longest increasing subsequence in O(n log n) with traceback |
| huffman.awk | Huffman coding — build prefix tree, encode + round-trip decode |
| manacher.awk | Manacher's longest palindromic substring in O(n) |
| subset_sum.awk | Subset-sum DP + reconstruction of one valid subset |
| permutations.awk | Heap's algorithm — enumerate all n! permutations |
| tsp_dp.awk | Held-Karp bitmask DP for travelling salesman (≤15 cities) |
| anagrams.awk | Group anagrams by sorted-letter signature |
| rule30.awk | Wolfram Rule 30 elementary 1D cellular automaton |
| aho_corasick.awk | Aho-Corasick multi-pattern search (goto/fail/output trie) |
| z_function.awk | Z-array — linear-time prefix-match table + pattern search |
| rabin_karp.awk | Rolling-hash substring search (Rabin-Karp) |
| shunting_yard.awk | Dijkstra's shunting-yard infix → postfix → evaluate |
| modexp.awk | Modular exponentiation + deterministic Miller-Rabin primality |
| gcd_extended.awk | Extended Euclidean algorithm + modular inverse |
| mandelbrot.awk | ASCII Mandelbrot escape-time render |
| ini_parser.awk | INI config parser with sections / comments / globals |
| url_parser.awk | URL decomposer (scheme / user / pass / host / port / path / query / fragment) |
| tictactoe.awk | Minimax tic-tac-toe solver — best move + outcome from any position |
| coin_change.awk | Min-coins DP + reconstruction of one optimal combination |
| prim_mst.awk | Prim's MST via lazy linear frontier scan |
| boyer_moore.awk | Boyer-Moore substring search (bad-character heuristic) |
| suffix_array.awk | Suffix array + LCP (lex-sort-based) for each input line |
| avl_tree.awk | AVL self-balancing BST with insert, inorder, height, balance-check |
| quickselect.awk | Kth smallest element via Hoare partitioning |
| horner.awk | Horner's method: evaluate, derivative, synthetic division |
| pollard_rho.awk | Pollard's rho integer factorization + Miller-Rabin |
| lzw.awk | LZW compression encode + decode (256-byte dictionary start) |
| markdown_basic.awk | Markdown → HTML for a subset (headings, lists, code, links) |
| date_calc.awk | Day-of-week (Zeller), calendar generator, date difference |
| gauss_elim.awk | Gaussian elimination with partial pivoting (Ax = b) |
| twenty48.awk | 2048 board: apply L/R/U/D moves, merge tiles, track score |
| email_extract.awk | Find emails in free text + sorted unique tally |

Run any example directly (the paths below assume a release build from cargo build --release):

./target/release/awkrs -f examples/calc_rd.awk <examples/calc_rd.in
./target/release/awkrs -f examples/brainfuck.awk <examples/brainfuck.in

[0x07] BUILD // COMPILE THE PAYLOAD

cargo build --release

awkrs --help / -h prints a cyberpunk HUD (ASCII banner, status box, taglines, footer) in the style of MenkeTechnologies tp -h. ANSI colors apply when stdout is a TTY; set NO_COLOR to force plain text.
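
The color decision described above can be sketched in Rust as follows. This is a minimal sketch, not the awkrs implementation: `use_color` is a hypothetical helper, and treating an empty NO_COLOR value as unset follows the common NO_COLOR convention rather than anything stated in this README.

```rust
use std::io::IsTerminal;

/// Hypothetical helper: decide whether to emit ANSI colors.
/// `no_color` is the value of the NO_COLOR environment variable, if set.
fn use_color(stdout_is_tty: bool, no_color: Option<&str>) -> bool {
    // Assumption: any non-empty NO_COLOR value disables color.
    let disabled = no_color.map_or(false, |v| !v.is_empty());
    stdout_is_tty && !disabled
}

fn main() {
    let no_color = std::env::var("NO_COLOR").ok();
    let color = use_color(std::io::stdout().is_terminal(), no_color.as_deref());
    if color {
        // Cyan text followed by an ANSI reset.
        println!("\x1b[36mawkrs\x1b[0m");
    } else {
        println!("awkrs");
    }
}
```

Keeping the decision in a pure function that takes the TTY flag and the variable's value as arguments makes it easy to unit-test without touching the process environment.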

Regenerate the help screenshot after UI changes: ./scripts/gen-help-screenshot.sh (needs termshot on PATH and a prior cargo build). The capture runs on a PTY with NO_COLOR unset and renders at 256 columns.


[0x08] TEST // INTEGRITY VERIFICATION

cargo test

CI runs on pushes and pull requests to main via GitHub Actions: one Ubuntu lint job (cargo fmt --check, cargo clippy -- -D warnings, cargo doc with RUSTDOCFLAGS=-D warnings) plus a build/test matrix on Ubuntu and macOS.

Coverage spans library unit tests for every module (lexer, parser, format, builtins, interp, vm, jit, compiler, runtime, locale, cli, cyber_help) and integration suites under tests/ that exercise the gawk-style additions, the slurped-input path, parallel record behavior, and the full CLI surface. Cross-feature combinations (CSV + ENDFILE, paragraph RS="" + getline, FIELDWIDTHS + NF reassignment, ...) live in tests/cross_feature_integration.rs.


[0x09] DOCUMENTATION // RENDERED HTML + MARKDOWN

docs/ is published to GitHub Pages on every push to main and is the authoritative source for the rendered reference + engineering report.

| Doc | Source | Live URL |
|---|---|---|
| User reference (quickstart, builtins, variables, examples, cache + parallel notes) | docs/index.html | https://menketechnologies.github.io/awkrs/ |
| Engineering report (architecture, module table, perf stack, divergence ledger, competitive matrix) | docs/report.html | https://menketechnologies.github.io/awkrs/report.html |
| Compatibility matrix vs BSD awk / mawk / gawk | docs/COMPATIBILITY.md | renders on GitHub |
| Benchmarks vs BSD awk / mawk / gawk (hyperfine, 1 M lines) | benchmarks/benchmark-results.md | renders on GitHub |
| JIT-on vs JIT-off A/B (awkrs-only) | benchmarks/benchmark-readme-jit.md | renders on GitHub |
| Rust API docs (autogenerated) | cargo doc --open | https://docs.rs/awkrs |

The HUD-themed HTML docs (docs/index.html, docs/report.html) share hud-static.css, hud-theme.js, and tutorial.css — open them locally via file:// or browse the GitHub Pages URL above.


[0xFF] LICENSE

┌──────────────────────────────────────────────────────────────┐
│ MIT LICENSE // UNAUTHORIZED REPRODUCTION WILL BE MET         │
│ WITH FULL ICE                                                │
└──────────────────────────────────────────────────────────────┘


░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░░ >>> JACK IN. MATCH THE PATTERN. EXECUTE THE ACTION. <<< ░░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
created by MenkeTechnologies