█████╗ ██╗ ██╗██╗ ██╗██████╗ ███████╗
██╔══██╗██║ ██║██║ ██╔╝██╔══██╗██╔════╝
███████║██║ █╗ ██║█████╔╝ ██████╔╝███████╗
██╔══██║██║███╗██║██╔═██╗ ██╔══██╗╚════██║
██║ ██║╚███╔███╔╝██║ ██╗██║ ██║███████║
╚═╝ ╚═╝ ╚══╝╚══╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝
[WORLD'S FASTEST AWK BYTECODE ENGINE // PARALLEL RECORD PROCESSOR // RUST CORE]
"Pattern. Action. Domination."
awkrs runs pattern → action programs over input records like POSIX awk / GNU gawk / mawk, with a fused-superinstruction bytecode VM (plus a default-on fusevm/Cranelift offload path for eligible numeric chunks), parallel record processing, and a CLI that accepts the union of POSIX, gawk, and mawk options.
┌──────────────────────────────────────────────────────────────┐
│ STATUS: ONLINE     THREAT LEVEL: NEON     SIGNAL: ████████░░ │
└──────────────────────────────────────────────────────────────┘
Read the Docs · Engineering Report · strykelang · zshrs · fusevm
Table of Contents
- [0x00] System Scan
- [0x01] System Requirements
- [0x02] Installation
- [0x03] Language Coverage
- [0x04] Multithreading
- [0x05] Bytecode VM
- [0x06] Benchmarks
- [0x07] Build
- [0x08] Test
- [0x09] Documentation
- [0xFF] License
[0x00] SYSTEM SCAN
Positioning: POSIX awk + the gawk extensions that show up in real scripts (BEGINFILE/ENDFILE, coprocess |&, CSV mode, PROCINFO/SYMTAB/FUNCTAB, @include/@load/@namespace, /inet/tcp|udp, MPFR via -M). Performance goal: beat awk/mawk/gawk on supported workloads — see §0x06.
Bytecode cache: -f script.awk runs memoize compiled bytecode to ~/.awkrs/scripts.rkyv — repeat runs skip lex/parse/compile entirely. Source-file mtime and the awkrs-binary mtime invalidate entries silently. AWKRS_CACHE=0 disables it. Details in §0x05.
Implemented gawk-style CLI flags (where they differ from gawk, the gap is documented):
| Flag | Behavior |
|---|---|
| `-d`/`--dump-variables` | Dump globals after run (stdout, `-`, or file) |
| `-D`/`--debug` | Static rule/function listing; not gawk's interactive debugger |
| `-p`/`--profile` | Wall-clock summary + per-record-rule hit counts (`-j 1` only); not gawk's per-line profiler |
| `-o`/`--pretty-print` | AST pretty-print; not gawk's canonical reformatter |
| `-g`/`--gen-pot` | Print and exit before execution |
| `-L`/`-t`/`LINT` | Static lint (extension rules, uninit-var hints, printf format checks); when `LINT` is truthy at runtime, also emit `awkrs: warning:` on stderr for `sqrt`/`log` domain issues (negative / zero args) |
| `-S`/`--sandbox` | Block `system()`, file redirects, pipes, coprocesses, inet I/O |
| `-l name` | Load `name.awk` from `AWKPATH` (default `.`) |
| `-b` | Byte length for `length`/`substr`/`index` |
| `-n` | `strtonum`-style hex/octal coercion |
| `-s`/`--no-optimize` | Disable peephole/JIT optimization (forces the plain bytecode interpreter) |
| `-c`/`-P` | Stored on runtime; minimal effect today |
| `-r`/`--re-interval` | Parsed; no runtime effect (regex crate already supports `{m,n}`) |
| `-N`/`--use-lc-numeric` | Locale decimal radix and `%'` grouping in `sprintf`/`printf`/`print`. Does not affect string→number parsing |
Gawk parity gaps to know:
- `RS`: newline by default; one (UTF-8) char = literal delimiter; `RS=""` = paragraph mode; multi-char = gawk regex (`RT` is the matched text). `FIELDWIDTHS` selects fixed-width when non-empty.
- `PROCINFO`: refreshed before and after `BEGIN`. Includes gawk-style `platform` (`posix`/`mingw`/`vms`, not Rust's `macos`/`linux`), `version`, ids, `errno`, `api_major`/`api_minor`, `argv`, `identifiers`, `FS` (active split mode), `strftime`, `pgrpid`, `groupN`, `mb_cur_max` (Linux `sysconf`), per-input `READ_TIMEOUT`/`RETRY` composite keys with fallback chain → global `PROCINFO["READ_TIMEOUT"]` → `GAWK_READ_TIMEOUT` env. Unix primary record reads `poll` when a timeout applies. With `-M`: `gmp_version`, `mpfr_version`, `prec_min`, `prec_max`. User-set keys persist across the post-`BEGIN` refresh.
- `PROCINFO["sorted_in"]`: `@ind_*`/`@val_*` modes, plus user comparator function (2-arg = index sort, 4-arg `(i1, v1, i2, v2)` = value sort). Returns negative/zero/positive like `qsort`.
- `SYMTAB`: assignment, `for`-`in`, `length(SYMTAB)` like gawk's global introspection (not GNU's variable-object references).
- `@load`: non-`.awk` paths only accepted for gawk's bundled extension names (`filefuncs`, `readdir`, `time`, …) as no-ops; the builtins are native. Arbitrary `.so`/gawkapi modules error at parse time.
- `-M`/`--bignum`: MPFR via `rug` (default 256 bits, `PROCINFO["prec"]`/`["roundmode"]` apply). Arithmetic, `sprintf`/`printf` integer formats (no f64/i64 clamp), `int`/`intdiv`/`strtonum`/`++`/`--`, bit ops, transcendentals, `srand` (low 32 bits of previous seed), `CONVFMT`/`OFMT`/`%s`/concat/regex coercion all use MPFR. Default `CONVFMT`-style number→string for scalars uses each `Float`'s own precision for the MPFR `sprintf` path (so raising `PROCINFO["prec"]` is not undermined by a hardcoded bit count at display time). JIT is disabled in `-M` mode.
- Unicode vs bytes: `-b` honored for `length`/`substr`/`index`. Full multibyte field-splitting parity is not audited.
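The comparator convention for `PROCINFO["sorted_in"]` listed above (2-arg compares indices, 4-arg compares values, result sign as in `qsort`) can be modelled in a few lines. This is a Python sketch of the ordering rule only, not awkrs's Rust code; `traversal_order`, `by_index_desc`, and `by_value_asc` are hypothetical names made up for the illustration.

```python
from functools import cmp_to_key
import inspect

def traversal_order(arr, comparator):
    """Return the keys of `arr` in the order a for-in loop would visit them."""
    arity = len(inspect.signature(comparator).parameters)
    if arity == 2:
        # 2-arg comparator: called with the two indices only.
        key = cmp_to_key(lambda a, b: comparator(a, b))
    elif arity == 4:
        # 4-arg comparator: called as (i1, v1, i2, v2).
        key = cmp_to_key(lambda a, b: comparator(a, arr[a], b, arr[b]))
    else:
        raise ValueError("comparator must take 2 or 4 arguments")
    return sorted(arr, key=key)

def by_index_desc(i1, i2):
    # Positive when i1 should come after i2, i.e. descending index order.
    return (i1 < i2) - (i1 > i2)

def by_value_asc(i1, v1, i2, v2):
    # Negative/zero/positive like a qsort callback, ascending by value.
    return (v1 > v2) - (v1 < v2)

counts = {"pear": 3, "apple": 7, "fig": 1}
print(traversal_order(counts, by_index_desc))
print(traversal_order(counts, by_value_asc))
```

The first call visits keys in descending index order, the second in ascending value order.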
[0x01] SYSTEM REQUIREMENTS
- Rust toolchain (`rustc` + `cargo`)
- A C compiler and `make` for `gmp-mpfr-sys` (pulled in by `rug` for `-M`); typical macOS/Linux setups already satisfy this.
[0x02] INSTALLATION
&&
Zsh completion:
fpath=(/path/to/awkrs/completions )
&&
[0x03] LANGUAGE COVERAGE
Compatibility matrix: BSD awk, mawk, and gawk vs awkrs.
┌──────────────────────────────────────────────────────────────┐
│ SUBSYSTEM: LEXER ████  PARSER ████  COMPILER ████  VM ████   │
└──────────────────────────────────────────────────────────────┘
- Rules: `BEGIN`, `END`, `BEGINFILE`/`ENDFILE`, empty pattern, `/regex/`, expression patterns, range patterns (`/a/,/b/` or `NR==1,NR==5`). Like gawk, the four special patterns must use `{ … }`; record rules may omit braces for the default `{ print $0 }`.
- Statements: `if`/`while`/`do…while`/`for` (C-style and `for (i in arr)`), `switch`/`case`/`default` (gawk-style: no fall-through, regex `case /re/`), `print`/`printf` (with `>`, `>>`, `|`, `|&` redirection), `break`, `continue`, `next`, `nextfile`, `exit`, `delete`, `return`, `getline` (primary, `< file`, `<& cmd`, `expr | getline [var]`).
- `getline` as expression: value `1` (read), `0` (EOF), `-1` (error), `-2` (gawk retryable I/O when `PROCINFO[input,"RETRY"]` is set).
- Operators: arithmetic, comparison, string concat, ternary, `in`, `~`/`!~`, `++`/`--` (prefix/postfix on vars, `$n`, `a[k]`), `^`/`**` (right-associative; unary `+`/`-`/`!` bind looser, so `-2^2` = `-(2^2)`). Division by zero (`/`, compound `/=`) is a fatal error (gawk-style: `division by zero attempted`), not infinity.
- Primary `/`: The lexer may emit `/` as division when `regex_mode` is false (e.g. after `=`). At a primary position `/` cannot start division (division is a binary operator), so the parser re-reads it as `/regex/`. In expression context a bare `/re/` means `$0 ~ /re/` (POSIX), so e.g. `!/foo/`, `if (/foo/)`, and `x = /foo/` use a match against the current record, not a string literal. The RHS of `~`/`!~` and regexp arguments to `gsub`/`sub`/`gensub`/`match`/`split`/`patsplit` still treat `/re/` as the pattern only (so `b/c` in a replacement string stays division).
- gawk regexp constants: `@/pattern/` yields a regexp value (`typeof` reports `regexp`); `~` uses the pattern as a regex.
- Data: fields, scalars, associative arrays (`a[k]`, `a[i,j]` with `SUBSEP`), `ARGC`/`ARGV` (set before `BEGIN`; `ARGV[0]` is the executable, `ARGV[1..]` are file paths). `FS` (regex when multi-char), `FPAT` (gawk-style: non-empty splits by regex match), `split`/`patsplit` (3rd arg accepts regex; `patsplit` 4-arg form populates `seps`). POSIX record model: `NF = n` truncates or extends fields and rebuilds `$0` with `OFS`; `$0 = "…"` re-splits and updates `NF`. `FS`/`FPAT` from literals: bytecode may store source `"…"` as an internal literal string, but `cached_fs` sync on read still tracks those assignments. Whole array in a scalar context (`print a`, string concat, `printf` args, `~` operands, etc.) is a fatal runtime error (gawk-style), not a silent empty string. Scalars: uninitialized variables compare like numeric `0` where POSIX expects dual 0/"". String constants vs input: program string literals are not numeric strings for `<`/`<=`/`>`/`>=` the way `$n` can be; arithmetic still uses longest-prefix string→number (`"3.14abc"+0` → `3.14`). `split("", arr, fs)` returns `0` (no empty pseudo-field).
- Records & env: `RS`/`RT` as documented above. `ENVIRON`, `CONVFMT`, `OFMT`, `FIELDWIDTHS`, `IGNORECASE` (case-insensitive regex + `==`/`!=`/ordering via `strcoll`), `ARGIND`, `ERRNO`, `LINT`, `TEXTDOMAIN`, `BINMODE`. `PROCINFO`/`FUNCTAB`/`SYMTAB` as in §0x00.
- CLI extensions: `-k`/`--csv` enables CSV mode (RFC-style quoting, `""` escape); it sets `FS`/`FPAT` and uses a dedicated parser aligned with `gawk --csv`.
- Builtins: `length`, `index` (empty needle → `1`, matching gawk), `substr` (gawk rule, not POSIX: if `start < 1`, clamp to `1` and leave `length` unchanged; POSIX shortens `length` by `1 - start`), `intdiv`, `mkbool`, `split`, `sprintf`/`printf` (flags, `*` and `%n$` positional, gawk `%'`, conversions `%s %d %i %u %o %x %X %f %e %E %g %G %c %%`; `%e`/`%E` use signed two-digit exponents; `%c` uses a string's first character), `gsub`/`sub`/`match`, `gensub`, `isarray`, `tolower`/`toupper`, `int`, math (`sin` `cos` `atan2` `exp` `log` `sqrt`), `rand`/`srand`, `systime`, `strftime` (0–3 args), `mktime`, `system`, `close`, `fflush`, bit ops (`and` `or` `xor` `lshift` `rshift` `compl`), `strtonum`, `asort`/`asorti`. User-defined `function` with parameter locals.
- Static checks: Before bytecode emission, the compiler rejects parenthesized comma lists except in `print`/`printf` arguments and `(… ) in arr` keys (e.g. `(1,2)` alone as a statement is an error). `gsub`/`sub` require at least two arguments; `split`/`match` at least two; `patsplit` two to four; `gensub` three or four; otherwise a clear runtime-style error is returned at compile load time. User-defined recursion is capped (256 nested calls in release builds; lower in unit tests) so pathological self-calls fail with an error instead of overflowing the host stack.
- Expressions: integer literals use gawk rules in source: `0x`/`0X` hex; leading `0` octal when all digits are `0`–`7` (otherwise decimal, e.g. `01238` → 1238); floats with a `.` use a decimal integer part (`077.5` → 77.5). Multidimensional membership `(i,j) in arr` uses a parenthesized comma list (gawk); it may appear alone as a `print` argument to emit several fields.
- I/O model: main record loop and unredirected `getline` share one `BufReader` so line order matches POSIX. `exit` from `BEGIN` or a pattern action still runs `END` rules, then exits with the requested code.
- Locale & pipes: Unix string compare/order uses `strcoll` (`LC_COLLATE`/`LC_ALL`). `|&` and `<&` run under `sh -c` (mixing `|` and `|&` on the same command is an error). With `-N`, `LC_NUMERIC` applies to `sprintf`/`printf` floats and `%'` grouping; without `-N`, `%'` still uses `localeconv()`'s thousands separator (fallback `,`). `-N` does not affect parsing of numeric strings from input.
- Gawk extras: `@include`, `@load "*.awk"`, `@namespace "…"` (default identifier prefixing; built-ins exempt), indirect calls (`@name(…)` / `@(expr)(…)`), `/inet/tcp/…` and `/inet/udp/…` client sockets, gettext builtins (`bindtextdomain`, `dcgettext`, `dcngettext` with `.mo` catalogs via the `gettext` crate), `-M`/`--bignum` MPFR.
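The source-literal rules in the Expressions item above (hex prefix, octal only when every digit is 0 to 7, decimal otherwise, decimal integer part for floats) are easy to get wrong, so here is a small Python model of them. It is a sketch of the stated rules, not awkrs's lexer; `parse_source_number` is a hypothetical helper name.

```python
def parse_source_number(text):
    """Convert an awk source numeric literal to a float using the rules above."""
    lower = text.lower()
    if lower.startswith("0x"):
        # 0x / 0X prefix: hexadecimal integer.
        return float(int(lower[2:], 16))
    is_plain_int = text.isascii() and text.isdigit()
    if (is_plain_int and len(text) > 1 and text[0] == "0"
            and all(c in "01234567" for c in text)):
        # Leading 0 and only octal digits: octal integer.
        return float(int(text, 8))
    # Everything else, including 01238 and 077.5, is read as decimal.
    return float(text)

for literal in ("0x1F", "017", "01238", "077.5"):
    print(literal, "->", parse_source_number(literal))
```

The two interesting cases are `01238`, which falls back to decimal because of the digit `8`, and `077.5`, whose integer part is decimal because the literal contains a `.`.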
[0x04] MULTITHREADING // PARALLEL EXECUTION GRID
┌─────────────────────────────────────────────┐
│ WORKER 0 ▓▓ CHUNK 0 ██ REORDER QUEUE │
│ WORKER 1 ▓▓ CHUNK 1 ██ ──────────────>│
│ WORKER 2 ▓▓ CHUNK 2 ██ DETERMINISTIC │
│ WORKER N ▓▓ CHUNK N ██ OUTPUT STREAM │
└─────────────────────────────────────────────┘
Default -j/--threads is 1. Pass a higher value when the program is parallel-safe (static check: no range patterns, no exit/nextfile/delete, no primary getline, no pipe/coproc getline, no asort/asorti, no indirect calls, no print/printf redirection, no cross-record assignments). Records are processed in parallel via rayon and output is reordered to input order within each batch so pipelines stay deterministic.
Regular files are memory-mapped (memmap2) and scanned with the same RS rules as the sequential path, with no read() copy of the whole file. In parallel mode, stdin is read in batches of up to --read-ahead lines (default 1024): each batch is dispatched to workers, emitted in input order, and then the next batch is read.
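The batch, dispatch, emit-in-order, refill cycle described above can be sketched as follows. This is a Python model using a thread pool as a stand-in for rayon; `process_record` and `run_parallel` are hypothetical names, and the record rule is a placeholder `{ print $1 }`.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def process_record(line):
    # Stand-in for running the record rules on one line: { print $1 }.
    fields = line.split()
    return fields[0] if fields else ""

def run_parallel(lines, threads=4, read_ahead=1024):
    """Process lines in batches of `read_ahead`, keeping input order."""
    out = []
    it = iter(lines)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        while True:
            batch = list(islice(it, read_ahead))  # refill
            if not batch:
                break
            # Executor.map yields results in input order, whichever
            # worker happens to finish first.
            out.extend(pool.map(process_record, batch))
    return out

lines = [f"{i} payload" for i in range(10)]
print(run_parallel(lines, threads=3, read_ahead=4))
```

Because each batch is fully emitted before the next is read, output order matches input order across batches as well as within them.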
Workers run the same bytecode VM as the sequential path. The compiled program is shared via Arc<CompiledProgram> (one compile, cheap refcount per worker) with per-worker runtime state.
Fallback: non-parallel-safe programs run sequentially with a warning when -j > 1. Programs that use primary getline (including in BEGIN) also run sequentially for file input. END only sees post-BEGIN global state — record-rule mutations from parallel workers are not merged.
[0x05] BYTECODE VM // EXECUTION CORE
┌──────────────────────────────────────────────────────────────┐
│ ARCHITECTURE: STACK VM      OPTIMIZATION: PEEPHOLE FUSED     │
└──────────────────────────────────────────────────────────────┘
awkrs compiles AWK programs into a flat bytecode instruction stream and runs them on a stack VM. Short-circuit &&/||, control flow, and range patterns resolve to jump-patched offsets at compile time. The string pool interns variable names and string constants for cheap u32 indexing.
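To make "jump-patched offsets" concrete, here is a toy Python stack VM that compiles `lhs && rhs` with a forward jump whose target is filled in after the right-hand side has been emitted. The opcode names (`CONST`, `LOAD`, `TO_BOOL`, `POP`, `JUMP_IF_FALSE`) are invented for this sketch and are not awkrs's instruction set.

```python
def compile_and(lhs_ops, rhs_ops):
    """Emit code for `lhs && rhs`; the jump target is patched afterwards."""
    code = [list(op) for op in lhs_ops]
    code.append(["TO_BOOL"])
    code.append(["JUMP_IF_FALSE", None])  # target not known yet
    patch_at = len(code) - 1
    code.append(["POP"])                  # discard the truthy LHS result
    code.extend(list(op) for op in rhs_ops)
    code.append(["TO_BOOL"])
    code[patch_at][1] = len(code)         # patch: jump past the RHS
    return code

def run(code, env):
    stack, pc = [], 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "CONST":
            stack.append(args[0])
        elif op == "LOAD":
            stack.append(env[args[0]])
        elif op == "TO_BOOL":
            stack.append(1 if stack.pop() else 0)
        elif op == "POP":
            stack.pop()
        elif op == "JUMP_IF_FALSE":
            if not stack[-1]:             # leave the 0 on the stack as the result
                pc = args[0]
        else:
            raise ValueError(op)
    return stack.pop()

# The RHS loads a missing variable; it is never executed when the LHS is false.
print(run(compile_and([["CONST", 0]], [["LOAD", "missing"]]), {}))
print(run(compile_and([["CONST", 2]], [["CONST", 5]]), {}))
```

The first program never touches the right-hand side, which is the short-circuit behaviour; the second evaluates both sides and yields 1.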
fusevm offload (on by default): eligible numeric bytecode chunks are lowered to fusevm — the shared bytecode VM also used by zshrs and strykelang — and run on fusevm::VM. Set AWKRS_FUSEVM=0 to force the bytecode interpreter for every chunk. src/fusevm_bridge.rs translates an eligible chunk (is_fusevm_eligible) into a fusevm::Chunk (build_numeric_chunk), runs it, and writes modified slots back into the awkrs runtime. Each per-record chunk is a stable chunk (no baked-in seed preamble); the accumulator's prior value is seeded as data into the VM's base frame before run(), so the chunk — and therefore its op-hash — is identical across every record, which is what lets the JIT-compiled native code be reused across records and across processes. int(x) is admitted into the numeric chunk and lowers to the native fusevm::Op::AwkInt (Cranelift trunc) — the first AWK builtin reachable end-to-end from awkrs into fusevm's native AWK-op set and JIT. The transcendental math builtins sin/cos/exp/atan2 are likewise admitted, lowering to native fusevm::Op::AwkSin/AwkCos/AwkExp/AwkAtan2 — Cranelift libcalls to small Rust helpers that canonicalize a NaN result to +nan, matching awkrs/gawk's NaN-sign normalization. The gawk bitwise builtins and/or/xor (variadic, ≥2 args) are admitted too, lowering to native fusevm::Op::AwkAnd/AwkOr/AwkXor (Cranelift band/bor/bxor over a saturating f64→i64 conversion that matches awkrs's num_to_u64 for huge/NaN operands). sqrt/log stay interpreter-side because they emit a host stderr warning on a negative argument that a pure native op cannot reproduce; lshift/rshift/compl stay interpreter-side because they raise a fatal on a negative argument (a value-dependent trap), unlike the trap-free and/or/xor. 
Eager block-JIT compilation for offloaded loops: a `BEGIN`/`END` loop chunk offloaded via `run_fusevm_region` calls `fusevm::VM::run()` exactly once, so under fusevm's normal warm-up threshold (compile after N invocations) the block JIT would never compile it and the whole loop would run on fusevm's slower interpreter. Because `eligible_loop_prefix` only selects a region that contains a backward jump (a genuinely hot loop), the bridge forces the block-JIT threshold to 0 (compile on the first invocation) around that single `run()`, so the loop compiles to native code immediately. Measured: a 20M-iteration `s += sin(i)` loop dropped from ~12.8s (interpreter) to ~0.15s, and an `x = and(x, i) + 1` loop from ~14.0s to ~0.13s (~85–110×), both bit-for-bit identical to `AWKRS_FUSEVM=0`.

Division and modulo are offloaded and block-JIT-compiled: a chunk containing `/`, `%`, or compound `/=`/`%=` lowers to the native trapping ops `fusevm::Op::AwkDivJit`/`AwkModJit`, which the block JIT compiles with a guarded zero-divisor early-exit (`fcmp eq divisor, 0.0` → a trap libcall that raises the fatal `division by zero attempted` / `…in %'` for modulo, else `fdiv`/`fmod`). So a hot division loop runs as native code (a 20M-iteration division loop dropped from ~8.0s on the interpreter to ~0.12s, ~68×), while a zero divisor reached inside the compiled loop still raises the POSIX fatal instead of producing `inf`/`NaN` or hanging. The trap libcall is not a registered host-helper id, so `AwkDivJit`/`AwkModJit` chunks JIT in-process only and skip the on-disk cache (no schema impact). fusevm also still defines the interpreter-only trapping `Op::AwkDiv`/`AwkMod` (distinct from its shell-arithmetic `Op::Div`/`Op::Mod` used by `zshrs`/`strykelang`); awkrs emits the `*Jit` variants.
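The guarded zero-divisor early-exit amounts to one comparison before the arithmetic. Below is a Python model of that control flow only; `AwkFatal`, `awk_div`, and `awk_mod` are hypothetical names, and the exact text of the modulo message is left generic here because it is elided above.

```python
import math

class AwkFatal(Exception):
    """Stand-in for the fatal error raised by the trap path."""

def awk_div(a, b):
    if b == 0.0:                      # guard: compare the divisor with 0.0
        raise AwkFatal("division by zero attempted")
    return a / b                      # fast path: plain float division

def awk_mod(a, b):
    if b == 0.0:                      # same guard for modulo
        raise AwkFatal("division by zero attempted (modulo)")
    return math.fmod(a, b)            # fmod keeps the sign of the dividend

total = 0.0
for i in range(1, 5):
    total += awk_div(12.0, i)         # 12 + 6 + 4 + 3
print(total)

try:
    awk_div(1.0, 0.0)
except AwkFatal as err:
    print("fatal:", err)
```

The point of the guard is that a zero divisor is reported as an error rather than silently producing an infinity or NaN that then propagates through the loop.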
fusevm persistent JIT cache (~/.cache/fusevm-jit): awkrs builds fusevm with the jit-disk-cache feature (see Cargo.toml), so when the fusevm path is active fusevm's Cranelift tiers persist native-compiled code to ~/.cache/fusevm-jit and reload it across processes — the same on-disk JIT cache zshrs uses. Two tiers write there: the block JIT compiles a fully-eligible per-record numeric chunk to native code (*.blk.fjit), and the tracing JIT compiles hot in-chunk loop traces (*.trc.fjit). This is distinct from awkrs's own ~/.awkrs/scripts.rkyv bytecode cache (below): one caches fusevm-emitted machine code keyed by chunk op-hash (and slot-kind hash), the other caches awkrs bytecode keyed by source. Override the directory with FUSEVM_JIT_CACHE_DIR (off disables); cap it with FUSEVM_JIT_CACHE_MAX_BYTES. fusevm's block JIT round-trips awk's f64 slots through its SlotKind::Float bit-pattern model (a slot is an i64 holding the raw f64 bits; GetSlot/SetSlot bitcast through f64), so an x = int(x + c) accumulator now block-JIT-compiles natively and caches a *.blk.fjit artifact reused on the next run — verified producing the artifact and matching the bytecode interpreter bit-for-bit. Coverage is still partial — the tracing JIT does not yet hot-trace every top-tested for/while shape, and only the numeric chunk set below is eligible. So the cache engages for the chunks fusevm can compile and is inert for the rest. The offload is on by default (AWKRS_FUSEVM=0 disables it); with the block JIT's f64-slot support (below) and eager compilation of offloaded loop regions (above) it is a large win on JIT-compilable numeric loops — measured 14–110× over the bytecode interpreter for pure-arithmetic and builtin (sin/and/…) accumulator loops — while staying bit-for-bit faithful to the bytecode interpreter, and ineligible chunks run on the interpreter exactly as before. 
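The "i64 holding the raw f64 bits" slot model mentioned above can be illustrated with Python's `struct` module. This is only a sketch of the bit-pattern round trip; `f64_to_slot` and `slot_to_f64` are hypothetical helper names, not fusevm's `GetSlot`/`SetSlot` implementation.

```python
import math
import struct

def f64_to_slot(x):
    """Reinterpret a float's IEEE-754 bits as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]

def slot_to_f64(bits):
    """Reinterpret a signed 64-bit integer as the float with those bits."""
    return struct.unpack("<d", struct.pack("<q", bits))[0]

# An x = int(x + c) accumulator whose state lives in an integer slot.
slot = f64_to_slot(0.0)
for _ in range(3):
    x = slot_to_f64(slot)                          # read: bits -> f64
    slot = f64_to_slot(float(math.trunc(x + 2.5))) # write: f64 -> bits

print(hex(f64_to_slot(1.0)))
print(slot_to_f64(slot))
```

No numeric conversion happens in either direction, so every value, including negative zero and NaN payloads, survives the round trip unchanged.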
The default execution path for ineligible chunks is the bytecode interpreter with peephole-fused superinstructions; AWKRS_JIT=0 (or -s/--no-optimize) disables peephole optimization too.
fusevm bridge per-record caches (in-process): the persistent on-disk cache above caches compiled native code keyed by fusevm-op-hash; the bridge ALSO maintains in-process caches on Runtime that catch the upstream work the disk cache can't touch. (1) fuse_chunk_cache: HashMap<(chunk_ptr, bignum), Option<Arc<(fusevm::Chunk, Vec<u16>)>>> caches the built fusevm::Chunk per awkrs Chunk so build_numeric_chunk's eligibility check + 2-pass op→fusevm translation only runs once per (chunk, bignum), not per record. (2) fuse_last_chunk_key/fuse_last_chunk_value form a single-slot side-table that hoists the HashMap lookup out of the per-record path entirely for the common single-rule case (one tuple compare + Arc::clone, no HashMap touch). (3) fuse_vm_pool: fusevm::VMPool recycles fusevm::VM instances across records — VM::reset(chunk) preserves Vec capacities (stack, frames, slot_buf, globals) so subsequent records reuse the underlying allocations. (4) The cache value's Vec<u16> is the precomputed write-slot set — per-record writeback only walks Op::SetSlot targets instead of all N runtime slots. (5) Slot seeding goes direct from ctx.rt.slots into vm.set_slot with no intermediate Vec<f64> allocation. Cumulative win on the awkrs JIT path: 11% on count_gt_5m over 10M records, 6% on compound_pred (release best-of-3). The JIT path is still net-slower than the awkrs interpreter on tight one-op-body micro-benches — the remaining gap is fusevm's per-call VM setup (chunk move into the VM, JIT entry overhead) which would need fusevm-side API changes (Arc-aware VM::reset) to cut further.
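Items (1) and (2) above describe a map-backed cache fronted by a single-slot "last key" side-table. The Python sketch below models that lookup order; `ChunkCache` and its counters are hypothetical, and the key tuple merely imitates the `(chunk_ptr, bignum)` shape.

```python
class ChunkCache:
    """Map-backed cache with a one-entry fast path for the repeated key."""

    def __init__(self, build):
        self._build = build
        self._map = {}
        self._last_key = None
        self._last_value = None
        self.map_lookups = 0
        self.builds = 0

    def get(self, key):
        # Fast path: same key as the previous call, no map access at all.
        if self._last_key is not None and key == self._last_key:
            return self._last_value
        self.map_lookups += 1
        if key not in self._map:
            self.builds += 1          # expensive translation happens once
            self._map[key] = self._build(key)
        value = self._map[key]
        self._last_key, self._last_value = key, value
        return value

cache = ChunkCache(lambda key: ("translated", key))
for _ in range(1000):                 # one rule, many records
    cache.get(("rule0", False))
print(cache.builds, cache.map_lookups)
```

For a single-rule program every record after the first hits the fast path, so both the build count and the map lookup count stay at one.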
Eligible chunks (fusevm offload): pure-numeric bodies, meaning constants, arithmetic/comparisons, `++`/`--`, compound assignments (`+=`, `-=`, `*=`, `/=`, `%=`, `^=`), division/modulo (`/` and `%`, lowered to the native trapping `Op::AwkDivJit`/`AwkModJit` with a guarded zero-divisor early-exit), jumps and fused loop tests, scalar slot reads/writes whose values stay numeric, `int(x)` (native `Op::AwkInt`), the transcendentals `sin`/`cos`/`exp`/`atan2` (native `Op::AwkSin`/`AwkCos`/`AwkExp`/`AwkAtan2`, NaN→`+nan`), and the bitwise builtins `and`/`or`/`xor` (native `Op::AwkAnd`/`AwkOr`/`AwkXor`, saturating f64→i64). When such a chunk is a hot loop offloaded as a region, the block JIT is eager-compiled on the first `run()` (see above) so the native code is used immediately rather than after a warm-up. Chunks that touch strings, fields, arrays, regexes, `getline`, `print`, user calls, or other builtins (including `sqrt`/`log`, which warn on negative args, and `lshift`/`rshift`/`compl`, which raise a fatal on negative args) are ineligible. `-M`/`--bignum` disables the path entirely (MPFR values can't be represented as f64 slots). The offload is consulted by default (`AWKRS_FUSEVM=0` forces everything onto the bytecode interpreter).
Peephole fusion combines common sequences into single opcodes:
- `print $N` → `PrintFieldStdout` (zero-alloc field write)
- `s += $N` → `AddFieldToSlot` (in-place numeric parse)
- `i = i + 1` / `i++` / `++i` → `IncrSlot` (one numeric add, no stack traffic)
- `s += i` between slots → `AddSlotToSlot`
- `$1 "," $2` literal concat → `ConcatPoolStr`
- `NR++` HashMap-path → `IncrVar`
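A peephole pass of the kind listed above slides a small window over the instruction stream and replaces a known sequence with one fused opcode. The Python sketch below fuses an `i = i + 1` sequence into `IncrSlot`; apart from `IncrSlot`, the opcode names (`GetSlot`, `PushNum`, `Add`, `SetSlot`) are assumptions for the illustration, and a real pass would also have to remap jump targets after shrinking the stream.

```python
def fuse_increments(ops):
    """Rewrite GetSlot s; PushNum 1; Add; SetSlot s into a single IncrSlot s."""
    out, i = [], 0
    while i < len(ops):
        window = ops[i:i + 4]
        if (len(window) == 4
                and window[0][0] == "GetSlot"
                and window[1] == ("PushNum", 1.0)
                and window[2] == ("Add",)
                and window[3] == ("SetSlot", window[0][1])):
            out.append(("IncrSlot", window[0][1]))
            i += 4                    # consume the whole matched sequence
        else:
            out.append(ops[i])
            i += 1
    return out

program = [
    ("GetSlot", 0), ("PushNum", 1.0), ("Add",), ("SetSlot", 0),  # i = i + 1
    ("GetSlot", 1), ("PushNum", 1.0), ("Add",), ("SetSlot", 2),  # j = k + 1
]
for op in fuse_increments(program):
    print(op)
```

Only the first sequence is fused: the second reads slot 1 but writes slot 2, so it is not an in-place increment and is left untouched.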
Inline fast paths bypass VmCtx entirely for single-rule programs with one fused opcode ({ print $1 }, { s += $1 }). Memory-mapped files also recognize { gsub("lit", "repl"); print } with literal pattern: when the needle is absent, the loop writes each line from the mapped buffer with ORS and skips the VM.
Bytecode cache: -f script.awk invocations memoize the compiled CompiledProgram to ~/.awkrs/scripts.rkyv — an rkyv-archived shard with mmap + zero-copy ArchivedHashMap lookup on the read path (check_archived_root validation) and flock-serialized atomic-rename writes. Each entry's inner CompiledProgram blob is bincode (rkyv outer, bincode inner — same architecture as zshrs/stryke). Repeat runs skip lex/parse/compile entirely — only the matched entry's blob is decoded. Entries are invalidated on source-file mtime change or when the running awkrs binary is newer than the cached entry (any rebuild silently rebuilds the cache). Disable with AWKRS_CACHE=0. The cache only engages for the simple -f script.awk form — inline -e/--source, -E, -i/--include, -l/--load, --debug, --lint, --pretty-print, and --gen-pot skip the cache because they need the AST.
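The two invalidation rules stated above (source mtime changed, or binary newer than the cached entry) reduce to a pair of comparisons. The Python sketch below models that check; `CacheEntry`, its field names, and `is_valid` are hypothetical and say nothing about the actual rkyv schema.

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:
    source_mtime: float   # mtime of script.awk when it was compiled
    cached_at: float      # time the entry was written

def is_valid(entry, source_mtime_now, binary_mtime_now, cache_env="1"):
    """Return True when the cached bytecode may be reused."""
    if cache_env == "0":                        # AWKRS_CACHE=0 disables the cache
        return False
    if source_mtime_now != entry.source_mtime:  # script edited since compile
        return False
    if binary_mtime_now > entry.cached_at:      # binary rebuilt after caching
        return False
    return True

entry = CacheEntry(source_mtime=1000.0, cached_at=1500.0)
print(is_valid(entry, source_mtime_now=1000.0, binary_mtime_now=1200.0))
print(is_valid(entry, source_mtime_now=1001.0, binary_mtime_now=1200.0))
print(is_valid(entry, source_mtime_now=1000.0, binary_mtime_now=1600.0))
```

Only the first call reuses the entry; the second sees an edited script and the third sees a binary rebuilt after the entry was written.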
Based on a survey of the major public awk implementations (BWK awk, gawk, mawk, goawk, frawk, zawk), awkrs appears to be the first awk implementation to pair a bytecode VM with a persistent on-disk bytecode cache. frawk is the closest prior art on JIT — it has VM + Cranelift/LLVM JIT — but re-compiles on every invocation; its overview and README contain no mention of disk-persisted compiled artifacts. gawk's pm-gawk persists script-defined variables and functions across runs, not compiled bytecode — different feature. (awkrs's own fusevm/Cranelift offload is on by default for eligible numeric chunks and, with eager block-JIT compilation of offloaded loop regions, is now a large win on those loops — 14–110× over the bytecode interpreter — though it does not engage for the string/field-dominated programs most awk workloads run, so it is not claimed as the headline feature here; note the fusevm offload additionally carries its own separate persistent machine-code cache at ~/.cache/fusevm-jit, distinct from the bytecode cache claimed here.)
| Implementation | Bytecode VM | JIT | Persistent bytecode cache |
|---|---|---|---|
| BWK awk (one-true-awk) | ✗ tree-walker | ✗ | ✗ |
| gawk | ✓ | ✗ | ✗ (pm-gawk is for vars) |
| mawk | ✓ | ✗ | ✗ |
| goawk | ✓ | ✗ | ✗ |
| frawk | ✓ | ✓ Cranelift + LLVM | ✗ |
| zawk (frawk fork) | ✓ | ✓ Cranelift + LLVM | ✗ |
| awkrs | ✓ | ◐ fusevm/Cranelift (on by default; block + tracing tiers, ~/.cache/fusevm-jit) | ✓ |
Raw byte field extraction: print $N with default FS scans raw bytes in the mapped file buffer to find the Nth whitespace field, writes it to the output buffer, and appends Runtime::ors_bytes — no record copy, no UTF-8 validation.
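Finding the Nth whitespace field directly in a byte buffer needs only an index scan. The Python sketch below shows the scan; `nth_field` and `print_field` are hypothetical names, and unlike the mapped-buffer path described above, Python slicing does copy the field bytes.

```python
WHITESPACE = b" \t\n"

def nth_field(line, n):
    """Return the n-th (1-based) whitespace-separated field, or b'' if absent."""
    i, length, count = 0, len(line), 0
    while i < length:
        while i < length and line[i] in WHITESPACE:      # skip separators
            i += 1
        if i >= length:
            break
        start = i
        while i < length and line[i] not in WHITESPACE:  # scan the field
            i += 1
        count += 1
        if count == n:
            return line[start:i]
    return b""

def print_field(buf, n, ors=b"\n"):
    """Model of { print $N }: write field n of every line followed by ORS."""
    out = bytearray()
    for line in buf.splitlines():
        out += nth_field(line, n)
        out += ors
    return bytes(out)

print(print_field(b"a b c\n  d\te f\n", 2))
```

Nothing is decoded as text at any point, which is why the approach needs no UTF-8 validation.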
Other optimizations:
- Indexed slots: scalars get `u16` slot indices; reads/writes are flat-array indexing instead of `HashMap` lookups (specials like `NR`/`FS`/`OFS` and array names stay on the HashMap path).
- Zero-copy fields: fields stored as `(u32, u32)` byte ranges into the record string; owned `String`s only on `set_field`.
- Direct-to-buffer print: stdout writes go straight into a 64 KB `Vec<u8>` (flushed at file boundaries), with no per-record `String`, `format!()`, or stdout locking.
- Cached separators: `OFS`/`ORS` bytes cached on the runtime, updated only on assignment. The direct-to-buffer stdout `print` path uses the full `ofs_bytes`/`ors_bytes` slices (arbitrary length; not capped at 64 bytes).
- Byte-level input: `read_until(b'\n')` into a reusable `Vec<u8>` skips per-line UTF-8 validation.
- Regex cache: compiled `Regex` objects cached in a `HashMap<String, Regex>`.
- `sub`/`gsub`: when target is `$0`, applies the new record in one step. Literal needles reuse a cached `memmem::Finder`. Constant string operands pass via `Cow` (no per-call alloc).
- `parse_number`: fast-paths plain decimal integer field text before falling back to `str::parse::<f64>()`.
- Slurped input: newline scanning uses `memchr`.
- Parallel: compiled program shared via `Arc` across rayon workers (zero-copy).
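The `parse_number` item in the list above combines an integer fast path with a general fallback. A Python model of that shape is shown below; it is a sketch under assumptions (the 15-digit cutoff and the prefix regex are choices made for this example, not taken from awkrs), and it uses awk's longest-prefix rule for the fallback.

```python
import re

# Longest numeric prefix: optional sign, digits with optional fraction,
# optional exponent. Anything after the prefix is ignored.
_PREFIX = re.compile(rb"\s*[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")

def parse_number(text):
    """Convert field bytes to a float, awk style."""
    if text.isdigit() and len(text) <= 15:
        # Fast path: plain decimal digits, exactly representable as f64.
        n = 0
        for byte in text:
            n = n * 10 + (byte - 48)
        return float(n)
    match = _PREFIX.match(text)
    return float(match.group(0)) if match else 0.0

for sample in (b"42", b"3.14abc", b"-7", b"abc", b"1e3x"):
    print(sample, "->", parse_number(sample))
```

Most numeric columns in real input are plain integers, which is why a digits-only check before the general parser pays off.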
[0x06] BENCHMARKS // COMBAT METRICS (vs awk / gawk / mawk)
┌──────────────────────────────────────────────────────────────┐
│ HARDWARE: APPLE M5 MAX      OS: macOS      ARCH: arm64       │
└──────────────────────────────────────────────────────────────┘
Measured with hyperfine. BSD awk (/usr/bin/awk), GNU gawk 5.4.0, mawk 1.3.4, awkrs (see Cargo.toml for current version). Relative = mean ÷ fastest mean in that table. awkrs has two rows: default (JIT attempted) vs AWKRS_JIT=0 (bytecode only). Each table is one hyperfine invocation across all five commands on the same 1 M-line input, generated 2026-04-10 UTC by ./scripts/benchmark-vs-awk.sh and copied verbatim from benchmarks/benchmark-results.md. For the awkrs-only JIT-vs-bytecode A/B see benchmarks/benchmark-readme-jit.md.
Caveat (2026-06-01): the `awkrs (JIT)` rows below were generated against awkrs's former in-tree Cranelift module (`src/jit.rs`), which has since been removed; JIT now means the `fusevm`/Cranelift offload (on by default; `AWKRS_FUSEVM=0` disables it), which does not engage for these string/field programs anyway (they are ineligible). The default path for such programs is the fused-superinstruction bytecode interpreter, effectively the `awkrs (bytecode)` row. Re-run `./scripts/benchmark-vs-awk.sh` to regenerate.
1. Throughput: { print $1 } over 1 M lines
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 195.0 ms | 179.8 ms | 221.6 ms | 12.43× |
| gawk | 100.8 ms | 92.8 ms | 115.8 ms | 6.42× |
| mawk | 66.2 ms | 61.9 ms | 78.4 ms | 4.22× |
| awkrs (JIT) | 15.7 ms | 13.3 ms | 19.6 ms | 1.00× |
| awkrs (bytecode) | 16.1 ms | 13.1 ms | 20.2 ms | 1.03× |
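The Relative column is each mean divided by the fastest mean in the same table. The Python snippet below recomputes it from the rounded means printed in table 1; `relative` is a hypothetical helper, and because the published figures were presumably derived from unrounded means, the last digit can differ slightly (the BSD awk row comes out near 12.42 here versus 12.43 in the table).

```python
means_ms = {
    "BSD awk": 195.0,
    "gawk": 100.8,
    "mawk": 66.2,
    "awkrs (JIT)": 15.7,
    "awkrs (bytecode)": 16.1,
}

def relative(means):
    """Relative = mean / fastest mean in the table."""
    fastest = min(means.values())
    return {name: mean / fastest for name, mean in means.items()}

rel = relative(means_ms)
for name, ratio in rel.items():
    print(f"{name:18s} {ratio:6.2f}x")
```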
2. CPU-bound BEGIN (no input)
BEGIN { s = 0; for (i = 1; i < 400001; i = i + 1) s += i; print s }
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 15.8 ms | 14.0 ms | 18.6 ms | 1.71× |
| gawk | 20.7 ms | 18.8 ms | 22.9 ms | 2.24× |
| mawk | 9.7 ms | 8.3 ms | 11.4 ms | 1.06× |
| awkrs (JIT) | 9.2 ms | 8.4 ms | 12.0 ms | 1.00× |
| awkrs (bytecode) | 9.6 ms | 8.2 ms | 12.0 ms | 1.04× |
3. Sum first column ({ s += $1 } END { print s }, 1 M lines)
Cross-record state is not parallel-safe, so awkrs stays single-threaded here. On regular-file input, awkrs uses a raw byte path: parses the Nth whitespace field directly from the mmap'd buffer.
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 158.5 ms | 147.0 ms | 172.7 ms | 12.27× |
| gawk | 62.9 ms | 58.4 ms | 68.9 ms | 4.87× |
| mawk | 37.5 ms | 33.7 ms | 39.9 ms | 2.90× |
| awkrs (JIT) | 13.0 ms | 11.9 ms | 15.4 ms | 1.01× |
| awkrs (bytecode) | 12.9 ms | 11.5 ms | 16.1 ms | 1.00× |
4. Multi-field print ({ print $1, $3, $5 }, 1 M lines, 5 fields/line)
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 647.6 ms | 623.5 ms | 686.3 ms | 11.60× |
| gawk | 266.1 ms | 257.4 ms | 301.8 ms | 4.77× |
| mawk | 156.6 ms | 149.8 ms | 170.7 ms | 2.81× |
| awkrs (JIT) | 56.4 ms | 53.1 ms | 61.8 ms | 1.01× |
| awkrs (bytecode) | 55.8 ms | 53.4 ms | 61.6 ms | 1.00× |
5. Regex filter (/alpha/ { c += 1 } END { print c }, 1 M lines, no matches)
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 191.8 ms | 180.1 ms | 208.9 ms | 17.31× |
| gawk | 351.4 ms | 342.7 ms | 363.3 ms | 31.72× |
| mawk | 19.3 ms | 17.5 ms | 21.8 ms | 1.74× |
| awkrs (JIT) | 11.1 ms | 9.5 ms | 13.5 ms | 1.00× |
| awkrs (bytecode) | 11.1 ms | 9.5 ms | 14.6 ms | 1.00× |
6. Associative array ({ a[$5] += 1 } END { for (k in a) print k, a[k] }, 1 M lines)
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 826.2 ms | 792.2 ms | 896.0 ms | 2.43× |
| gawk | 342.4 ms | 330.6 ms | 362.5 ms | 1.01× |
| mawk | 610.0 ms | 588.9 ms | 648.7 ms | 1.79× |
| awkrs (JIT) | 340.0 ms | 324.2 ms | 377.7 ms | 1.00× |
| awkrs (bytecode) | 343.7 ms | 323.5 ms | 356.7 ms | 1.01× |
7. Conditional field (NR % 2 == 0 { print $2 }, 1 M lines, 2 fields/line)
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 289.1 ms | 263.1 ms | 321.1 ms | 9.58× |
| gawk | 116.1 ms | 111.0 ms | 124.4 ms | 3.85× |
| mawk | 71.1 ms | 66.9 ms | 83.6 ms | 2.36× |
| awkrs (JIT) | 30.2 ms | 28.1 ms | 34.0 ms | 1.00× |
| awkrs (bytecode) | 30.7 ms | 28.0 ms | 35.5 ms | 1.02× |
8. Field computation ({ sum += $1 * $2 } END { print sum }, 1 M lines, 2 fields/line)
On regular-file input with default FS, awkrs extracts both fields in a single byte scan and parses them as numbers directly from the mmap'd buffer.
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 261.8 ms | 251.4 ms | 280.8 ms | 13.96× |
| gawk | 100.5 ms | 95.3 ms | 109.5 ms | 5.36× |
| mawk | 57.7 ms | 54.5 ms | 61.1 ms | 3.08× |
| awkrs (JIT) | 19.0 ms | 17.6 ms | 23.0 ms | 1.01× |
| awkrs (bytecode) | 18.8 ms | 17.5 ms | 22.8 ms | 1.00× |
9. String concat print ({ print $3 "-" $5 }, 1 M lines, 5 fields/line)
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 640.8 ms | 611.9 ms | 689.3 ms | 12.68× |
| gawk | 182.2 ms | 168.1 ms | 197.2 ms | 3.61× |
| mawk | 121.0 ms | 113.6 ms | 128.1 ms | 2.39× |
| awkrs (JIT) | 51.0 ms | 49.2 ms | 53.8 ms | 1.01× |
| awkrs (bytecode) | 50.5 ms | 48.8 ms | 54.8 ms | 1.00× |
10. gsub ({ gsub("alpha", "ALPHA"); print }, 1 M lines, no matches)
Lines do not contain alpha, so this measures no-match gsub plus print. On regular-file input, awkrs uses a slurp inline path: byte memmem scan + print with no VM or per-line set_field_sep_split when the literal is absent.
| Command | Mean | Min | Max | Relative |
|---|---|---|---|---|
| BSD awk | 291.5 ms | 282.3 ms | 300.4 ms | 21.15× |
| gawk | 436.3 ms | 425.7 ms | 459.3 ms | 31.66× |
| mawk | 74.3 ms | 68.8 ms | 84.2 ms | 5.39× |
| awkrs (JIT) | 13.8 ms | 12.8 ms | 16.2 ms | 1.00× |
| awkrs (bytecode) | 13.9 ms | 12.7 ms | 17.6 ms | 1.01× |
AWKRS_BENCH_LINES=5000000
Demo scripts
Quick tours over the feature surface — runnable against the debug build (cargo build); each script auto-builds if the binary is missing.
# gensub, match() capture, pipes, getline, parallel, JIT toggle)
# ps → top RSS, /etc/passwd → shell histogram
LINES=200000 # JIT default vs AWKRS_JIT=0 bytecode
NO_COLOR=1 strips ANSI from the demo output. The applied demo writes its inputs to $TMPDIR and cleans up on exit.
Deep examples (examples/*.awk)
Substantive standalone awk programs that exercise recursion, multidim arrays via SUBSEP, PROCINFO["sorted_in"], three-arg match(), asort, and bit ops. Each <name>.awk ships with <name>.in (stdin input). Every example is byte-for-byte verified against gawk by the parity job in CI (bash parity/run_parity.sh gawk), so they double as gawk-extension regression tests.
| File | What it shows |
|---|---|
| bst.awk | BST insert + inorder / preorder / postorder traversals via recursion |
| heap_sort.awk | Min-heap push / pop → heapsort |
| trie.awk | Trie membership + prefix-count using SUBSEP two-level keys |
| levenshtein.awk | O(la·lb) edit-distance DP table held in a SUBSEP 2D array |
| calc_rd.awk | Recursive-descent arithmetic parser (+ - * / % ^ unary ( ), right-assoc ^) |
| topo_sort.awk | Kahn's algorithm topological sort + cycle detection |
| brainfuck.awk | Brainfuck interpreter (precomputed bracket map, modulo-256 cells) |
| rpn.awk | Postfix calculator (dup / swap / drop / neg + arith) on an explicit stack |
| json_pretty.awk | JSON tokeniser → 2-space indented pretty-printer |
| hexdump.awk | xxd-style hex + ASCII dump (16-byte rows with offset, hex, gutter, ascii) |
| csv_pivot.awk | CSV → per-group min / max / mean / total aggregation |
| graph_bfs.awk | Undirected BFS from a source + path reconstruction |
| markov.awk | Bigram model → top-3 continuations + deterministic 12-step walk |
| sql_like.awk | Mini-SQL on CSV: SELECT … WHERE … GROUP BY … SUM/AVG/COUNT … ORDER BY … |
| sudoku.awk | 9×9 Sudoku solver via recursive backtracking (row/col/box witness sets) |
| regex_engine.awk | Recursive regex matcher (. * + ? ^ $ [...] \) written in awk |
| dijkstra.awk | Single-source shortest paths with a binary min-heap PQ |
| kruskal.awk | MST via union-find (path halving + union-by-rank) |
| maze_bfs.awk | BFS shortest path through an ASCII maze; path overlaid with * |
| diff_lcs.awk | LCS-based unified diff with recursive traceback |
| n_queens.awk | N-queens backtracking with col / diag1 / diag2 constant-time witnesses |
| kmp.awk | Knuth-Morris-Pratt substring search (failure function + scan) |
| intervals.awk | Merge overlapping closed intervals; report total covered length |
| roman.awk | Roman numerals ↔ integers, subtractive form, range 1..3999 |
| knapsack.awk | 0/1 knapsack DP table + traceback to recover chosen items |
| prime_sieve.awk | Sieve of Eratosthenes up to N + ten-per-row pretty printing |
| floyd_warshall.awk | All-pairs shortest paths (negative weights allowed) |
| bellman_ford.awk | Single-source shortest paths + negative-cycle detection |
| scc_tarjan.awk | Tarjan's strongly connected components (recursion + lowlink) |
| conway.awk | Conway's Game of Life — fixed-grid evolution for N generations |
| a_star.awk | A* on an ASCII grid (Manhattan heuristic, tuple-keyed min-heap PQ) |
| base64.awk | RFC-4648 base64 encode + decode (no external tools) |
| base_conv.awk | Integer base conversion 2..36 in either direction |
| rle.awk | Run-length encode + decode (whitespace preserved) |
| vigenere.awk | Vigenère cipher (encrypt + decrypt; case preserved) |
| bigint_mul.awk | Arbitrary-precision multiplication via schoolbook on digit arrays |
| lru_cache.awk | LRU cache: O(1) get + put via doubly-linked list + hash map |
| segment_tree.awk | Iterative segment tree (point update + range-sum query) |
| fenwick.awk | Fenwick / Binary Indexed Tree (prefix + range sums) |
| convex_hull.awk | Convex hull via Andrew's monotone chain + shoelace area |
| lis.awk | Longest increasing subsequence in O(n log n) with traceback |
| huffman.awk | Huffman coding — build prefix tree, encode + round-trip decode |
| manacher.awk | Manacher's longest palindromic substring in O(n) |
| subset_sum.awk | Subset-sum DP + reconstruction of one valid subset |
| permutations.awk | Heap's algorithm — enumerate all n! permutations |
| tsp_dp.awk | Held-Karp bitmask DP for travelling salesman (≤15 cities) |
| anagrams.awk | Group anagrams by sorted-letter signature |
| rule30.awk | Wolfram Rule 30 elementary 1D cellular automaton |
| aho_corasick.awk | Aho-Corasick multi-pattern search (goto/fail/output trie) |
| z_function.awk | Z-array — linear-time prefix-match table + pattern search |
| rabin_karp.awk | Rolling-hash substring search (Rabin-Karp) |
| shunting_yard.awk | Dijkstra's shunting-yard infix → postfix → evaluate |
| modexp.awk | Modular exponentiation + deterministic Miller-Rabin primality |
| gcd_extended.awk | Extended Euclidean algorithm + modular inverse |
| mandelbrot.awk | ASCII Mandelbrot escape-time render |
| ini_parser.awk | INI config parser with sections / comments / globals |
| url_parser.awk | URL decomposer (scheme / user / pass / host / port / path / query / fragment) |
| tictactoe.awk | Minimax tic-tac-toe solver — best move + outcome from any position |
| coin_change.awk | Min-coins DP + reconstruction of one optimal combination |
| prim_mst.awk | Prim's MST via lazy linear frontier scan |
| boyer_moore.awk | Boyer-Moore substring search (bad-character heuristic) |
| suffix_array.awk | Suffix array + LCP (lex-sort-based) for each input line |
| avl_tree.awk | AVL self-balancing BST with insert, inorder, height, balance-check |
| quickselect.awk | Kth smallest element via Hoare partitioning |
| horner.awk | Horner's method: evaluate, derivative, synthetic division |
| pollard_rho.awk | Pollard's rho integer factorization + Miller-Rabin |
| lzw.awk | LZW compression encode + decode (256-byte dictionary start) |
| markdown_basic.awk | Markdown → HTML for a subset (headings, lists, code, links) |
| date_calc.awk | Day-of-week (Zeller), calendar generator, date difference |
| gauss_elim.awk | Gaussian elimination with partial pivoting (Ax = b) |
| twenty48.awk | 2048 board: apply L/R/U/D moves, merge tiles, track score |
| email_extract.awk | Find emails in free text + sorted unique tally |
Run any example directly:
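A minimal sketch of that invocation, assuming the pairing described above (`<name>.awk` loaded with `-f`, `<name>.in` fed on stdin). The `run_example` helper and the `AWK` variable are illustrative names, not part of the repository, and the tiny `sum` pair is a hypothetical stand-in so the snippet also runs outside a checkout.

```sh
# Assumption: AWK points at the binary to run (e.g. AWK=awkrs in a checkout);
# it falls back to the system awk so the sketch runs anywhere.
AWK="${AWK:-awk}"

# Hypothetical helper: run <stem>.awk with <stem>.in on stdin.
run_example() {
  "$AWK" -f "$1.awk" < "$1.in"
}

# Hypothetical stand-in pair (not one of the shipped examples).
demo_dir="$(mktemp -d)"
printf '%s\n' '{ sum += $1 } END { print sum }' > "$demo_dir/sum.awk"
printf '%s\n' 1 2 3 > "$demo_dir/sum.in"

example_out="$(run_example "$demo_dir/sum")"
echo "$example_out"
rm -rf "$demo_dir"
```

In a checkout, with `AWK` set to the awkrs binary, the same helper would be called as `run_example examples/bst`.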
[0x07] BUILD // COMPILE THE PAYLOAD
awkrs --help / -h prints a cyberpunk HUD (ASCII banner, status box, taglines, footer) in the style of MenkeTechnologies tp -h. ANSI colors apply when stdout is a TTY; set NO_COLOR to force plain text.
Regenerate the help screenshot after UI changes: ./scripts/gen-help-screenshot.sh (needs termshot on PATH and a prior cargo build). The capture runs on a PTY with NO_COLOR unset and renders at 256 columns.
[0x08] TEST // INTEGRITY VERIFICATION
CI runs on pushes and pull requests to main via GitHub Actions: one Ubuntu lint job (cargo fmt --check, cargo clippy -D warnings, cargo doc with RUSTDOCFLAGS=-D warnings) plus a build/test matrix on Ubuntu and macOS.
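The three lint checks can be approximated locally before pushing. This is a sketch, not the workflow file itself: `ci_lint` is a hypothetical helper name, and the exact flags CI passes (extra targets, features) are not stated here, so only the checks named above are included. Clippy lint flags go after `--`.

```sh
# Hypothetical helper mirroring the lint job's three checks;
# not a script shipped in the repository.
ci_lint() {
  cargo fmt --check &&                     # formatting must already be clean
  cargo clippy -- -D warnings &&           # treat clippy warnings as errors
  RUSTDOCFLAGS="-D warnings" cargo doc     # treat rustdoc warnings as errors
}
```

Calling `ci_lint` from the repository root runs the checks in order and stops at the first failure.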
Coverage spans library unit tests for every module (lexer, parser, format, builtins, interp, vm, jit, compiler, runtime, locale, cli, cyber_help) and integration suites under tests/ that exercise the gawk-style additions, the slurped-input path, parallel record behavior, and the full CLI surface. Cross-feature combinations (CSV + ENDFILE, paragraph RS="" + getline, FIELDWIDTHS + NF reassignment, ...) live in tests/cross_feature_integration.rs.
[0x09] DOCUMENTATION // RENDERED HTML + MARKDOWN
docs/ is published to GitHub Pages on every push to main and is the authoritative source for the rendered reference + engineering report.
| Doc | Source | Live URL |
|---|---|---|
| User reference (quickstart, builtins, variables, examples, cache + parallel notes) | docs/index.html | https://menketechnologies.github.io/awkrs/ |
| Engineering report (architecture, module table, perf stack, divergence ledger, competitive matrix) | docs/report.html | https://menketechnologies.github.io/awkrs/report.html |
| Compatibility matrix vs BSD awk / mawk / gawk | docs/COMPATIBILITY.md | renders on GitHub |
| Benchmarks vs BSD awk / mawk / gawk (hyperfine, 1 M lines) | benchmarks/benchmark-results.md | renders on GitHub |
| JIT-on vs JIT-off A/B (awkrs-only) | benchmarks/benchmark-readme-jit.md | renders on GitHub |
| Rust API docs (autogenerated) | cargo doc --open | https://docs.rs/awkrs |
The HUD-themed HTML docs (docs/index.html, docs/report.html) share hud-static.css, hud-theme.js, and tutorial.css — open them locally via file:// or browse the GitHub Pages URL above.
[0xFF] LICENSE
┌──────────────────────────────────────────────────────────────┐
│  MIT LICENSE // UNAUTHORIZED REPRODUCTION WILL BE MET        │
│  WITH FULL ICE                                               │
└──────────────────────────────────────────────────────────────┘
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░░ >>> JACK IN. MATCH THE PATTERN. EXECUTE THE ACTION. <<< ░░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░