rsaeb
rsaeb is a Rust 2024 no_std + alloc, byte-oriented interpreter for A=B
ordered rewrite programs.
A=B: https://store.steampowered.com/app/1720850/AB/
Unofficial Project Notice
This project is an unofficial, independently developed interpreter library. It is not affiliated with, endorsed by, or maintained by Artless Games or the original A=B author.
A=B's compact lhs=rhs ordered rewrite system is an unusually elegant
programming-puzzle idea. This crate exists because that design is worth
studying, testing, and reimplementing. If this interpreter interests you,
please support the original game.
Design boundary
The important split is deliberately strict: program source and runtime input are different byte domains.
- Program code is compact printable ASCII syntax.
- ASCII whitespace in program code is ignored before parsing.
#starts a comment for the rest of the source line.- Comments may contain non-ASCII or non-UTF-8 bytes.
- Executable code outside comments must be ASCII.
- Program payloads cannot contain whitespace,
=,#,(,), non-ASCII bytes, or ASCII control bytes. - Runtime input is ASCII data and may contain spaces, ASCII control bytes, and reserved syntax bytes.
- Normal rewrites preserve runtime-only bytes that program code cannot construct or match.
(return)stops execution and replaces the whole output with its return payload.
The crate intentionally contains no filesystem, process, stdout/stderr, argument parsing, environment access, or lossy display boundary. Those belong in a CLI or host application, not in the interpreter core.
no_std + alloc boundary
The library crate is #![no_std] and uses alloc for owned buffers such as
parsed rules, runtime input state, per-run (once) state, run results, and
trace snapshots. It requires an allocator, but not std.
Allocation is explicit and fallible. Parser/runtime paths reserve explicitly and
report AllocationError instead of relying on accidental Vec growth. Runtime
expansion is also budgeted through RunLimits; the runtime checks size limits
before allocating oversized states or return outputs. Trace snapshot
materialization is budgeted separately through an explicit
TraceSnapshotByteLimit. Owned public values that contain byte buffers
intentionally do not implement Clone; copying bytes is an explicit
materialization step, not a hidden infallible API.
A downstream std application can use the library normally. A downstream
no_std application must provide an allocator before calling APIs that allocate.
Basic usage
One-shot parse and run from UTF-8 source:
use ;
One-shot parse and run from raw source bytes:
use ;
Reusable parsed program with validated runtime input:
use ;
(once) consumption is runtime-local. Reusing Program is safe because parsed
programs are immutable; each run owns its own compact (once) state table. Only
(once) rules get state entries, so ordinary rules do not inflate runtime state
just by existing.
Parser behavior
The parser is byte-oriented. Comments are removed before executable-code validation, so comments can contain arbitrary non-ASCII bytes:
use ;
ASCII whitespace in program code is ignored, but spaces in runtime input are data:
use ;
The reserved syntax bytes =, #, (, and ) cannot appear in executable
payloads. They may appear in runtime input and will be preserved unless a rule
rewrites around them or (return) replaces the whole output.
Parse error columns are one-based byte positions in the original source line before whitespace compaction. Diagnostics point at the user's source text, not at the internal compacted representation.
Program format
A program source is a byte sequence containing one rewrite rule per non-empty code line:
lhs=rhs
Each line is parsed in this order:
#starts a comment. Everything from#to the end of the line is ignored.- Non-ASCII bytes are rejected in the remaining code part.
- ASCII whitespace in the code part is removed completely.
- Remaining non-whitespace code bytes must be printable ASCII.
- Empty compact code is ignored.
- Non-empty compact code must contain exactly one
=. - The left side and right side are parsed as compact rule syntax.
Internally, parser/runtime phases stay separate instead of passing a naked
Vec<u8> through every stage:
raw line bytes
-> RawSourceLine
-> CodeLine # comment removed, executable code ASCII validated
-> CompactCodeLine # whitespace removed, SourceColumn retained
-> NonEmptyCompactCodeLine # empty compact lines cannot enter rule parsing
-> RuleSyntaxLine # exactly one '=' has been proven
-> LeftSyntax / RightSyntax
-> ProgramByte # bytes that program code may construct and match
runtime input bytes
-> AsciiByte # runtime input domain validation
-> RuntimeByte # private Editable(ProgramByte) or Opaque(AsciiByte)
Program payloads are stored as ProgramByte, not raw u8. Runtime state is
stored as RuntimeByte: payload-compatible input and rule output become
editable program bytes, while whitespace, control bytes, and reserved syntax
bytes from input become opaque ASCII bytes. Ordinary rules match only editable bytes.
Opaque input bytes are preserved by surrounding rewrites but cannot be directly
matched, created, or deleted by program payloads. Runtime state is materialized
only at output boundaries, and the owned public values keep their purpose in the
type: stable states use RuntimeStateSnapshot, while (return) payloads use
ReturnOutput. During execution, the active state and the rewrite scratch
buffer are distinct typed buffers; the runtime swaps them only after a
successful continuation step, so a partially built rewrite cannot become the
committed state.
Examples:
a=b# this is parsed as a=b
#a=b this whole line is a comment
a b = b b # this is parsed as ab=bb
Non-ASCII text is allowed only in comments:
a=b# 日本語コメントは許可
This is invalid because the non-ASCII byte is in code:
a=あ
ASCII control bytes are invalid in executable code, except for ASCII whitespace that is removed during compaction. Runtime input is separate and may contain ASCII control bytes as data.
Reserved characters
The following characters are reserved in program code:
= # ( )
Their meanings are fixed:
=separates the left side from the right side.#starts a comment.(and)are only allowed as part of supported modifier/action tokens.
Internally, payload construction rejects all reserved syntax bytes at the
program-payload boundary. = and # are normally handled before payload
parsing, but they still cannot become payload data even if a future parser path
tries to feed them there. The implementation does not rely on “this should never
arrive here” as a safety boundary.
A second = in compact code is a parse error:
a=b=c
A second = inside a comment is ignored:
a=b#=c
Reserved syntax where payload data is expected is always a parse error:
a=b(
a=b)
a=b()
a=()
a=b(start)
a=(once)b
a(once)=b
Because whitespace is removed from program code, spaces cannot be represented as
rule data. Because =, #, (, and ) are reserved, program payloads also
refuse them as rule data.
Runtime input is different. Input bytes are runtime data, not program code. Input must be ASCII, but it may contain whitespace, ASCII control bytes, and reserved characters. Ordinary rewrite actions cannot match, create, or delete those bytes directly. The bytes themselves remain runtime data, although nearby editable bytes may be inserted, removed, or moved.
Example:
program: a=b
input: a=()#c
output: b=()#c
Rules also cannot match across preserved runtime-only bytes:
program: ab=bb
input: a bc
output: a bc
(return) is intentionally different from ordinary rewrite actions. It stops
execution and replaces the final output with the return payload, so runtime-only
input bytes are not preserved after a matching return rule:
program: a=(return)x
input: a=()#c
output: x
Left-side modifiers
The left side may start with one repeat modifier and one anchor modifier:
(once): the rule may be used at most once per runtime execution.(start): the rule only matches at the start of the current state.(end): the rule only matches at the end of the current state.
Supported modifier order is (once) first, then an optional anchor. Duplicated
or unsupported left-side modifier order is a parse error.
Examples:
a=b
(once)a=b
(start)a=b
(end)a=b
(once)(start)a=b
Because code whitespace is ignored, this is also valid and equivalent to
(once)(start)a=b:
( once ) ( start ) a = b
Right-side actions
The right side selects the action for a matching rule:
text: replace the matched left side withtext.(start)text: remove the match and inserttextat the start of the state.(end)text: remove the match and appendtextto the end of the state.(return)text: stop execution immediately and outputtext, discarding the current runtime state.
The action payload is still program data, so it cannot contain whitespace,
reserved characters, non-ASCII bytes, or ASCII control bytes. (return) can
therefore output only program-representable bytes, even if the discarded runtime
state contained spaces or reserved characters from the original input.
Examples:
a=b
x=(start)y
x=(end)y
x=(return)y
Empty sides
The left side and right side may be empty.
An empty right side deletes the matched left side:
a=
An empty left side matches an empty byte sequence. For unanchored rules and
(start) rules, it matches at the start of the current state:
(once)=x
With input ab, this inserts x at the start and produces xab.
For (end) rules, an empty left side matches at the end of the current state:
(once)(end)=x
With input ab, this inserts x at the end and produces abx.
An unanchored empty-left rule without (once), (return), or some later rule
that makes execution stop can rewrite forever until the step limit is reached.
That is legal syntax; execution remains governed by RunLimits.
Execution semantics
Execution is ordered and single-step.
On each step, the runtime scans rules from top to bottom and applies the first rule that matches the current state. For an unanchored non-empty left side, the leftmost match in the current state is used. After one rewrite step, scanning restarts from the first rule.
Example:
program:
aa=x
a=y
input:
aaaa
output:
xx
The first rule is preferred over the second rule, and each application rewrites
the leftmost matching aa.
Resource limits
RunLimits is the execution contract. Step count alone is not enough for a
rewrite system because a short run can still expand a state aggressively.
use ;
Execution may succeed exactly at the step limit. The step limit becomes an error only when another rule would still apply after the configured number of steps.
use ;
Rule inspection
Parsed rule inspection is structural. RuleView exposes repeat policy, anchor,
left payload, and right action directly. There is no stored compact_source
blob: canonical source is generated from the structured rule when requested.
This removes the second source of truth.
use ;
Tracing
Tracing has two layers.
Borrowed tracing is the allocation-free primitive. Events borrow the runtime state or return payload only for the callback invocation:
use ;
Trace snapshotting materializes state/output bytes into typed owned snapshots
under an explicit TraceSnapshotByteLimit. Step events still borrow RuleView
from the parsed Program, so retained trace snapshot events cannot outlive
that program:
use ;
Fallible borrowed sinks use try_run_with_borrowed_trace, which separates
runtime errors and trace-sink errors with TracedRunError. Snapshot tracing has
one more failure domain: run_with_trace_snapshots returns
TraceSnapshotRunError, and try_run_with_trace_snapshots returns
FallibleTraceSnapshotRunError so runtime failures, snapshot materialization
failures, and callback failures cannot collapse into one variant.
Error model
The library error model is intentionally split:
use ;
Allocation failures are structured:
use ;
State length arithmetic overflow is separate from allocation failure and is
reported as RunError::StateSize. Configured byte budgets and step budgets are
reported as RunError::Limit(LimitError::...). Step-limit errors report the
last state length, not the state bytes, so reporting the step limit cannot turn
into an allocation failure. Trace snapshot byte limits are reported through
TraceSnapshotError, not RunError::Limit, because snapshot materialization is
outside runtime execution. Use borrowed tracing when the last state bytes are
needed for diagnostics.
Filesystem failures are not part of the library error model. External I/O must
be handled before bytes enter Program::parse_bytes, Program::parse_str,
run_bytes, or run_str.
Public API overview
The generated rustdoc is the complete API reference. The main exported surface is grouped as follows.
Constants:
DEFAULT_MAX_STEPSDEFAULT_MAX_STATE_LENDEFAULT_MAX_RETURN_LENDEFAULT_MAX_TRACE_SNAPSHOT_LEN
Program construction and execution:
run_bytes(source, input, limits)run_str(source, input, limits)ProgramProgram::parse_bytes(source)Program::parse_str(source)Program::rule_count()Program::once_rule_count()Program::rules()Program::run(input, limits)Program::run_with_borrowed_trace(input, limits, callback)Program::try_run_with_borrowed_trace(input, limits, callback)Program::run_with_trace_snapshots(input, run_limits, trace_snapshot_limit, callback)Program::try_run_with_trace_snapshots(input, run_limits, trace_snapshot_limit, callback)
Runtime configuration and result:
RuntimeInput(parse(input),byte_count(),is_empty(),bytes())PayloadByteCount(new(value),get(),is_zero())RuntimeStateByteCount(new(value),get(),is_zero())ReturnOutputByteCount(new(value),get(),is_zero())TraceSnapshotByteCount(new(value),get(),is_zero())RunLimitsStepLimit(new(value),get())StateByteLimit(new(value),get())ReturnByteLimit(new(value),get())TraceSnapshotByteLimit(new(value),get())RunLimits::new(step_limit)RunLimits::bounded(step_limit, state_byte_limit, return_byte_limit)RunLimits::step_limit()RunLimits::state_byte_limit()RunLimits::return_byte_limit()RunLimits::with_step_limit(step_limit)RunLimits::with_state_byte_limit(state_byte_limit)RunLimits::with_return_byte_limit(return_byte_limit)RunResultRunResult::outcome()RunResult::into_outcome()RunResult::steps()RunOutcomeRuntimeStateSnapshot(as_bytes(),into_vec(),byte_count(),is_empty())ReturnOutput(as_bytes(),into_vec(),byte_count(),is_empty())StepCount(get())
Rule data:
RulePosition(number())RuleNumber(get())RuleCount(new(value),get())RuleRepeatRuleAnchorPayloadView<'program>(byte_count(),is_empty(),bytes(),eq_bytes(expected),to_vec())RuleActionView<'program>RuleView<'program>RuleView::position()RuleView::line_number()RuleView::repeat()RuleView::anchor()RuleView::lhs()RuleView::action()RuleView::canonical_source()
Tracing:
RuntimeStateView<'run>(is_empty(),bytes(),byte_count(),to_vec())BorrowedTraceEvent<'program, 'run>BorrowedTraceEffect<'program, 'run>TraceSnapshotEvent<'program>TraceSnapshotEffectBorrowedTraceEvent::byte_count()BorrowedTraceEvent::is_empty()BorrowedTraceEvent::to_snapshot(trace_snapshot_limit)BorrowedTraceEffect::byte_count()BorrowedTraceEffect::is_empty()TracedRunError<E>TraceSnapshotErrorTraceSnapshotRunErrorFallibleTraceSnapshotRunError<E>
Errors:
AebErrorParseErrorParseError::location()ParseError::line()ParseError::kind()ParseErrorKindParseErrorLocationSourceLineNumber(get())SourceColumn(get())SourcePosition(line(),column())NonAsciiCodeByte(get())NonPrintableCodeByte(get())NonAsciiInputByte(get())ReservedSyntaxByte(get())PayloadKindLeftModifierKindRightActionKindRunErrorInputColumn(get())InputErrorTraceSnapshotErrorTraceSnapshotRunErrorFallibleTraceSnapshotRunError<E>AllocationError(context(),kind(),requested_capacity())AllocationErrorKindAllocationContextStateSizeError(state_len(),lhs_len(),rhs_len())LimitErrorStateLimitContext