asmjson
A fast JSON parser that classifies 64 bytes at a time using SIMD or portable SWAR (SIMD-Within-A-Register) bit tricks, enabling entire whitespace runs and string bodies to be skipped in a single operation.
Quick start
use asmjson::{/* … */};
let classify = choose_classifier(); // picks best for the current CPU
let tape = parse_to_tape(/* … */).unwrap();
assert_eq!(/* … */);
assert_eq!(/* … */);
For repeated parses, store the result of choose_classifier in a static once
cell or pass it through your application rather than calling it on every parse.
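The caching advice above can be sketched with the standard library's `OnceLock`. This is a minimal sketch, not the crate's API: the `Classifier` alias, `classify_stub`, `choose_classifier_stub` and the `SELECTIONS` counter are hypothetical stand-ins so the example compiles on its own, and the real classifier signature may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the crate's classifier type: a plain function
/// pointer that labels one 64-byte block. The real signature may differ.
type Classifier = fn(&[u8; 64]) -> u64;

/// Counts how often selection runs (for demonstration only).
static SELECTIONS: AtomicUsize = AtomicUsize::new(0);

/// Placeholder classifier: one mask bit per byte with value <= 0x20.
fn classify_stub(block: &[u8; 64]) -> u64 {
    block
        .iter()
        .enumerate()
        .fold(0, |mask, (i, &b)| mask | (u64::from(b <= 0x20) << i))
}

/// Hypothetical stand-in for `choose_classifier`; imagine CPU feature
/// detection happening here.
fn choose_classifier_stub() -> Classifier {
    SELECTIONS.fetch_add(1, Ordering::Relaxed);
    classify_stub
}

/// Process-wide cache for the selected classifier.
static CLASSIFIER: OnceLock<Classifier> = OnceLock::new();

/// Returns the cached classifier, running the selection at most once.
pub fn cached_classifier() -> Classifier {
    *CLASSIFIER.get_or_init(choose_classifier_stub)
}
```

Every call site can then ask `cached_classifier()` for the function pointer; only the first call pays for selection.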
Optimisation tips
TapeRef is a plain Copy cursor — two usizes — so it is cheap to store
and reuse. Holding on to a TapeRef you have already located lets you skip
re-scanning work on subsequent accesses.
Cache field refs from a one-pass object scan
get(key) walks the object from the start every time it is called. If you
need several fields from the same object, iterate once with object_iter and
keep the values you care about:
use asmjson::{/* … */};
let classify = choose_classifier();
let src = r#"{"items":[1,2,3],"meta":{"count":3}}"#;
let tape = parse_to_tape(/* … */).unwrap();
let root = tape.root().unwrap();
// Single pass: O(n_keys) regardless of how many fields we need.
let mut items_ref: Option<TapeRef> = None;
let mut meta_ref: Option<TapeRef> = None;
for /* … */ in root.object_iter().unwrap() { /* … */ }
// Subsequent accesses go straight to the cached position, with no re-scan.
let count = meta_ref.unwrap().get(/* … */).unwrap().as_i64();
assert_eq!(/* … */);
Collect array elements for indexed or multi-pass access
array_iter yields each element once in document order. Collecting the
results into a Vec<TapeRef> gives you random access and any number of
further passes at zero additional parsing cost:
use asmjson::{/* … */};
let classify = choose_classifier();
let src = r#"[{"name":"Alice","score":91},{"name":"Bob","score":78},{"name":"Carol","score":85}]"#;
let tape = parse_to_tape(/* … */).unwrap();
let root = tape.root().unwrap();
// Collect once: O(n) scan.
let rows: Vec<TapeRef> = root.array_iter().unwrap().collect();
// Random access is now O(1), with no re-scanning.
assert_eq!(/* … */);
// Multiple passes over the same rows are free.
let total: i64 = rows.iter()
    .filter_map(/* … */)
    .sum();
assert_eq!(/* … */);
Output formats
- parse_to_tape: allocates a flat Tape of tokens with O(1) structural skips.
- parse_with: drives a custom JsonWriter sink; zero extra allocation.
Classifiers
The classifier is a plain function pointer that labels 64 bytes at a time. Three are provided:
| Classifier | ISA | Speed |
|---|---|---|
| classify_zmm | AVX-512BW | fastest |
| classify_ymm | AVX2 | fast |
| classify_u64 | portable SWAR | good |
Use choose_classifier to select automatically at runtime.
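To make the portable SWAR idea concrete, here is a minimal sketch of classifying one 64-byte block with plain `u64` arithmetic. It is not the crate's `classify_u64`: the function names, the two-mask return value and the choice to flag every byte `<= 0x20` as whitespace are assumptions made for illustration.

```rust
const LO7: u64 = 0x7F7F_7F7F_7F7F_7F7F;
const HI: u64 = 0x8080_8080_8080_8080;
const ONES: u64 = 0x0101_0101_0101_0101;

/// Sets the high bit of each byte lane of `w` whose value is < `n`
/// (1 <= n <= 128). Exact per lane: no borrow crosses a lane boundary.
fn lanes_less_than(w: u64, n: u64) -> u64 {
    debug_assert!((1..=128).contains(&n));
    let limit = ONES * (127 + n);
    (limit - (w & LO7)) & !w & HI
}

/// Packs the eight lane high bits into the low 8 bits (lane 0 -> bit 0).
fn movemask(m: u64) -> u64 {
    (m >> 7).wrapping_mul(0x0102_0408_1020_4080) >> 56
}

/// Returns (whitespace_mask, quote_mask) with one bit per input byte.
/// Assumption: "whitespace" means any byte <= 0x20.
pub fn classify_block(block: &[u8; 64]) -> (u64, u64) {
    let mut whitespace = 0u64;
    let mut quotes = 0u64;
    for (i, chunk) in block.chunks_exact(8).enumerate() {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        let w = u64::from_le_bytes(buf);
        // Bytes below 0x21, i.e. <= 0x20.
        whitespace |= movemask(lanes_less_than(w, 0x21)) << (8 * i);
        // XOR turns '"' lanes into zero, which is the only value < 1.
        let x = w ^ (ONES * b'"' as u64);
        quotes |= movemask(lanes_less_than(x, 1)) << (8 * i);
    }
    (whitespace, quotes)
}

/// Length of the whitespace run at the start of the block (0..=64).
pub fn leading_whitespace(whitespace_mask: u64) -> u32 {
    (!whitespace_mask).trailing_zeros()
}
```

With the masks in hand, a whole whitespace run is skipped with a single `trailing_zeros`, and the next set bit in the quote mask locates the end of a string body without inspecting the bytes in between (escape handling is omitted from this sketch).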
Benchmarks
Measured on a single core with cargo bench against 10 MiB of synthetic JSON.
Comparison point is sonic-rs (lazy Value, AVX2).
Each benchmark measures parse + full traversal: after parsing, every string value and object key is visited and its length accumulated. This is necessary for a fair comparison because sonic-rs defers decoding string content until the value is accessed (lazy evaluation); a parse-only measurement would undercount its work relative to any real use-case where the parsed data is actually read.
| Parser | string array | string object | mixed |
|---|---|---|---|
| asmjson zmm tape | 10.81 GiB/s | 7.15 GiB/s | 905 MiB/s |
| asmjson zmm | 8.64 GiB/s | 6.27 GiB/s | 672 MiB/s |
| sonic-rs | 7.11 GiB/s | 4.04 GiB/s | 475 MiB/s |
| asmjson u64 | 7.10 GiB/s | 4.93 GiB/s | 636 MiB/s |
| serde_json | 2.43 GiB/s | 535 MiB/s | 83 MiB/s |
asmjson zmm tape leads across all three workloads. It writes a flat
TapeEntry array in the assembly parser itself — one pointer-sized entry per
value — so structural traversal is a single linear scan with no pointer
chasing. The baseline asmjson zmm parser also leads on string-dominated
workloads; the portable u64 SWAR classifier is neck-and-neck with sonic-rs
on string arrays despite using no SIMD instructions, and beats it on string
objects. sonic-rs narrows the gap on mixed JSON through its lazy string
decoding, but zmm tape still leads by 90 %.
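The relative leads quoted above follow directly from the table. A small sketch of the arithmetic, using only the table's figures for asmjson zmm tape and sonic-rs (the helper name `lead_percent` is made up for this example):

```rust
/// Percentage by which throughput `a` exceeds throughput `b`.
/// Both values must be in the same unit.
fn lead_percent(a: f64, b: f64) -> f64 {
    (a / b - 1.0) * 100.0
}

// (asmjson zmm tape, sonic-rs) pairs copied from the table.
const STRING_ARRAY_GIB_S: (f64, f64) = (10.81, 7.11);
const STRING_OBJECT_GIB_S: (f64, f64) = (7.15, 4.04);
const MIXED_MIB_S: (f64, f64) = (905.0, 475.0);

/// Leads of zmm tape over sonic-rs for the three workloads, in percent.
fn zmm_tape_leads() -> [f64; 3] {
    [
        lead_percent(STRING_ARRAY_GIB_S.0, STRING_ARRAY_GIB_S.1),
        lead_percent(STRING_OBJECT_GIB_S.0, STRING_OBJECT_GIB_S.1),
        lead_percent(MIXED_MIB_S.0, MIXED_MIB_S.1),
    ]
}
```

Rounded to whole percentages, this gives leads of about 52 %, 77 % and 91 % for the string array, string object and mixed workloads respectively.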
Conformance note
asmjson is slightly permissive: its classifier treats any byte with value
< 0x20 (i.e. all C0 control characters) as whitespace, in addition to 0x20
SP, rather than only the four characters the JSON specification allows
(0x09 HT, 0x0A LF, 0x0D CR, 0x20 SP). Well-formed JSON is parsed
identically; input that embeds bare control characters other than the three
legal ones (HT, LF, CR) will be accepted where a strict parser would reject it.
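If strict behaviour matters, one option is to pre-scan the input for the bytes on which the two rules disagree. The sketch below is not part of asmjson, and all function names are hypothetical. It only catches the difference described here; it is not a full validator (for example, a raw HT, LF or CR inside a string is also invalid JSON but passes this check).

```rust
/// The four whitespace bytes JSON allows between tokens (RFC 8259).
fn is_json_whitespace(b: u8) -> bool {
    matches!(b, 0x09 | 0x0A | 0x0D | 0x20)
}

/// The permissive rule described above: space plus every C0 control byte.
fn is_permissive_whitespace(b: u8) -> bool {
    b <= 0x20
}

/// Offset of the first byte that the permissive rule accepts as whitespace
/// but the JSON grammar does not, or `None` if there is no such byte.
pub fn first_bare_control(input: &[u8]) -> Option<usize> {
    input
        .iter()
        .position(|&b| is_permissive_whitespace(b) && !is_json_whitespace(b))
}
```

Calling `first_bare_control` before parsing and rejecting the input when it returns `Some(offset)` costs one extra linear pass over the bytes.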
License
MIT — see LICENSE.
For internals documentation (state machine annotation, register allocation, design decisions) see doc/dev.md.