# asmjson
[CI](https://github.com/andy-thomason/asmjson/actions/workflows/ci.yml) · [crates.io](https://crates.io/crates/asmjson) · [docs.rs](https://docs.rs/asmjson)

A fast JSON parser that classifies 64 bytes at a time using SIMD or portable
SWAR (SIMD-Within-A-Register) bit tricks, enabling entire whitespace runs and
string bodies to be skipped in a single operation.
## Quick start
```rust
use asmjson::{parse_to_tape, choose_classifier, JsonRef};
let classify = choose_classifier(); // picks best for the current CPU
let tape = parse_to_tape(r#"{"name":"Alice","age":30}"#, classify).unwrap();
assert_eq!(tape.root().get("name").as_str(), Some("Alice"));
assert_eq!(tape.root().get("age").as_i64(), Some(30));
```
For repeated parses, store the result of `choose_classifier` in a static once
cell or pass it through your application rather than calling it on every parse.
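
One way to cache the choice is `std::sync::OnceLock` (Rust 1.70 or later). The following is only a sketch: the `Classifier` alias, its signature, `classify_portable`, and the local `choose_classifier` are hypothetical stand-ins so that the snippet compiles on its own. In real code you would store whatever `asmjson::choose_classifier` returns.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Hypothetical stand-in for the crate's classifier function-pointer type.
type Classifier = fn(&[u8; 64]) -> u64;

// Counts how often the (stand-in) selection logic actually runs.
static SELECTIONS: AtomicUsize = AtomicUsize::new(0);

// Stand-in classifier: bit i of the result is set when byte i is <= 0x20.
fn classify_portable(block: &[u8; 64]) -> u64 {
    block
        .iter()
        .enumerate()
        .fold(0, |mask, (i, &b)| mask | (u64::from(b <= 0x20) << i))
}

// Stand-in for `asmjson::choose_classifier`; the real one probes CPU features.
fn choose_classifier() -> Classifier {
    SELECTIONS.fetch_add(1, Ordering::Relaxed);
    classify_portable
}

// Process-wide cache: initialised on first use, then only read.
static CLASSIFIER: OnceLock<Classifier> = OnceLock::new();

/// Returns the cached classifier, selecting it on the first call only.
pub fn cached_classifier() -> Classifier {
    *CLASSIFIER.get_or_init(choose_classifier)
}
```

Function pointers are `Copy`, so `cached_classifier()` can be called at every parse site; the selection runs once no matter how many threads race on the first call.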
## Optimisation tips
`TapeRef` is a plain `Copy` cursor — two `usize`s — so it is cheap to store
and reuse. Holding on to a `TapeRef` you have already located lets you skip
re-scanning work on subsequent accesses.
### Cache field refs from a one-pass object scan
`get(key)` walks the object from the start every time it is called. If you
need several fields from the same object, iterate once with `object_iter` and
keep the values you care about:
```rust
use asmjson::{parse_to_tape, choose_classifier, JsonRef, TapeRef};
let classify = choose_classifier();
let src = r#"{"items":[1,2,3],"meta":{"count":3}}"#;
let tape = parse_to_tape(src, classify).unwrap();
let root = tape.root().unwrap();
// Single pass — O(n_keys) regardless of how many fields we need.
let mut items_ref: Option<TapeRef> = None;
let mut meta_ref: Option<TapeRef> = None;
for (key, val) in root.object_iter().unwrap() {
    match key {
        "items" => items_ref = Some(val),
        "meta" => meta_ref = Some(val),
        _ => {}
    }
}
// Subsequent accesses go straight to the cached position — no re-scan.
let count = meta_ref.unwrap().get("count").unwrap().as_i64();
assert_eq!(count, Some(3));
```
### Collect array elements for indexed or multi-pass access
`array_iter` yields each element once in document order. Collecting the
results into a `Vec<TapeRef>` gives you random access and any number of
further passes at zero additional parsing cost:
```rust
use asmjson::{parse_to_tape, choose_classifier, JsonRef, TapeRef};
let classify = choose_classifier();
let src = r#"[{"name":"Alice","score":91},{"name":"Bob","score":78},{"name":"Carol","score":85}]"#;
let tape = parse_to_tape(src, classify).unwrap();
let root = tape.root().unwrap();
// Collect once — O(n) scan.
let rows: Vec<TapeRef> = root.array_iter().unwrap().collect();
// Random access is now O(1) — no re-scanning.
assert_eq!(rows[1].get("name").unwrap().as_str(), Some("Bob"));
// Further passes reuse the collected refs; the array is not scanned again.
let total: i64 = rows
    .iter()
    .filter_map(|r| r.get("score").and_then(|s| s.as_i64()))
    .sum();
assert_eq!(total, 91 + 78 + 85);
```
## Output formats
- `parse_to_tape` — allocates a flat `Tape` of tokens with O(1) structural skips.
- `parse_with` — drives a custom `JsonWriter` sink; zero extra allocation.
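
To illustrate what "O(1) structural skips" can mean on a flat tape, here is a minimal sketch. The `Entry` enum and its layout are hypothetical and are not the crate's `TapeEntry`; the point is only that a container's start entry can record where the container ends, so stepping over a nested value is a single index jump.

```rust
/// Hypothetical tape entry; the real `TapeEntry` layout may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Entry {
    Number(i64),
    Str(&'static str),
    /// `end` is the index just past the matching `EndArray`.
    StartArray { end: usize },
    EndArray,
}

/// Index of the entry that follows the value starting at `i`.
/// Containers are skipped in one step thanks to the stored `end`.
pub fn skip(tape: &[Entry], i: usize) -> usize {
    match tape[i] {
        Entry::StartArray { end } => end,
        _ => i + 1,
    }
}

/// Tape index of the `n`-th element of the array that starts at `array_at`.
pub fn nth_element(tape: &[Entry], array_at: usize, n: usize) -> Option<usize> {
    let end = match tape[array_at] {
        Entry::StartArray { end } => end,
        _ => return None,
    };
    let mut i = array_at + 1;
    let mut seen = 0;
    // `end - 1` is the closing `EndArray`, so stop before it.
    while i + 1 < end {
        if seen == n {
            return Some(i);
        }
        i = skip(tape, i);
        seen += 1;
    }
    None
}

/// Tape for the document `[[1,2],"x",3]`.
pub fn sample_tape() -> Vec<Entry> {
    vec![
        Entry::StartArray { end: 8 },
        Entry::StartArray { end: 5 },
        Entry::Number(1),
        Entry::Number(2),
        Entry::EndArray,
        Entry::Str("x"),
        Entry::Number(3),
        Entry::EndArray,
    ]
}
```

With this layout, `nth_element(&sample_tape(), 0, 1)` reaches `"x"` without visiting the entries of the nested `[1,2]` array.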
## Classifiers
The classifier is a plain function pointer that labels 64 bytes at a time.
Three are provided:

| Classifier | ISA | Speed |
|---|---|---|
| `classify_zmm` | AVX-512BW | fastest |
| `classify_ymm` | AVX2 | fast |
| `classify_u64` | portable SWAR | good |

Use `choose_classifier` to select automatically at runtime.
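
For readers unfamiliar with SWAR, the sketch below shows the general idea for one class of bytes: it builds a 64-bit mask whose bit `i` is set when byte `i` of a 64-byte block is `<= 0x20`. It is an illustration only, not the crate's `classify_u64`; the function names and the choice to detect just whitespace are assumptions made for the example.

```rust
const LOW7: u64 = 0x7f7f_7f7f_7f7f_7f7f;
const HIGH: u64 = 0x8080_8080_8080_8080;

/// Bit i of the result is set when byte i of `word` is <= 0x20.
fn whitespace_mask8(word: u64) -> u8 {
    // Per byte: (b & 0x7f) + 0x5f has its high bit set iff (b & 0x7f) >= 0x21.
    // The largest per-byte sum is 0x7f + 0x5f = 0xde, so no carry crosses bytes.
    let ge = (word & LOW7) + 0x5f5f_5f5f_5f5f_5f5f;
    // A byte is <= 0x20 iff neither its own high bit nor the `ge` high bit is set.
    let le_20 = !(ge | word) & HIGH;
    // Gather the eight per-byte flags into the top byte with one multiply.
    ((le_20 >> 7).wrapping_mul(0x0102_0408_1020_4080) >> 56) as u8
}

/// Classifies 64 bytes as eight little-endian u64 words.
pub fn whitespace_mask64(block: &[u8; 64]) -> u64 {
    let mut mask = 0u64;
    for (i, chunk) in block.chunks_exact(8).enumerate() {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(chunk);
        let word = u64::from_le_bytes(bytes);
        mask |= u64::from(whitespace_mask8(word)) << (8 * i);
    }
    mask
}

/// Length of the whitespace run at the start of the block (0..=64).
pub fn leading_whitespace(block: &[u8; 64]) -> u32 {
    (!whitespace_mask64(block)).trailing_zeros()
}
```

Once the mask exists, skipping a whole whitespace run is a single `trailing_zeros` on its complement, which is the kind of one-operation skip the introduction describes.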
## Benchmarks
Measured on a single core with `cargo bench` against 10 MiB of synthetic JSON.
The main comparison point is `sonic-rs` (lazy Value, AVX2); `serde_json` is
included as a further baseline.
Each benchmark measures **parse + full traversal**: after parsing, every string
value and object key is visited and its length accumulated. This is necessary
for a fair comparison because sonic-rs defers decoding string content until the
value is accessed (lazy evaluation); a parse-only measurement would undercount
its work relative to any real use-case where the parsed data is actually read.

| Parser | String array | String object | Mixed |
|---|---|---|---|
| asmjson zmm tape | 10.81 GiB/s | 7.15 GiB/s | 905 MiB/s |
| asmjson zmm | 8.64 GiB/s | 6.27 GiB/s | 672 MiB/s |
| sonic-rs | 7.11 GiB/s | 4.04 GiB/s | 475 MiB/s |
| asmjson u64 | 7.10 GiB/s | 4.93 GiB/s | 636 MiB/s |
| serde_json | 2.43 GiB/s | 535 MiB/s | 83 MiB/s |

asmjson zmm tape leads across all three workloads. It writes a flat
`TapeEntry` array in the assembly parser itself, one pointer-sized entry per
value, so structural traversal is a single linear scan with no pointer
chasing. The baseline asmjson zmm parser also beats sonic-rs on all three
workloads. The portable `u64` SWAR classifier is neck-and-neck with sonic-rs
on string arrays (7.10 vs 7.11 GiB/s) despite using no SIMD instructions, and
beats it on string objects and mixed JSON. On mixed JSON, where sonic-rs
benefits from its lazy string decoding, zmm tape still leads by about 90 %
(905 MiB/s vs 475 MiB/s).
## Conformance note
asmjson is slightly permissive: its classifier treats **any byte with value
`< 0x20`** (i.e. all C0 control characters) as whitespace, in addition to the
space character `0x20`, rather than only the four characters the JSON
specification allows (`0x09` HT, `0x0A` LF, `0x0D` CR, `0x20` SP). Well-formed
JSON is parsed identically; input that contains control characters other than
HT, LF and CR in places where whitespace is permitted will be accepted where a
strict parser would reject it.
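
The difference between the two rules can be made concrete with a small sketch. The predicates below are illustrative helpers, not part of the asmjson API, and `is_permissive_ws` assumes the permissive rule amounts to "byte `<= 0x20`" as described above.

```rust
/// The four whitespace bytes the JSON specification allows.
pub fn is_strict_ws(b: u8) -> bool {
    matches!(b, 0x09 | 0x0A | 0x0D | 0x20)
}

/// Assumed permissive rule: space plus every C0 control character.
pub fn is_permissive_ws(b: u8) -> bool {
    b <= 0x20
}

/// Bytes that the permissive rule accepts but the strict rule rejects.
pub fn permissive_only_bytes() -> Vec<u8> {
    (0u8..=0xFF)
        .filter(|&b| is_permissive_ws(b) && !is_strict_ws(b))
        .collect()
}
```

Under that assumption the two rules disagree on 29 byte values (for example `0x00` and `0x0B`), so a caller that needs strict rejection could scan for those bytes outside of strings before parsing.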
## License
MIT — see [LICENSE](https://github.com/andy-thomason/asmjson/blob/master/LICENSE).
For internals documentation (state machine annotation, register allocation,
design decisions) see [doc/dev.md](doc/dev.md).