bufjson. Fast streaming JSON parser and lexer | Process JSON without allocating or copying.
Get started
Add bufjson to your Cargo.toml or run $ cargo add bufjson.
Find simple getting started examples below, with further examples available in the API reference docs.
Features
- Streaming pull parser (lower level, does not "map" data into a data structure).
- Best in class speed, second only to simd-json (but with more flexibility and features).
- Minimizes allocations and data copying.
- Idiomatic, friendly API with intuitive layered architecture.
- Clear structured error messages with pinpoint locations.
- Fast streaming JSON Pointer evaluation.
- no_std support.
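To make the "streaming pull parser" and "minimizes allocations and data copying" points concrete, here is a minimal sketch of the general technique in plain Rust. It is not bufjson's API: `ToyLexer`, `Kind`, and `is_ws` are hypothetical names invented for illustration, and the lexer does no validation. The caller pulls one token at a time, and each token is a slice borrowed from the input rather than a copy.

```rust
/// Token kinds for a deliberately tiny JSON-like lexer (illustrative only).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Kind {
    Punct, // one of { } [ ] : ,
    Str,   // a double-quoted string, quotes included
    Other, // numbers and the literals true/false/null, not validated
}

/// JSON's four insignificant whitespace characters.
fn is_ws(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\r')
}

/// Pull-style lexer: the caller asks for one token at a time.
/// Every token borrows from the input, so nothing is copied or allocated.
struct ToyLexer<'a> {
    rest: &'a str,
}

impl<'a> ToyLexer<'a> {
    fn new(input: &'a str) -> Self {
        ToyLexer { rest: input }
    }
}

impl<'a> Iterator for ToyLexer<'a> {
    type Item = (Kind, &'a str);

    fn next(&mut self) -> Option<Self::Item> {
        // Skip whitespace between tokens; stop at end of input.
        self.rest = self.rest.trim_start_matches(is_ws);
        let first = self.rest.chars().next()?;
        let (kind, len) = match first {
            '{' | '}' | '[' | ']' | ':' | ',' => (Kind::Punct, 1),
            '"' => {
                // Find the closing quote, stepping over backslash escapes.
                let bytes = self.rest.as_bytes();
                let mut i = 1;
                while i < bytes.len() && bytes[i] != b'"' {
                    i += if bytes[i] == b'\\' { 2 } else { 1 };
                }
                // Include the closing quote; clamp if the input is truncated.
                (Kind::Str, (i + 1).min(bytes.len()))
            }
            _ => {
                // Everything up to the next delimiter or whitespace.
                let end = self
                    .rest
                    .find(|c: char| is_ws(c) || "{}[]:,\"".contains(c))
                    .unwrap_or(self.rest.len());
                (Kind::Other, end)
            }
        };
        let (tok, rest) = self.rest.split_at(len);
        self.rest = rest;
        Some((kind, tok))
    }
}
```

Because the consumer drives the loop, it can stop early, skip tokens, or forward the borrowed slices elsewhere without ever building an in-memory document tree.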
Use cases
- Scan or parse JSON with minimal CPU and memory pressure.
- Handle JSON text of arbitrary size: streams of essentially unlimited length are supported with consistently high performance.
- Incrementally parse large documents in pieces as they become available (no big bang).
- Zero-copy network programming.
- Async JSON parsing.
- Handle concatenated JSON formats like JSONL, NDJSON, JSON Text Sequences (RFC 7464) and delimiter-free concatenated JSON.
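For readers unfamiliar with the concatenated formats named in the last item, the following sketch shows how the delimited ones are framed. It uses only the standard library and is not bufjson's API; the function names `ndjson_records` and `json_seq_records` are hypothetical. JSONL/NDJSON puts one JSON text per line, while RFC 7464 introduces each text with the ASCII record separator (0x1E) and ends it with a line feed.

```rust
/// Split newline-delimited JSON (JSONL / NDJSON) into records,
/// skipping blank lines. Records borrow from the input.
fn ndjson_records(input: &str) -> impl Iterator<Item = &str> {
    input
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
}

/// Split an RFC 7464 JSON text sequence into records. Each record is
/// introduced by the record separator (0x1E) and normally ends with '\n'.
fn json_seq_records(input: &str) -> impl Iterator<Item = &str> {
    input
        .split('\u{1e}')
        .map(str::trim)
        .filter(|rec| !rec.is_empty())
}
```

Delimiter-free concatenated JSON cannot be split this way, because finding where one value ends and the next begins requires actually lexing the text.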
Comparison to other crates
Click the links below to see how bufjson compares to other JSON parsing crates. Includes feature
comparisons and benchmark numbers.
Performance & benchmarks
The table below shows JSON text throughput benchmark results.¹
| Component | .content() fetched | Throughput |
|---|---|---|
| FixedAnalyzer | Never | 1.1 GiB/s |
| FixedAnalyzer | Always | 1.1 GiB/s |
| Parser + FixedAnalyzer | Never | 1 GiB/s |
| Parser + FixedAnalyzer | Always | 950 MiB/s |
| PipeAnalyzer | Never | 950 MiB/s |
| PipeAnalyzer | Always | 730 MiB/s |
| ReadAnalyzer² | Never | 900 MiB/s |
| ReadAnalyzer² | Always | 700 MiB/s |
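To relate these figures to a concrete workload, throughput is simply bytes processed divided by elapsed time, expressed here in binary units (1 MiB = 1024 × 1024 bytes). The helpers below are a sketch for doing that arithmetic yourself; they are hypothetical names, not part of bufjson, and not the harness used to produce the table.

```rust
use std::time::Duration;

/// Bytes in one mebibyte.
const MIB: f64 = 1024.0 * 1024.0;

/// Throughput in MiB/s for `bytes` processed in `elapsed`.
fn throughput_mib_per_s(bytes: u64, elapsed: Duration) -> f64 {
    bytes as f64 / MIB / elapsed.as_secs_f64()
}

/// Time needed to process `bytes` at a throughput given in MiB/s.
fn time_to_process(bytes: u64, mib_per_s: f64) -> Duration {
    Duration::from_secs_f64(bytes as f64 / MIB / mib_per_s)
}
```

For example, at 700 MiB/s a 7 GiB input (7168 MiB) would take 7168 / 700 ≈ 10.2 seconds to scan.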
Example
This example uses all layers of the bufjson stack (lexical analyzer, syntax parser, streaming JSON
Pointer evaluator) to redact designated paths from the JSON text, leaving everything else intact.
```rust
use bufjson::{
    lexical::{Token, fixed::FixedAnalyzer},
    pointer::{Evaluator, Event, Group, Pointer},
};

fn redact(input_json: &str, ptrs: &[&'static str]) -> String {
    let parser = FixedAnalyzer::new(input_json.as_bytes()).into_parser();
    let ptr_group = Group::from_pointers(ptrs.iter().map(|p| Pointer::from_static(p)));
    let mut ev = Evaluator::new(parser, ptr_group, true /* expand escape seqs in object keys */);

    let mut output_json = String::new();
    loop {
        let event = ev.next();
        match event {
            Event::Match(..) => { output_json.push_str(r#""***""#); continue; },
            Event::Enter(..) => { output_json.push_str(r#""***""#); ev.next_end(); continue; },
            _ => {},
        }
        match event.token() {
            Token::Eof => return output_json,
            Token::Err => panic!("{}", ev.err()),
            _ => output_json.push_str(ev.content().literal()),
        }
    }
}

fn main() {
    let r = redact(
        r#"{"user": "alice", "ssn": "123-45-6789", "prefs": {"theme": "dark"}}"#,
        &["/ssn", "/prefs"],
    );
    assert_eq!(r, r#"{"user": "alice", "ssn": "***", "prefs": "***"}"#);
}
```
A more sophisticated version of this example that streams its output with minimal allocation and
copying can be written using zero-copy Bytes structures and the PipeAnalyzer lexical analyzer.
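
The paths passed to `redact` in the example, `/ssn` and `/prefs`, use JSON Pointer syntax (RFC 6901): a sequence of `/`-prefixed reference tokens in which `~1` stands for `/` and `~0` stands for `~`. The sketch below shows how such a pointer breaks down into tokens. It is a standard-library illustration with a hypothetical function name, `pointer_tokens`, and it allocates freely for clarity, so it should not be read as how bufjson's evaluator works internally.

```rust
/// Split a JSON Pointer (RFC 6901) into unescaped reference tokens.
/// Returns None if the pointer is non-empty and does not start with '/'.
fn pointer_tokens(pointer: &str) -> Option<Vec<String>> {
    if pointer.is_empty() {
        // The empty pointer refers to the whole document.
        return Some(Vec::new());
    }
    let rest = pointer.strip_prefix('/')?;
    Some(
        rest.split('/')
            // Order matters: decode "~1" before "~0", so that "~01"
            // becomes "~1" rather than "/".
            .map(|tok| tok.replace("~1", "/").replace("~0", "~"))
            .collect(),
    )
}
```

A nested pointer such as `/prefs/theme` yields the tokens `prefs` and `theme`, which would address only the inner value instead of the whole `prefs` object.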