bufjson. A low-level, low-allocation, low-copy JSON tokenizer and parser geared toward
efficient stream processing at scale.
Get started
Add bufjson to your Cargo.toml or run $ cargo add bufjson.
Here's a simple example that parses a JSON text for syntax validity and prints it with the insignificant whitespace stripped out.
use ;
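To make the whitespace-stripping half of that task concrete, here is a minimal std-only sketch. It does not use bufjson's API and it does not check syntax validity; the function name `strip_insignificant_whitespace` is hypothetical. It only shows that whitespace between tokens can be dropped in a single pass, while whitespace inside string literals must be kept.

```rust
/// Copies `json` into a new `String`, dropping whitespace that appears
/// outside string literals. Hypothetical helper: NOT part of bufjson, and
/// it performs no syntax validation.
fn strip_insignificant_whitespace(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    let mut in_string = false; // currently inside a "..." literal?
    let mut escaped = false; // previous char was a backslash inside a string?

    for c in json.chars() {
        if in_string {
            // Everything inside a string literal is significant.
            out.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
        } else {
            match c {
                // The four whitespace characters JSON allows between tokens.
                ' ' | '\t' | '\n' | '\r' => {}
                '"' => {
                    in_string = true;
                    out.push(c);
                }
                _ => out.push(c),
            }
        }
    }
    out
}
```

Calling it on `{ "a" : [1, 2] }` yields `{"a":[1,2]}`. A real tokenizer would also reject malformed input, which this sketch silently passes through.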
Architecture
The bufjson crate provides a stream-oriented JSON tokenizer through the lexical::Analyzer trait,
with these implementations:
- FixedAnalyzer tokenizes fixed-size buffers;
- ReadAnalyzer tokenizes sync input streams implementing io::Read; and
- AsyncAnalyzer tokenizes async streams that yield byte buffers (COMING SOON-ISH).
The remainder of the library builds on the lexical analyzer trait.
- The syntax module provides concrete stream-oriented parser types that can wrap any lexical analyzer.
- The pointer module enables fast stream-oriented evaluation of JSON Pointers.
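For readers unfamiliar with JSON Pointers (RFC 6901), the sketch below shows how a pointer string breaks down into reference tokens. It is std-only and is not the pointer module's API; `parse_json_pointer` is a hypothetical name, and the sketch does not reject invalid escapes such as `~2`.

```rust
/// Splits a JSON Pointer (RFC 6901) into unescaped reference tokens.
/// Returns `None` if the pointer is non-empty and does not start with '/'.
/// Hypothetical helper: NOT part of bufjson.
fn parse_json_pointer(pointer: &str) -> Option<Vec<String>> {
    if pointer.is_empty() {
        // The empty pointer refers to the whole document.
        return Some(Vec::new());
    }
    let rest = pointer.strip_prefix('/')?;
    Some(
        rest.split('/')
            // Order matters: decode "~1" before "~0" so that "~01"
            // becomes "~1" rather than "/".
            .map(|token| token.replace("~1", "/").replace("~0", "~"))
            .collect(),
    )
}
```

For example, `/a~1b/m~0n` splits into the tokens `a/b` and `m~n`. A stream-oriented evaluator can then match these tokens against object keys and array indices as tokens arrive, without building the full document in memory.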
Refer to the API reference docs for more detail.
When to use
Choose bufjson when you need to:
- Control and limit allocations or copying.
- Process JSON text larger than available memory.
- Extract specific values without parsing an entire JSON text.
- Edit a stream of JSON tokens (add/remove/change values in the stream).
- Access token content exactly as it appears in the JSON text (e.g. without unescaping strings).
- Protect against malicious or degenerate inputs.
- Implement custom parsing with precise behavior control.
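As one illustration of protecting against degenerate inputs, a common defense is to cap array/object nesting depth while scanning. The sketch below is std-only and is an assumption about what such a guard can look like, not a description of bufjson's own mechanism; `check_depth` is a hypothetical name.

```rust
/// Scans `json` and returns the deepest array/object nesting seen, or
/// `Err(offset)` with the byte offset of the bracket that exceeded `limit`.
/// Hypothetical helper: NOT part of bufjson, and it does not validate syntax.
fn check_depth(json: &[u8], limit: usize) -> Result<usize, usize> {
    let mut depth = 0usize;
    let mut max_depth = 0usize;
    let mut in_string = false; // brackets inside strings must not count
    let mut escaped = false;

    for (i, &b) in json.iter().enumerate() {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'[' | b'{' => {
                depth += 1;
                if depth > limit {
                    // Stop early instead of processing the rest of the input.
                    return Err(i);
                }
                max_depth = max_depth.max(depth);
            }
            // Saturate so unbalanced closers cannot underflow the counter.
            b']' | b'}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(max_depth)
}
```

Because the check runs in a single pass with constant memory, an input made of millions of opening brackets is rejected at the first bracket past the limit rather than exhausting the stack or heap.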
Other libraries are more suitable for:
- Deserializing JSON text straight into in-memory data structures (use serde_json or simd-json).
- Serializing in-memory data structures to JSON (use serde_json).
- Writing JSON text in a stream-oriented manner (use serde_json or json-writer).
Performance
- Zero-copy string processing where possible.
- Minimal allocations, which are explicit and optional wherever possible.
- Streaming design handles arbitrarily long JSON text without loading it all into memory.
- Suitable for high-throughput applications.
Benchmarks
The table below shows JSON text throughput benchmark results.¹

| Component | .content() fetched | Throughput |
|---|---|---|
| FixedAnalyzer | Never | 1 GiB/s |
| FixedAnalyzer | Always | 1 GiB/s |
| ReadAnalyzer² | Never | 880 MiB/s |
| ReadAnalyzer² | Always | 690 MiB/s |
| Parser + FixedAnalyzer | Never | 890 MiB/s |
| Parser + FixedAnalyzer | Always | 850 MiB/s |