# json-escape

A highly ergonomic, well-tested, `no_std` library for streaming JSON string escaping and unescaping. It processes JSON strings with zero-copy slicing and no intermediate allocations, delivering both high performance and RFC-compliant correctness, which makes it ideal for parsers, I/O pipelines, and memory-constrained environments. ✅
The library provides a layered API to fit your needs:

- **High-Level Iterators:** The `escape_str` and `unescape` functions provide a simple, iterator-based way to process entire string slices.
- **Streaming for I/O:** The `stream::UnescapeStream` struct processes data in chunks, perfect for reading from files or network sockets.
- **Low-Level Tokens:** The `token` module offers the most granular control for building custom processors.

This layered approach avoids allocating a single large buffer for the result, making it highly efficient when processing large data.
## Key Features

- 🎯 **Ergonomic & Intuitive:** Simple-to-use functions that integrate cleanly with Rust's ecosystem.
- 🚀 **True Streaming for I/O:** The `stream` module handles chunked data from any source, correctly parsing escape sequences that are split across buffer boundaries.
- ✨ **Zero-Copy Slicing:** For sequences that don't need modification, the iterators yield slices borrowed directly from the input.
- ✅ **Correct & Compliant:** Fully compliant with RFC 8259 for all escape sequences, including full support for UTF-16 surrogate pairs (`\uD83D\uDE00` for 😀).
- 🧩 **`no_std` Compatible:** Usable in embedded systems and other memory-constrained environments (with the `alloc` feature for owned conversions).
- ⚙️ **Full Functionality:** Implements `PartialEq` for convenient testing and provides `std::io::Read` integration (with the `std` feature) for plugging directly into libraries like `serde_json`.
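As background on the surrogate-pair support mentioned above: a pair like `\uD83D\uDE00` encodes a single code point, and this is the arithmetic any RFC 8259 unescaper must perform to recombine it. A crate-independent sketch (`combine_surrogates` is illustrative, not part of this library):

```rust
// Decode a UTF-16 surrogate pair (e.g. from `\uD83D\uDE00`) into a char.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    // High surrogates are 0xD800..=0xDBFF, low surrogates 0xDC00..=0xDFFF.
    if !(0xD800..=0xDBFF).contains(&high) || !(0xDC00..=0xDFFF).contains(&low) {
        return None;
    }
    // Each half contributes 10 bits above the U+10000 base.
    let code = 0x10000 + ((high as u32 - 0xD800) << 10) + (low as u32 - 0xDC00);
    char::from_u32(code)
}

fn main() {
    // \uD83D\uDE00 decodes to U+1F600 GRINNING FACE.
    assert_eq!(combine_surrogates(0xD83D, 0xDE00), Some('😀'));
    // An unpaired or misordered surrogate is an error.
    assert_eq!(combine_surrogates(0xDC00, 0xD800), None);
}
```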
## Quick Start

### Escaping a String
```rust
use json_escape::escape_str;

let input = "Hello, \"world\"!\nThis is a backslash: \\";
let expected = r#"Hello, \"world\"!\nThis is a backslash: \\"#;

// escape_str returns an iterator. Collect it into a String.
let escaped_string: String = escape_str(input).collect();
assert_eq!(escaped_string, expected);
```
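For reference, these are the escapes RFC 8259 actually mandates: `"` and `\` must be escaped, control characters below U+0020 must be escaped (short forms where they exist, `\u00XX` otherwise), and everything else may pass through unchanged. A minimal, crate-independent sketch of the rules (`escape_char` is illustrative, not part of this library):

```rust
// Return the required escape for a char, or None if it may pass through.
fn escape_char(c: char) -> Option<String> {
    match c {
        '"' => Some("\\\"".to_string()),
        '\\' => Some("\\\\".to_string()),
        '\u{08}' => Some("\\b".to_string()),
        '\t' => Some("\\t".to_string()),
        '\n' => Some("\\n".to_string()),
        '\u{0C}' => Some("\\f".to_string()),
        '\r' => Some("\\r".to_string()),
        // Remaining control characters have no short form.
        c if (c as u32) < 0x20 => Some(format!("\\u{:04X}", c as u32)),
        _ => None, // no escaping required
    }
}

fn main() {
    assert_eq!(escape_char('\n'), Some("\\n".to_string()));
    assert_eq!(escape_char('\u{01}'), Some("\\u0001".to_string()));
    assert_eq!(escape_char('a'), None);
}
```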
### Unescaping a String Slice

```rust
use json_escape::unescape;
use std::borrow::Cow;

let input = r#"Emoji: \uD83D\uDE00 and a tab\t!"#;
let expected = "Emoji: 😀 and a tab\t!";

// unescape returns an iterator over Result<&[u8], _>.
// The `decode_utf8` helper collects and validates the output.
let decoded_cow: Cow<str> = unescape(input).decode_utf8().unwrap();
assert_eq!(decoded_cow, expected);
```
### Streaming Unescape for I/O 🚀

The most powerful feature is the ability to unescape a stream of data chunks without buffering them first. The `stream::UnescapeStream` struct is designed for this purpose.

You "push" byte slices into the unescaper as you receive them (e.g., from a file or network socket). It correctly handles complex escape sequences, like surrogate pairs, that might be split across chunks.
```rust
use json_escape::stream::UnescapeStream;

// A JSON string split into multiple parts.
// The surrogate pair `\uD83D\uDE00` (😀) is split across the boundary.
let parts = vec![&b"Emoji: \\uD83D"[..], &b"\\uDE00!"[..]];

let mut unescaper = UnescapeStream::new();
let mut unescaped_string = String::new();

for part in parts {
    // Push `part` into `unescaper` and append the produced output to
    // `unescaped_string`. (The loop body was lost in transcription; see
    // the `stream::UnescapeStream` docs for the exact push API.)
}

// IMPORTANT: Always call finish() to detect errors at the end of the stream.
unescaper.finish().unwrap();

assert_eq!(unescaped_string, "Emoji: 😀!");
println!("{unescaped_string}");
```
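To see why split sequences need state, consider a chunk that ends in the middle of an escape (`...\uD8`): the incomplete tail must be carried over and prepended to the next chunk before unescaping can continue. A crate-independent sketch of detecting such a tail (`incomplete_tail_len` is illustrative, not this library's API):

```rust
// Length of an incomplete escape sequence at the end of `chunk`, i.e. the
// bytes that must be carried over to the next chunk. Simplified sketch: it
// assumes the last `\` starts an escape; a real implementation must also
// track whether that backslash is itself escaped (`\\`).
fn incomplete_tail_len(chunk: &[u8]) -> usize {
    if let Some(pos) = chunk.iter().rposition(|&b| b == b'\\') {
        let tail = &chunk[pos..];
        match tail {
            [b'\\'] => return 1, // bare trailing `\`
            // `\u` needs exactly four hex digits after it.
            [b'\\', b'u', rest @ ..] if rest.len() < 4 => return tail.len(),
            _ => {} // the escape is complete; nothing to carry
        }
    }
    0
}

fn main() {
    assert_eq!(incomplete_tail_len(b"Emoji: \\uD8"), 4); // `\uD8` is cut off
    assert_eq!(incomplete_tail_len(b"tab\\t done"), 0);  // escape is complete
    assert_eq!(incomplete_tail_len(b"ends with \\"), 1); // bare trailing `\`
}
```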
## Performance

The library's design focuses on minimizing allocations and maximizing throughput. A SWAR-based (SIMD Within A Register) algorithm makes scanning for escapes nearly free, but the biggest advantage comes from the `UnescapeStream` API for I/O tasks.
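The SWAR idea can be sketched independently of this crate: broadcast the backslash byte across a `u64` and apply the classic has-zero-byte bit trick, so eight bytes are tested per step instead of one:

```rust
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

// True if any byte of `word` equals `\` (0x5C).
fn word_has_backslash(word: u64) -> bool {
    // XOR turns every `\` byte into 0x00; the subtraction trick then
    // sets the high bit of each lane that held a zero byte.
    let x = word ^ (LO * u64::from(b'\\'));
    (x.wrapping_sub(LO) & !x & HI) != 0
}

fn chunk_has_backslash(bytes: &[u8; 8]) -> bool {
    word_has_backslash(u64::from_le_bytes(*bytes))
}

fn main() {
    assert!(chunk_has_backslash(b"abc\\defg"));
    assert!(!chunk_has_backslash(b"abcdefgh"));
}
```

Only words that test positive need the slower per-byte escape handling; runs of ordinary text are skipped a word at a time.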
### True Streaming Performance: `UnescapeStream` vs. Buffering

To quantify the advantage of true streaming, we benchmarked `UnescapeStream` against the traditional approach of collecting all I/O chunks into a single buffer before unescaping.

The results are clear: for any realistic I/O pattern, the streaming API is significantly faster and more memory-efficient.
| Workload | Chunk Size | Performance Advantage (Streaming vs. Buffering) |
|---|---|---|
| Dense & Unicode Escapes | All Sizes | 🚀 Up to 5× faster |
| Sparse Escapes | All Sizes | ✅ Up to 2.2× faster |
| No Escapes (Ideal I/O) | Typical (≥ 1 KB) | 👍 1.7× faster |
**Why is streaming so much faster?**

- **Single-Pass Processing:** The streaming API processes data as it arrives. It avoids the overhead of the "collect-then-process" model, which must first perform a full memory copy of the entire dataset into a new buffer before it can even begin unescaping.
- **Immediate Output:** This single-pass architecture means work gets done sooner, which can lead to lower latency in interactive applications.
- **Lower Memory Footprint:** While not measured here, the streaming approach uses constant, minimal memory (just a tiny internal buffer), whereas the buffering method requires enough memory to hold the entire dataset at once.
The only scenario where buffering has a slight edge is with trivial data (no escapes) and unrealistically small chunks (e.g., 64 bytes), where the overhead of repeated function calls outweighs the memory-copy cost. For any typical I/O pattern, `UnescapeStream` is the superior choice.
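The two models can be sketched side by side (crate-independent, with a trivial stand-in transform in place of unescaping):

```rust
// Collect-then-process: copy every chunk into one big buffer, then work.
fn buffered(chunks: &[&[u8]]) -> Vec<u8> {
    let mut all = Vec::new();
    for c in chunks {
        all.extend_from_slice(c); // a full extra copy of the dataset
    }
    all.iter().map(u8::to_ascii_uppercase).collect()
}

// Single-pass: transform each chunk as it arrives; no intermediate buffer.
fn streaming(chunks: &[&[u8]], out: &mut Vec<u8>) {
    for c in chunks {
        out.extend(c.iter().map(u8::to_ascii_uppercase));
    }
}

fn main() {
    let chunks: [&[u8]; 2] = [b"hello ", b"world"];
    let mut out = Vec::new();
    streaming(&chunks, &mut out);
    assert_eq!(out, buffered(&chunks));
    assert_eq!(out, b"HELLO WORLD".to_vec());
}
```

The buffered version touches every byte twice (once to copy, once to transform); the streaming version touches each byte once, as soon as its chunk arrives.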
## Low-Level APIs (`token` and `explicit`)

For advanced use cases, the `token` and `explicit` modules provide more granular control.

- `json_escape::token`: The lowest-level API. It yields `UnescapedToken` and `EscapedToken` enums, which separate literal slices from processed characters. This is the most flexible and composable API, ideal for building custom state machines or processors.
- `json_escape::explicit`: A slightly higher-level API that yields chunk structs (`UnescapedChunk`, `EscapedChunk`). These structs provide methods to inspect literal and processed parts, which is useful for debugging and logging.
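To make the token model concrete, here is a self-contained toy tokenizer that mirrors the literal-vs-processed split (the `Token` enum and `tokenize` function are illustrative, not this crate's actual types):

```rust
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Literal(&'a str), // zero-copy slice of the input
    Unescaped(char),  // result of processing one escape sequence
}

fn tokenize(mut s: &str) -> Vec<Token<'_>> {
    let mut out = Vec::new();
    while let Some(pos) = s.find('\\') {
        if pos > 0 {
            out.push(Token::Literal(&s[..pos]));
        }
        // Sketch: handle only a few two-character escapes.
        let c = match s.as_bytes().get(pos + 1) {
            Some(b'n') => '\n',
            Some(b't') => '\t',
            Some(b'"') => '"',
            Some(b'\\') => '\\',
            _ => break, // `\uXXXX` etc. omitted for brevity
        };
        out.push(Token::Unescaped(c));
        s = &s[pos + 2..];
    }
    if !s.is_empty() {
        out.push(Token::Literal(s));
    }
    out
}

fn main() {
    let tokens = tokenize(r#"ab\ncd"#);
    assert_eq!(
        tokens,
        vec![Token::Literal("ab"), Token::Unescaped('\n'), Token::Literal("cd")]
    );
}
```

A consumer can then copy the literal slices straight through and handle only the decoded characters, which is the same shape of control the `token` module exposes.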
## Examples

You can find complete, runnable examples in the `examples/` directory of the project repository. These are great for copy-pasting and for learning how to use the library in different scenarios:

- `simple_unescape.rs`: Basic usage of the high-level `unescape` iterator.
- `stream_file.rs`: A practical example of using `UnescapeStream` to read and process a file.
- `zero_copy_serde.rs`: Demonstrates how to parse a JSON field containing an escaped JSON string directly into a `serde` struct without intermediate allocations.
## Changelog

This project follows Keep a Changelog. All notable changes, including new features, bug fixes, and performance improvements, are documented in the `CHANGELOG.md` file. We encourage users to review it for transparency between releases.
## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
json-escape = "0.3.0"
```
Feature Flags
alloc
(enabled by default): ProvidesCow
,String
, andVec
conversions.std
(enabled by default): Providesstd::io::Read
andstd::error::Error
implementations.
For `no_std` environments without an allocator, use:

```toml
[dependencies]
json-escape = { version = "0.3.0", default-features = false }
```
## License

This project is licensed under either of

- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)

at your option.