# compress-json-rs
[](https://crates.io/crates/compress-json-rs)
[](https://docs.rs/compress-json-rs)
[](LICENSE)
AI-driven Rust port of the JavaScript [compress-json](https://github.com/beenotung/compress-json) library by Beenotung.
Store JSON data in a space-efficient compressed form with lossless round-trip compression and decompression.
## Table of Contents
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Usage Examples](#usage-examples)
- [API Reference](#api-reference)
- [Special Values](#special-values)
- [How It Works](#how-it-works)
- [Architecture](#architecture)
- [Compression Format](#compression-format)
- [Configuration](#configuration)
- [Helper Functions](#helper-functions)
- [Performance Considerations](#performance-considerations)
- [License](#license)
## Features
- **Full JSON Support**: Objects, arrays, strings, numbers, booleans, and null
- **Special Values Support**: Infinity, -Infinity, and NaN with dedicated encodings (v3.2.0+)
- **Value Deduplication**: Repeated values stored once with reference keys
- **Schema Deduplication**: Objects with identical keys share schemas
- **Compact Encoding**: Numbers encoded in base-62 format
- **Type Safety**: Zero-copy round-trip using `serde_json::Value`
- **UTF-8 Safe**: Full Unicode support for strings
- **No Dependencies on Disk/Network**: Fast in-memory compression
- **Cross-Platform Compatible**: Data compatible with JavaScript and Python implementations
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
compress-json-rs = "0.1.0"
serde_json = "1.0"
```
## Quick Start
```rust
use compress_json_rs::{compress, decompress};
use serde_json::json;
fn main() {
// Original JSON data
let data = json!({
"user": "Alice",
"active": true,
"roles": ["admin", "user"]
});
// Compress
let compressed = compress(&data);
// Decompress
let restored = decompress(compressed);
assert_eq!(data, restored);
}
```
## Usage Examples
### Basic Object Compression
```rust
use compress_json_rs::{compress, decompress};
use serde_json::json;
let user = json!({
"id": 12345,
"name": "Alice",
"email": "alice@example.com",
"settings": {
"theme": "dark",
"notifications": true
}
});
let (values, root) = compress(&user);
// The compressed form is a tuple of:
// - values: Vec<String> - deduplicated value store
// - root: String - key pointing to the root value
println!("Values: {:?}", values);
println!("Root key: {}", root);
// Restore original
let restored = decompress((values, root));
assert_eq!(user, restored);
```
### Array with Repeated Objects
```rust
use compress_json_rs::{compress, decompress};
use serde_json::json;
// Arrays of objects with similar schemas benefit most from compression
let data = json!([
{ "type": "fruit", "name": "apple", "color": "red" },
{ "type": "fruit", "name": "banana", "color": "yellow" },
{ "type": "fruit", "name": "cherry", "color": "red" },
]);
let compressed = compress(&data);
let restored = decompress(compressed);
assert_eq!(data, restored);
```
### Serialization for Storage/Transmission
```rust
use compress_json_rs::{compress, decompress, Compressed};
use serde_json::json;
let data = json!({
"items": [
{ "id": 1, "status": "active" },
{ "id": 2, "status": "active" },
{ "id": 3, "status": "pending" }
]
});
// Compress
let compressed = compress(&data);
// Serialize to JSON string for storage
let json_str = serde_json::to_string(&compressed).unwrap();
println!("Compressed JSON: {}", json_str);
// Later: deserialize and decompress
let loaded: Compressed = serde_json::from_str(&json_str).unwrap();
let restored = decompress(loaded);
assert_eq!(data, restored);
```
### Working with Files
```rust
use compress_json_rs::{compress, decompress, Compressed};
use serde_json::json;
use std::fs;
fn save_compressed(data: &serde_json::Value, path: &str) -> std::io::Result<()> {
let compressed = compress(data);
let json = serde_json::to_string(&compressed)?;
fs::write(path, json)
}
fn load_compressed(path: &str) -> std::io::Result<serde_json::Value> {
let json = fs::read_to_string(path)?;
let compressed: Compressed = serde_json::from_str(&json)?;
Ok(decompress(compressed))
}
// Usage
let data = json!({"key": "value"});
save_compressed(&data, "data.compressed.json").unwrap();
let restored = load_compressed("data.compressed.json").unwrap();
```
### Using Helper Functions
```rust
use compress_json_rs::{trim_undefined, trim_undefined_recursively};
use serde_json::{json, Map, Value};
// Remove null values from objects before compression
let mut data: Map<String, Value> = serde_json::from_value(json!({
"name": "Alice",
"middleName": null, // Will be removed
"age": 30
})).unwrap();
trim_undefined(&mut data);
// data now only contains "name" and "age"
// Recursively remove nulls from nested objects
let mut nested: Map<String, Value> = serde_json::from_value(json!({
"user": {
"name": "Bob",
"extra": null // Will be removed
},
"meta": null // Will be removed
})).unwrap();
trim_undefined_recursively(&mut nested);
```
## API Reference
### Core Functions
```rust
/// Compressed representation: (values array, root key)
pub type Compressed = (Vec<String>, String);
/// Key type for value references
pub type Key = String;
/// Compress a JSON value into its compressed form
pub fn compress(o: &serde_json::Value) -> Compressed;
/// Decompress a compressed form back into JSON
pub fn decompress(c: Compressed) -> serde_json::Value;
/// Decode a single key from the values array
pub fn decode(values: &Vec<String>, key: &str) -> serde_json::Value;
```
### Lower-Level API
```rust
/// Memory structure for compression state
pub struct Memory { /* internal */ }
/// Create a new memory instance for compression
pub fn make_memory() -> Memory;
/// Add a value to memory, returns its reference key
pub fn add_value(mem: &mut Memory, o: &serde_json::Value) -> Key;
/// Convert memory to the values array
pub fn mem_to_values(mem: &Memory) -> Vec<String>;
```
### Helper Functions
```rust
/// Remove keys with null values from an object (shallow)
pub fn trim_undefined(object: &mut Map<String, Value>);
/// Recursively remove keys with null values from nested objects
pub fn trim_undefined_recursively(object: &mut Map<String, Value>);
```
### Configuration
```rust
/// Global configuration for compression behavior
pub struct Config {
/// Whether to sort object keys (default: false)
pub sort_key: bool,
/// Whether to error on NaN values (default: false)
pub error_on_nan: bool,
/// Whether to error on infinite values (default: false)
pub error_on_infinite: bool,
}
pub const CONFIG: Config;
```
## Special Values
*Updated in v3.4.0 with preserve options*
The library supports special floating-point values that are not part of the JSON specification.
### Default Behavior (v3.4.0+)
By default, `preserve_nan` and `preserve_infinite` are `false`, so special values become `null` (like `JSON.stringify`):
| `Infinity` | `null` | `N\|+` |
| `-Infinity` | `null` | `N\|-` |
| `NaN` | `null` | `N\|0` |
### Config Options
```rust
use compress_json_rs::CONFIG;
// Default configuration (v3.4.0+)
// preserve_nan: false - NaN becomes null
// preserve_infinite: false - Infinity becomes null
// error_on_nan: false - Don't panic on NaN
// error_on_infinite: false - Don't panic on Infinity
```
### Decoding Special Values
When receiving data from JavaScript/Python implementations that have `preserve_*` enabled:
```rust
use compress_json_rs::{decode_special, is_special_value};
// Check if a value is special
assert!(is_special_value("N|0"));
// Decode special values back to f64
let inf = decode_special("N|+");
assert!(inf.is_infinite() && inf.is_sign_positive());
let nan = decode_special("N|0");
assert!(nan.is_nan());
```
### JSON Compatibility Note
Since JSON doesn't natively support `Infinity`, `-Infinity`, or `NaN`, these values become `null` when decompressed to `serde_json::Value`. However, when `preserve_*` options are enabled, the **compressed format preserves** the original special values, enabling:
- Cross-platform data exchange with JavaScript and Python implementations
- Lossless storage of special values in the compressed form
- Re-encoding with preserved semantics
### String Escaping
Strings that look like special value encodings are automatically escaped:
```rust
use compress_json_rs::{compress, decompress};
use serde_json::json;
let compressed = compress(&data);
let restored = decompress(compressed);
assert_eq!(restored["value"], "N|+");
```
## How It Works
The compression algorithm works by **deduplicating values** and **encoding references** using base-62 keys.
### Compression Flow
```mermaid
flowchart TD
subgraph Input
A[JSON Value]
end
subgraph Compression Process
B[Create Memory Store]
C{Value Type?}
D[Encode Boolean]
E[Encode Number]
F[Encode String]
G[Process Array]
H[Process Object]
I[Check Value Cache]
J{Cached?}
K[Return Existing Key]
L[Generate New Key]
M[Store Value]
end
subgraph Output
N[Compressed Tuple]
O["(Vec<String>, Key)"]
end
A --> B
B --> C
C -->|bool| D
C -->|number| E
C -->|string| F
C -->|array| G
C -->|object| H
D & E & F --> I
G --> |each element| C
H --> |schema + values| C
I --> J
J -->|yes| K
J -->|no| L
L --> M
M --> K
K --> N
N --> O
```
### Decompression Flow
```mermaid
flowchart TD
subgraph Input
A["Compressed (values, root)"]
end
subgraph Decompression Process
B[Parse Root Key]
C[Lookup Value in Store]
D{Value Prefix?}
E[Decode Boolean]
F[Decode Number]
G[Decode String]
H[Decode Array]
I[Decode Object]
J[Recursive Decode]
end
subgraph Output
K[JSON Value]
end
A --> B
B --> C
C --> D
D -->|"b|"| E
D -->|"n|"| F
D -->|"s|" or none| G
D -->|"a|"| H
D -->|"o|"| I
H --> J
I --> J
J --> C
E & F & G --> K
H & I --> K
```
## Architecture
### Module Structure
```mermaid
graph TB
subgraph Public API
LIB[lib.rs]
end
subgraph Core Modules
CORE[core.rs<br/>compress/decompress]
MEM[memory.rs<br/>value storage]
ENC[encode.rs<br/>type encoding]
end
subgraph Support Modules
NUM[number.rs<br/>base-62 conversion]
BOOL[boolean.rs<br/>bool encoding]
HELP[helpers.rs<br/>utility functions]
CFG[config.rs<br/>configuration]
DBG[debug.rs<br/>error handling]
end
LIB --> CORE
LIB --> MEM
LIB --> HELP
LIB --> CFG
CORE --> MEM
CORE --> ENC
MEM --> ENC
MEM --> NUM
MEM --> CFG
MEM --> DBG
ENC --> NUM
ENC --> BOOL
```
### Memory Structure
```mermaid
classDiagram
class Memory {
-Vec~String~ store
-HashMap~String, String~ value_cache
-HashMap~String, String~ schema_cache
-usize key_count
}
class Compressed {
+Vec~String~ values
+String root_key
}
Memory --> Compressed : produces
note for Memory "Stores encoded values with<br/>deduplication via caches"
note for Compressed "Final output format:<br/>(values, root)"
```
## Compression Format
### Value Encoding Prefixes
| `b\|T` | Boolean true | `b\|T` | `true` |
| `b\|F` | Boolean false | `b\|F` | `false` |
| `n\|` | Number | `n\|42.5` | `42.5` |
| `N\|+` | Infinity | `N\|+` | `Infinity` |
| `N\|-` | -Infinity | `N\|-` | `-Infinity` |
| `N\|0` | NaN | `N\|0` | `NaN` |
| `s\|` | Escaped string | `s\|n\|123` | `"n\|123"` |
| `a\|` | Array | `a\|0\|1\|2` | Array with refs 0,1,2 |
| `o\|` | Object | `o\|0\|1\|2` | Object with schema ref |
| (none) | Plain string | `hello` | `"hello"` |
| `""` or `_` | Null | `` | `null` |
### Key Encoding (Base-62)
Keys are encoded using base-62 for compact representation:
```
Characters: 0-9 A-Z a-z (62 total)
Examples:
0 -> "0"
9 -> "9"
10 -> "A"
35 -> "Z"
36 -> "a"
61 -> "z"
62 -> "10"
124 -> "20"
```
### Example Compression
```mermaid
graph LR
subgraph Original JSON
A["{
'name': 'Alice',
'role': 'admin'
}"]
end
subgraph Compressed Values Array
B["0: 'name,role' (schema)
1: 'Alice'
2: 'admin'
subgraph Compressed Output
C["(['name,role', 'Alice',
A --> B
B --> C
```
### Schema Sharing Example
```mermaid
graph TD
subgraph "Input: Array of Objects"
A["[
{ id: 1, type: 'A' },
{ id: 2, type: 'B' },
{ id: 3, type: 'A' }
]"]
end
subgraph "Compressed Values"
B["0: 'a|id,type' // shared schema
3: 'o|0|1|2' // obj 1
4: 'n|2'
5: 'B'
6: 'o|0|4|5' // obj 2
7: 'n|3'
8: 'o|0|7|2' // obj 3 (reuses 'A')
9: 'a|3|6|8' // root array"]
end
subgraph Benefits
C["✓ Schema 'id,type' stored once
✓ Value 'A' stored once
✓ Minimal storage for repetitive data"]
end
A --> B
B --> C
```
## Configuration
The library uses a compile-time configuration:
```rust
pub const CONFIG: Config = Config {
sort_key: false, // Don't sort object keys
preserve_nan: false, // Convert NaN to null (like JSON.stringify)
error_on_nan: false, // Don't panic on NaN (only when preserve_nan=false)
preserve_infinite: false, // Convert Infinity to null (like JSON.stringify)
error_on_infinite: false, // Don't panic on Infinity (only when preserve_infinite=false)
};
```
### Behavior Notes (v3.4.0+)
- **NaN and Infinity handling** depends on config options:
| Value | `preserve_*` = true | `preserve_*` = false, `error_*` = true | Both false (default) |
|-------|---------------------|----------------------------------------|----------------------|
| NaN | Encoded as `N\|0` | Panic | Becomes `null` |
| Infinity | Encoded as `N\|+` | Panic | Becomes `null` |
| -Infinity | Encoded as `N\|-` | Panic | Becomes `null` |
- **Key Order**: Object keys maintain insertion order unless `sort_key` is enabled
- **Unicode**: Full UTF-8 support for all string values
## Helper Functions
### trim_undefined
Removes keys with null values from an object (shallow operation):
```rust
use compress_json_rs::trim_undefined;
use serde_json::{json, Map, Value};
let mut obj: Map<String, Value> = serde_json::from_value(json!({
"a": 1,
"b": null,
"c": 3
})).unwrap();
trim_undefined(&mut obj);
// obj = { "a": 1, "c": 3 }
```
### trim_undefined_recursively
Removes null values from nested objects:
```rust
use compress_json_rs::trim_undefined_recursively;
use serde_json::{json, Map, Value};
let mut obj: Map<String, Value> = serde_json::from_value(json!({
"user": {
"name": "Alice",
"middleName": null
},
"extra": null
})).unwrap();
trim_undefined_recursively(&mut obj);
// obj = { "user": { "name": "Alice" } }
```
## Performance Considerations
### Best Use Cases
```mermaid
graph LR
subgraph "High Compression Ratio"
A[Arrays of similar objects]
B[Repeated string values]
C[Nested objects with shared schemas]
end
subgraph "Lower Compression Ratio"
D[Unique primitive values]
E[Deeply nested unique data]
F[Large binary-like strings]
end
A --> G[Excellent]
B --> G
C --> G
D --> H[Moderate]
E --> H
F --> H
```
### Memory Usage
- Compression builds an in-memory store with hash maps for deduplication
- For very large JSON documents, consider streaming or chunked processing
- The compressed format itself is typically 30-70% smaller for repetitive data
### Compression Ratio Examples
| API response arrays | 40-60% of original |
| Configuration files | 50-70% of original |
| Unique data | 90-100% of original |
| Highly repetitive | 20-40% of original |
## Testing
Run the test suite:
```bash
cargo test
```
The library includes comprehensive tests covering:
- Number encoding edge cases
- Special values (Infinity, -Infinity, NaN)
- Unicode string handling
- Empty objects and arrays
- Null value handling
- Deeply nested structures
- Schema deduplication
## License
Licensed under the BSD-2-Clause license. See [LICENSE](LICENSE) for details.
---
## Compatibility
This library is compatible with [compress-json](https://github.com/beenotung/compress-json) v3.4.0+:
| Basic types | ✅ | ✅ | ✅ |
| `preserve_nan` | ✅ v3.4.0+ | ✅ v3.4.0+ | ✅ |
| `preserve_infinite` | ✅ v3.4.0+ | ✅ v3.4.0+ | ✅ |
| `error_on_nan` | ✅ | ✅ | ✅ |
| `error_on_infinite` | ✅ | ✅ | ✅ |
| Schema dedup | ✅ | ✅ | ✅ |
| Value dedup | ✅ | ✅ | ✅ |
Data compressed with any implementation can be decompressed by any other.
## Related Projects
- [compress-json](https://github.com/beenotung/compress-json) - Original TypeScript implementation (v3.4.0+)
- [compress-json Python](https://github.com/beenotung/compress-json/tree/main/python) - Python implementation
- [serde_json](https://github.com/serde-rs/json) - JSON serialization framework for Rust