compress-json-rs
AI-driven Rust port of the JavaScript compress-json library by Beenotung. Store JSON data in a space-efficient compressed form with lossless round-trip compression and decompression.
Table of Contents
- Features
- Installation
- Quick Start
- Usage Examples
- API Reference
- Special Values
- How It Works
- Architecture
- Compression Format
- Configuration
- Helper Functions
- Performance Considerations
- License
Features
- Full JSON Support: Objects, arrays, strings, numbers, booleans, and null
- Special Values Support: Infinity, -Infinity, and NaN with dedicated encodings (v3.2.0+)
- Value Deduplication: Repeated values stored once with reference keys
- Schema Deduplication: Objects with identical keys share schemas
- Compact Encoding: Numbers encoded in base-62 format
- Type Safety: Zero-copy round-trip using serde_json::Value
- UTF-8 Safe: Full Unicode support for strings
- No Dependencies on Disk/Network: Fast in-memory compression
- Cross-Platform Compatible: Data compatible with JavaScript and Python implementations
Installation
Add to your Cargo.toml:
[dependencies]
compress-json-rs = "0.1.0"
serde_json = "1.0"
Quick Start
use ;
use json;
Usage Examples
Basic Object Compression
use ;
use json;
let user = json!;
let = compress;
// The compressed form is a tuple of:
// - values: Vec<String> - deduplicated value store
// - root: String - key pointing to the root value
println!;
println!;
// Restore original
let restored = decompress;
assert_eq!;
Array with Repeated Objects
use ;
use json;
// Arrays of objects with similar schemas benefit most from compression
let data = json!;
let compressed = compress;
let restored = decompress;
assert_eq!;
Serialization for Storage/Transmission
use ;
use json;
let data = json!;
// Compress
let compressed = compress;
// Serialize to JSON string for storage
let json_str = to_string.unwrap;
println!;
// Later: deserialize and decompress
let loaded: Compressed = from_str.unwrap;
let restored = decompress;
assert_eq!;
Working with Files
use ;
use json;
use fs;
// Usage
let data = json!;
save_compressed.unwrap;
let restored = load_compressed.unwrap;
Using Helper Functions
use ;
use ;
// Remove null values from objects before compression
let mut data: = from_value.unwrap;
trim_undefined;
// data now only contains "name" and "age"
// Recursively remove nulls from nested objects
let mut nested: = from_value.unwrap;
trim_undefined_recursively;
API Reference
Core Functions
/// Compressed representation: (values array, root key)
pub type Compressed = (Vec<String>, Key);
/// Key type for value references
pub type Key = String;
/// Compress a JSON value into its compressed form
;
/// Decompress a compressed form back into JSON
;
/// Decode a single key from the values array
;
Lower-Level API
/// Memory structure for compression state
/// Create a new memory instance for compression
;
/// Add a value to memory, returns its reference key
;
/// Convert memory to the values array
;
Helper Functions
/// Remove keys with null values from an object (shallow)
;
/// Recursively remove keys with null values from nested objects
;
Configuration
/// Global configuration for compression behavior
pub const CONFIG: Config;
Special Values
Updated in v3.4.0 with preserve options
The library supports special floating-point values that are not part of the JSON specification.
Default Behavior (v3.4.0+)
By default, preserve_nan and preserve_infinite are false, so special values become null (like JSON.stringify):
| Value | Default (preserve_* = false) | With preserve_* = true |
|---|---|---|
| Infinity | null | N\|+ |
| -Infinity | null | N\|- |
| NaN | null | N\|0 |
Config Options
use CONFIG;
// Default configuration (v3.4.0+)
// preserve_nan: false - NaN becomes null
// preserve_infinite: false - Infinity becomes null
// error_on_nan: false - Don't panic on NaN
// error_on_infinite: false - Don't panic on Infinity
Decoding Special Values
When receiving data from JavaScript/Python implementations that have preserve_* enabled:
use ;
// Check if a value is special
assert!;
assert!;
assert!;
// Decode special values back to f64
let inf = decode_special;
assert!;
let neg_inf = decode_special;
assert!;
let nan = decode_special;
assert!;
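As a library-independent illustration of the token mapping described above, the following sketch decodes the three special tokens into f64 values using only the standard library. The function name decode_special_token is hypothetical and is not the crate's API; only the tokens N|+, N|- and N|0 are taken from this document.

```rust
/// Hypothetical helper (not the crate's API): map a special-value token
/// from the compressed format to the f64 it stands for.
fn decode_special_token(token: &str) -> Option<f64> {
    match token {
        "N|+" => Some(f64::INFINITY),
        "N|-" => Some(f64::NEG_INFINITY),
        "N|0" => Some(f64::NAN),
        // Anything else is not a special-value encoding.
        _ => None,
    }
}
```

Returning Option keeps ordinary encoded values (for example a regular number entry) distinguishable from the three special tokens.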
JSON Compatibility Note
Since JSON doesn't natively support Infinity, -Infinity, or NaN, these values become null when decompressed to serde_json::Value. However, when preserve_* options are enabled, the compressed format preserves the original special values, enabling:
- Cross-platform data exchange with JavaScript and Python implementations
- Lossless storage of special values in the compressed form
- Re-encoding with preserved semantics
String Escaping
Strings that look like special value encodings are automatically escaped:
use ;
use json;
// String "N|+" is preserved as a string, not treated as Infinity
let data = json!;
let compressed = compress;
let restored = decompress;
assert_eq!;
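The escaping rule can be sketched without the crate. The version below assumes that any string starting with one of the reserved two-character prefixes from the Value Encoding Prefixes table gets an extra s| in front, and that decoding strips exactly one s|. The names encode_str, decode_str and RESERVED_PREFIXES are illustrative, not the library's API.

```rust
/// Prefixes assumed to be reserved for encoded values, taken from the
/// "Value Encoding Prefixes" table in this document.
const RESERVED_PREFIXES: [&str; 6] = ["b|", "n|", "N|", "s|", "a|", "o|"];

/// Hypothetical helper: escape a string that would otherwise be mistaken
/// for an encoded value by prepending "s|".
fn encode_str(s: &str) -> String {
    if RESERVED_PREFIXES.iter().any(|p| s.starts_with(p)) {
        format!("s|{}", s)
    } else {
        s.to_string()
    }
}

/// Inverse of `encode_str`: strip one leading "s|" if present.
fn decode_str(encoded: &str) -> &str {
    encoded.strip_prefix("s|").unwrap_or(encoded)
}
```

Because s| itself is treated as reserved, a string that already starts with s| is escaped a second time, so decoding always returns the original text.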
How It Works
The compression algorithm works by deduplicating values and encoding references using base-62 keys.
Compression Flow
flowchart TD
subgraph Input
A[JSON Value]
end
subgraph Compression Process
B[Create Memory Store]
C{Value Type?}
D[Encode Boolean]
E[Encode Number]
F[Encode String]
G[Process Array]
H[Process Object]
I[Check Value Cache]
J{Cached?}
K[Return Existing Key]
L[Generate New Key]
M[Store Value]
end
subgraph Output
N[Compressed Tuple]
O["(Vec<String>, Key)"]
end
A --> B
B --> C
C -->|bool| D
C -->|number| E
C -->|string| F
C -->|array| G
C -->|object| H
D & E & F --> I
G --> |each element| C
H --> |schema + values| C
I --> J
J -->|yes| K
J -->|no| L
L --> M
M --> K
K --> N
N --> O
Decompression Flow
flowchart TD
subgraph Input
A["Compressed (values, root)"]
end
subgraph Decompression Process
B[Parse Root Key]
C[Lookup Value in Store]
D{Value Prefix?}
E[Decode Boolean]
F[Decode Number]
G[Decode String]
H[Decode Array]
I[Decode Object]
J[Recursive Decode]
end
subgraph Output
K[JSON Value]
end
A --> B
B --> C
C --> D
D -->|"b|"| E
D -->|"n|"| F
D -->|"s|" or none| G
D -->|"a|"| H
D -->|"o|"| I
H --> J
I --> J
J --> C
E & F & G --> K
H & I --> K
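The prefix dispatch in the decompression flow can be approximated as a small classifier. This is a sketch under the assumption that the first two characters of an entry carry the type tag; the Kind enum and classify function are hypothetical names, and null handling is left out.

```rust
/// Hypothetical classification of an entry in the values array.
#[derive(Debug, PartialEq)]
enum Kind {
    Boolean,
    Number,
    Special,
    EscapedString,
    Array,
    Object,
    PlainString,
}

/// Decide which decoder an encoded entry would be routed to.
fn classify(encoded: &str) -> Kind {
    // `get(..2)` yields None for entries shorter than two bytes or when
    // byte 2 is not a character boundary; both fall through to PlainString.
    match encoded.get(..2) {
        Some("b|") => Kind::Boolean,
        Some("n|") => Kind::Number,
        Some("N|") => Kind::Special,
        Some("s|") => Kind::EscapedString,
        Some("a|") => Kind::Array,
        Some("o|") => Kind::Object,
        _ => Kind::PlainString,
    }
}
```

Arrays and objects would then be split on | and each reference decoded recursively, as the flowchart indicates.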
Architecture
Module Structure
graph TB
subgraph Public API
LIB[lib.rs]
end
subgraph Core Modules
CORE[core.rs<br/>compress/decompress]
MEM[memory.rs<br/>value storage]
ENC[encode.rs<br/>type encoding]
end
subgraph Support Modules
NUM[number.rs<br/>base-62 conversion]
BOOL[boolean.rs<br/>bool encoding]
HELP[helpers.rs<br/>utility functions]
CFG[config.rs<br/>configuration]
DBG[debug.rs<br/>error handling]
end
LIB --> CORE
LIB --> MEM
LIB --> HELP
LIB --> CFG
CORE --> MEM
CORE --> ENC
MEM --> ENC
MEM --> NUM
MEM --> CFG
MEM --> DBG
ENC --> NUM
ENC --> BOOL
Memory Structure
classDiagram
class Memory {
-Vec~String~ store
-HashMap~String, String~ value_cache
-HashMap~String, String~ schema_cache
-usize key_count
}
class Compressed {
+Vec~String~ values
+String root_key
}
Memory --> Compressed : produces
note for Memory "Stores encoded values with<br/>deduplication via caches"
note for Compressed "Final output format:<br/>(values, root)"
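To make the deduplication idea concrete, here is a minimal stand-in for the store and value cache from the diagram. It is a sketch, not the crate's Memory type: MemorySketch and its add method are hypothetical, the schema cache is omitted, and keys are written as decimal indices for brevity (these match base-62 for indices below 10).

```rust
use std::collections::HashMap;

/// Simplified stand-in for the store in the diagram (schema cache omitted).
struct MemorySketch {
    store: Vec<String>,
    value_cache: HashMap<String, String>,
}

impl MemorySketch {
    fn new() -> Self {
        MemorySketch {
            store: Vec::new(),
            value_cache: HashMap::new(),
        }
    }

    /// Return the key for `encoded`, storing the value only the first
    /// time it is seen.
    fn add(&mut self, encoded: &str) -> String {
        if let Some(key) = self.value_cache.get(encoded) {
            return key.clone();
        }
        // Assumption: a key is the insertion index. The real format writes
        // it in base-62; decimal is used here to keep the sketch short.
        let key = self.store.len().to_string();
        self.store.push(encoded.to_string());
        self.value_cache.insert(encoded.to_string(), key.clone());
        key
    }
}
```

Adding "Alice", "admin" and "Alice" again yields the keys "0", "1" and "0", and the store holds only two entries.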
Compression Format
Value Encoding Prefixes
| Prefix | Type | Example Encoded | Original Value |
|---|---|---|---|
| b\|T | Boolean true | b\|T | true |
| b\|F | Boolean false | b\|F | false |
| n\| | Number | n\|42.5 | 42.5 |
| N\|+ | Infinity | N\|+ | Infinity |
| N\|- | -Infinity | N\|- | -Infinity |
| N\|0 | NaN | N\|0 | NaN |
| s\| | Escaped string | s\|n\|123 | "n\|123" |
| a\| | Array | a\|0\|1\|2 | Array with refs 0,1,2 |
| o\| | Object | o\|0\|1\|2 | Object with schema ref |
| (none) | Plain string | hello | "hello" |
| "" or _ | Null | `` | null |
Key Encoding (Base-62)
Keys are encoded using base-62 for compact representation:
Characters: 0-9 A-Z a-z (62 total)
Examples:
0 -> "0"
9 -> "9"
10 -> "A"
35 -> "Z"
36 -> "a"
61 -> "z"
62 -> "10"
124 -> "20"
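The mapping above can be reproduced with a short standalone conversion. This is a sketch that assumes the digit order 0-9, A-Z, a-z listed here; to_base62 and from_base62 are illustrative names, not the crate's functions.

```rust
/// Digit alphabet in the order given above: 0-9, A-Z, a-z.
const ALPHABET: &[u8] = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/// Encode a non-negative integer as a base-62 key.
fn to_base62(mut n: u64) -> String {
    if n == 0 {
        return "0".to_string();
    }
    let mut digits = Vec::new();
    while n > 0 {
        // Collect the least significant digit first.
        digits.push(ALPHABET[(n % 62) as usize]);
        n /= 62;
    }
    digits.reverse();
    String::from_utf8(digits).expect("alphabet is ASCII")
}

/// Decode a base-62 key; returns None for empty input, unknown
/// characters, or overflow.
fn from_base62(s: &str) -> Option<u64> {
    if s.is_empty() {
        return None;
    }
    let mut n: u64 = 0;
    for b in s.bytes() {
        let digit = ALPHABET.iter().position(|&c| c == b)? as u64;
        n = n.checked_mul(62)?.checked_add(digit)?;
    }
    Some(n)
}
```

With this alphabet, to_base62(62) gives "10" and from_base62("20") gives Some(124), matching the examples listed above.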
Example Compression
graph LR
subgraph Original JSON
A["{
'name': 'Alice',
'role': 'admin'
}"]
end
subgraph Compressed Values Array
B["0: 'name,role' (schema)
1: 'Alice'
2: 'admin'
3: 'o|0|1|2' (object)"]
end
subgraph Compressed Output
C["(['name,role', 'Alice',
'admin', 'o|0|1|2'], '3')"]
end
A --> B
B --> C
Schema Sharing Example
graph TD
subgraph "Input: Array of Objects"
A["[
{ id: 1, type: 'A' },
{ id: 2, type: 'B' },
{ id: 3, type: 'A' }
]"]
end
subgraph "Compressed Values"
B["0: 'a|id,type' // shared schema
1: 'n|1'
2: 'A'
3: 'o|0|1|2' // obj 1
4: 'n|2'
5: 'B'
6: 'o|0|4|5' // obj 2
7: 'n|3'
8: 'o|0|7|2' // obj 3 (reuses 'A')
9: 'a|3|6|8' // root array"]
end
subgraph Benefits
C["✓ Schema 'id,type' stored once
✓ Value 'A' stored once
✓ Minimal storage for repetitive data"]
end
A --> B
B --> C
Configuration
The library uses a compile-time configuration:
pub const CONFIG: Config = Config ;
Behavior Notes (v3.4.0+)
- NaN and Infinity handling depends on config options:

| Value | preserve_* = true | preserve_* = false, error_* = true | Both false (default) |
|---|---|---|---|
| NaN | Encoded as N\|0 | Panic | Becomes null |
| Infinity | Encoded as N\|+ | Panic | Becomes null |
| -Infinity | Encoded as N\|- | Panic | Becomes null |

- Key Order: Object keys maintain insertion order unless sort_key is enabled
- Unicode: Full UTF-8 support for all string values
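The NaN and Infinity rules in the behavior notes can be read as a small decision function. The sketch below is one possible reading, with two assumptions: the preserve option wins when both preserve and error are set, and the "Panic" case is returned as an Err instead of panicking. SketchConfig and encode_non_finite are hypothetical names; the crate's real Config may have more fields.

```rust
/// Hypothetical mirror of the four options named in this document.
struct SketchConfig {
    preserve_nan: bool,
    preserve_infinite: bool,
    error_on_nan: bool,
    error_on_infinite: bool,
}

/// Decide how a non-finite f64 is handled.
/// Ok(Some(token)) = preserved, Ok(None) = becomes null,
/// Err(..) = the case listed as "Panic" (reported as an error here).
fn encode_non_finite(x: f64, cfg: &SketchConfig) -> Result<Option<&'static str>, String> {
    let (preserve, error, token) = if x.is_nan() {
        (cfg.preserve_nan, cfg.error_on_nan, "N|0")
    } else if x == f64::INFINITY {
        (cfg.preserve_infinite, cfg.error_on_infinite, "N|+")
    } else if x == f64::NEG_INFINITY {
        (cfg.preserve_infinite, cfg.error_on_infinite, "N|-")
    } else {
        return Err(format!("{} is finite", x));
    };
    if preserve {
        Ok(Some(token))
    } else if error {
        Err(format!("unsupported value: {}", x))
    } else {
        Ok(None)
    }
}
```

With every option set to false, which is the documented default, each special value maps to Ok(None), i.e. null.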
Helper Functions
trim_undefined
Removes keys with null values from an object (shallow operation):
use trim_undefined;
use ;
let mut obj: = from_value.unwrap;
trim_undefined;
// obj = { "a": 1, "c": 3 }
trim_undefined_recursively
Removes null values from nested objects:
use trim_undefined_recursively;
use ;
let mut obj: = from_value.unwrap;
trim_undefined_recursively;
// obj = { "user": { "name": "Alice" } }
Performance Considerations
Best Use Cases
graph LR
subgraph "High Compression Ratio"
A[Arrays of similar objects]
B[Repeated string values]
C[Nested objects with shared schemas]
end
subgraph "Lower Compression Ratio"
D[Unique primitive values]
E[Deeply nested unique data]
F[Large binary-like strings]
end
A --> G[Excellent]
B --> G
C --> G
D --> H[Moderate]
E --> H
F --> H
Memory Usage
- Compression builds an in-memory store with hash maps for deduplication
- For very large JSON documents, consider streaming or chunked processing
- The compressed format itself is typically 30-70% smaller for repetitive data
Compression Ratio Examples
| Data Type | Typical Ratio |
|---|---|
| API response arrays | 40-60% of original |
| Configuration files | 50-70% of original |
| Unique data | 90-100% of original |
| Highly repetitive | 20-40% of original |
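The percentages in this table are the compressed size relative to the original size. A minimal way to measure this for your own data, assuming both forms have already been serialized to strings (for example with serde_json::to_string), is sketched below; ratio_percent is a hypothetical helper, not part of the crate.

```rust
/// Size of `compressed` as a percentage of `original`, measured in bytes.
/// Returns None when `original` is empty, since the ratio is undefined.
fn ratio_percent(original: &str, compressed: &str) -> Option<f64> {
    if original.is_empty() {
        return None;
    }
    Some(compressed.len() as f64 * 100.0 / original.len() as f64)
}
```

A result of 40.0 means the compressed text takes 40% of the original byte count, which falls in the "API response arrays" row above.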
Testing
Run the test suite with cargo test.
The library includes comprehensive tests covering:
- Number encoding edge cases
- Special values (Infinity, -Infinity, NaN)
- Unicode string handling
- Empty objects and arrays
- Null value handling
- Deeply nested structures
- Schema deduplication
License
Licensed under the BSD-2-Clause license. See LICENSE for details.
Compatibility
This library is compatible with compress-json v3.4.0+:
| Feature | JavaScript | Python | Rust |
|---|---|---|---|
| Basic types | ✅ | ✅ | ✅ |
| preserve_nan | ✅ v3.4.0+ | ✅ v3.4.0+ | ✅ |
| preserve_infinite | ✅ v3.4.0+ | ✅ v3.4.0+ | ✅ |
| error_on_nan | ✅ | ✅ | ✅ |
| error_on_infinite | ✅ | ✅ | ✅ |
| Schema dedup | ✅ | ✅ | ✅ |
| Value dedup | ✅ | ✅ | ✅ |
Data compressed with any implementation can be decompressed by any other.
Related Projects
- compress-json - Original TypeScript implementation (v3.4.0+)
- compress-json Python - Python implementation
- serde_json - JSON serialization framework for Rust