datacortex-core
Core compression library for DataCortex. Lossless, format-aware compression for JSON and NDJSON data.
Outperforms zstd at level 19 and brotli at quality 11 on every JSON file tested (+4% to +113%).
Usage
use ;
// Compress JSON/NDJSON bytes
let compressed = compress?;
// Decompress (byte-exact lossless)
let original = decompress?;
assert_eq!;
Features
- Auto-detection of JSON, NDJSON, and generic text formats
- Schema inference with 8 column types (integer, boolean, timestamp, enum, string, float, UUID, null)
- Columnar reorganization for NDJSON with schema-based grouping
- Type-specific encoding (delta-varint, bitmaps, epoch deltas, frequency-sorted dictionaries)
- Auto-fallback across 6+ compression paths (zstd, brotli, with/without preprocessing)
- Custom dictionary training for known schemas
- Parallel compression via rayon (Fast mode)
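To give a feel for the type-specific encodings listed above, here is a minimal sketch of delta-varint encoding for an integer column. It is an illustration only, not the crate's implementation: the function names, the zigzag mapping for negative deltas, and the LEB128-style varint layout are all assumptions, and the decoder expects well-formed input.

```rust
/// Map a signed delta to an unsigned value so small negatives stay small.
/// (Assumed helper, not part of the datacortex-core API.)
fn zigzag(v: i64) -> u64 {
    ((v << 1) ^ (v >> 63)) as u64
}

/// Inverse of `zigzag`.
fn unzigzag(v: u64) -> i64 {
    ((v >> 1) as i64) ^ -((v & 1) as i64)
}

/// Encode a column as zigzag-mapped deltas, each written as a
/// little-endian base-128 varint (7 payload bits per byte).
fn delta_varint_encode(values: &[i64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0i64;
    for &v in values {
        let mut z = zigzag(v.wrapping_sub(prev));
        prev = v;
        loop {
            let byte = (z & 0x7f) as u8;
            z >>= 7;
            if z == 0 {
                out.push(byte); // last byte: continuation bit clear
                break;
            }
            out.push(byte | 0x80); // more bytes follow
        }
    }
    out
}

/// Decode bytes produced by `delta_varint_encode`.
/// Assumes well-formed input (no varint longer than 10 bytes).
fn delta_varint_decode(bytes: &[u8]) -> Vec<i64> {
    let mut out = Vec::new();
    let mut prev = 0i64;
    let mut acc = 0u64;
    let mut shift = 0u32;
    for &b in bytes {
        acc |= ((b & 0x7f) as u64) << shift;
        if b & 0x80 == 0 {
            prev = prev.wrapping_add(unzigzag(acc));
            out.push(prev);
            acc = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    out
}
```

Slowly changing columns such as sequential IDs or timestamps produce small deltas, so most values fit in a single byte before the general-purpose compressor runs.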
Compression Modes
- Fast (default): columnar + typed encoding + zstd/brotli auto-fallback. Best for JSON/NDJSON.
- Balanced: context mixing engine with 13 specialized models. Better on general text.
- Max: same as Balanced with larger context maps.
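The auto-fallback idea in Fast mode can be pictured as "run every candidate path, keep the smallest output". The sketch below shows that selection step only. The `Path` type, `smallest_path`, and the two toy encoders (`store` and `rle`) are hypothetical stand-ins for the real zstd and brotli paths, not the crate's API.

```rust
/// A candidate path: a label plus an encoder function. (Hypothetical type.)
type Path = (&'static str, fn(&[u8]) -> Vec<u8>);

/// Stand-in encoder: store the bytes unchanged.
fn store(input: &[u8]) -> Vec<u8> {
    input.to_vec()
}

/// Stand-in encoder: byte-level run-length encoding as (count, byte) pairs,
/// with runs capped at 255.
fn rle(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let b = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == b && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(b);
        i += run;
    }
    out
}

/// Run every path and keep the smallest output.
/// Ties go to the earlier path; returns `None` if `paths` is empty.
fn smallest_path(input: &[u8], paths: &[Path]) -> Option<(&'static str, Vec<u8>)> {
    paths
        .iter()
        .map(|&(name, encode)| (name, encode(input)))
        .min_by_key(|(_, out)| out.len())
}
```

A real container would also need to record which path won so the decompressor can reverse it; that bookkeeping is omitted here.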
License
MIT