succinctly
High-performance succinct data structures for Rust.
Succinctly provides space-efficient data structures with fast rank and select operations, optimized for both x86_64 (with AVX2/AVX-512) and ARM (NEON) architectures. The library is no_std compatible and designed for high-throughput applications.
Features
- Bitvector with O(1) rank and O(log n) select - Poppy-style 3-level directory with ~3% space overhead
- Balanced parentheses for tree navigation - RangeMin structure with O(1) operations and ~6% overhead
- JSON semi-indexing with SIMD acceleration - Up to 950 MiB/s throughput on x86_64 (AMD Zen 4) with table-driven PFSM parser
- YAML semi-indexing - Complete YAML 1.2 parser with anchor/alias resolution (~250-400 MiB/s)
- DSV/CSV semi-indexing - High-performance CSV/TSV parsing (85-1676 MiB/s) with BMI2 acceleration
- jq/yq-style query expressions - Navigate JSON and YAML without full parsing
no_stdcompatible - Works in embedded and WASM environments- Cross-platform SIMD - Runtime detection for AVX2, AVX-512, SSE4.2, and ARM NEON
Installation
Add to your Cargo.toml:
[]
= "0.2"
Or with cargo:
Quick Start
Bitvector with Rank/Select
use ;
// Create a bitvector from u64 words
let words = vec!;
let bv = from_words;
// Query rank (count of 1-bits in [0, i))
assert_eq!;
// Query select (position of k-th 1-bit, 0-indexed)
assert_eq!; // First 1-bit is at position 1
assert_eq!; // Fourth 1-bit is at position 7
Balanced Parentheses for Tree Navigation
use BalancedParens;
// Encode a tree as balanced parentheses: ((()())())
// In bits: 1=open, 0=close -> 1110100100
let bp = new;
// Find matching close parenthesis
assert_eq!; // Outermost pair
assert_eq!; // Second level
// Navigate the tree
assert_eq!; // First child of root
assert_eq!; // Sibling of first child
JSON Semi-Indexing
use ;
let json = br#"{"users": [{"name": "Alice"}, {"name": "Bob"}]}"#;
let index = build;
let root = index.root;
// Navigate without parsing the entire document
if let Object = root
jq-Style Queries
use ;
use ;
let json = br#"{"users": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}"#;
let index = build;
let cursor = index.root;
// Get all user names
let expr = parse.unwrap;
if let Many = eval
// Get first user's age
let expr = parse.unwrap;
if let One = eval
Performance
Comparison with Rust JSON Parsers
Benchmark comparing succinctly against popular Rust JSON libraries (x86_64, 1MB file):
| Library | Parse Time | Throughput | Peak Memory | vs succinctly |
|---|---|---|---|---|
| sonic-rs | 1.00 ms | 810 MiB/s | 9.97 MB | 1.6x faster, 26x more memory |
| succinctly | 1.58 ms | 510 MiB/s | 382 KB | baseline |
| serde_json | 4.83 ms | 167 MiB/s | 7.00 MB | 3.1x slower, 18x more memory |
| simd-json | 5.10 ms | 158 MiB/s | 17.1 MB | 3.2x slower, 46x more memory |
Key findings:
- 18-46x less memory than DOM parsers (consistent 46% overhead vs JSON size)
- Competitive parsing speed: Only 1.6x slower than fastest DOM parser
- 3-5x faster than serde_json and simd-json
See docs/benchmarks/rust-parsers.md for detailed benchmarks across all file sizes.
JSON Semi-Indexing
| Platform | Implementation | Throughput | Notes |
|---|---|---|---|
| x86_64 (AMD Ryzen 9 7950X) | PFSM (BMI2) | 679 MiB/s | Table-driven, fastest on x86 |
| AVX2 | 732 MiB/s | 32 bytes/iteration | |
| Scalar | 494 MiB/s | Baseline | |
| ARM (Apple M1 Max) | NEON | 574 MiB/s | 32 bytes/iteration |
| Scalar (PFSM) | 557 MiB/s | Table-driven | |
| Scalar | 405 MiB/s | Baseline |
Rank/Select Operations
| Platform | Rank (O(1)) | Select (O(log n)) |
|---|---|---|
| x86_64 (AMD Ryzen 9 7950X) | ~3 ns | ~50 ns |
| ARM (Apple M1 Max) | ~21 ns | ~320 ns |
jq Comparison: Identity Operation (.)
Comparison of succinctly jq . vs jq . for formatting/printing JSON files.
x86_64 (AMD Ryzen 9 7950X, 10MB files)
| Pattern | jq | succinctly | Speedup | jq Mem | succ Mem | Mem Ratio |
|---|---|---|---|---|---|---|
| nested | 147.8ms | 54.5ms | 2.7x | 22 MB | 26 MB | 1.16x |
| strings | 141.3ms | 59.0ms | 2.4x | 14 MB | 14 MB | 1.02x |
| numbers | 268.7ms | 127.3ms | 2.1x | 87 MB | 15 MB | 0.18x |
| pathological | 796.7ms | 410.2ms | 1.9x | 472 MB | 17 MB | 0.04x |
| comprehensive | 381.9ms | 220.9ms | 1.7x | 149 MB | 14 MB | 0.09x |
| users | 205.6ms | 125.4ms | 1.6x | 68 MB | 12 MB | 0.18x |
| unicode | 169.4ms | 113.0ms | 1.5x | 31 MB | 16 MB | 0.51x |
| arrays | 738.2ms | 388.9ms | 1.9x | 367 MB | 17 MB | 0.05x |
| literals | 279.5ms | 228.7ms | 1.2x | 50 MB | 16 MB | 0.33x |
ARM (Apple M1 Max, 10MB files)
| Pattern | jq | succinctly | Speedup | jq Mem | succ Mem | Mem Ratio |
|---|---|---|---|---|---|---|
| nested | 347.2ms | 54.8ms | 6.3x | 25 MB | 29 MB | 1.17x |
| strings | 331.0ms | 75.2ms | 4.4x | 16 MB | 17 MB | 1.07x |
| pathological | 1.35s | 677.0ms | 2.0x | 526 MB | 20 MB | 0.04x |
| unicode | 334.8ms | 157.1ms | 2.1x | 41 MB | 19 MB | 0.46x |
| users | 413.7ms | 196.8ms | 2.1x | 70 MB | 15 MB | 0.21x |
| comprehensive | 692.4ms | 364.7ms | 1.9x | 135 MB | 17 MB | 0.13x |
| numbers | 370.7ms | 213.8ms | 1.7x | 97 MB | 18 MB | 0.19x |
| arrays | 1.07s | 700.8ms | 1.5x | 368 MB | 20 MB | 0.05x |
| literals | 510.9ms | 397.1ms | 1.3x | 103 MB | 19 MB | 0.19x |
Key findings:
- 1.2-6.3x faster across all patterns
- 5-25x less memory on most patterns due to streaming lazy evaluation
- Best speedup on nested/string-heavy data
Platform-Specific Optimizations
| Platform | Operation | Throughput | Speedup |
|---|---|---|---|
| x86_64 | Popcount (AVX-512 VPOPCNTDQ) | 96.8 GiB/s | 5.2x vs scalar |
| ARM | NEON JSON (string-heavy) | 3.7 GiB/s | 1.69x vs scalar |
See docs/benchmarks/rust-parsers.md, docs/benchmarks/jq.md, and docs/optimizations/history.md for detailed benchmarks.
Feature Flags
Popcount Strategies (mutually exclusive)
| Feature | Description |
|---|---|
| (default) | Uses Rust's count_ones() which auto-vectorizes |
simd. |
Explicit SIMD intrinsics (NEON on ARM, POPCNT on x86) |
portable-popcount |
Portable bitwise algorithm (no intrinsics) |
Other Features
| Feature | Description |
|---|---|
std |
Enable std library (default, required for runtime CPU detection) |
serde |
Enable serialization/deserialization support |
cli |
Build the CLI tool |
regex |
Enable regex support in jq queries |
Test Features
| Feature | Description |
|---|---|
large-tests |
Test with 1GB bitvectors (~125MB RAM) |
huge-tests |
Test with 5GB bitvectors (~625MB RAM) |
mmap-tests |
Memory-mapped file tests |
CLI Tool
The library includes CLI tools for JSON, YAML, and DSV operations:
# Build the CLI
# Generate synthetic JSON/YAML for benchmarking
# Query JSON files (jq-compatible)
# Query YAML files (yq-compatible)
# Query CSV/TSV files (converted to JSON on-the-fly)
# Output as CSV/TSV/DSV
# Find jq/yq expression for a position (useful for editor integration)
Architecture
Module Structure
succinctly
├── bits # Bitvector with rank/select
├── trees # Tree encodings (balanced parentheses)
├── json # JSON semi-indexing
├── yaml # YAML semi-indexing
├── dsv # DSV/CSV semi-indexing
└── jq # jq/yq query language evaluator
Core Data Structures
bits::BitVec- Bitvector with 3-level Poppy-style rank directory (~3% overhead) and sampled select index (~1-3% overhead)trees::BalancedParens- Hierarchical min-excess structure for O(1) tree navigation (~6% overhead)json::JsonIndex- Semi-index combining Interest Bits (IB) and Balanced Parentheses (BP) for fast JSON navigationyaml::YamlIndex- YAML semi-index with anchor/alias resolution and multi-document supportdsv::DsvIndex- Lightweight DSV semi-index with BMI2-accelerated quote masking
SIMD Support
The library uses runtime CPU feature detection to select the best implementation:
| Platform | Features Used |
|---|---|
| x86_64 | AVX2, AVX-512 VPOPCNTDQ, SSE4.2, SSE2 |
| aarch64 | NEON (mandatory) |
Documentation
Choose your path:
- New to succinctly? -> Getting Started
- Using the library? -> API Guide
- Using the CLI? -> CLI Guide
- Contributing? -> CONTRIBUTING.md
- Performance tuning? -> Optimization Techniques
- Understanding internals? -> Architecture
- Benchmarks? -> Performance Comparisons
- Full documentation map -> docs/
For AI-assisted development, see CLAUDE.md.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Before submitting a PR:
# Run tests
# Run clippy
# Format code
License
Licensed under the MIT license.
Acknowledgments
This library implements algorithms from:
- Vigna, "Broadword Implementation of Rank/Select Queries" (WEA 2008)
- Zhou, Andersen, Kaminsky, "Space-Efficient, High-Performance Rank & Select" (SEA 2013)
- Sadakane & Navarro, "Fully-Functional Succinct Trees" (SODA 2010)
- Langdale & Lemire, "Parsing Gigabytes of JSON per Second" (VLDB 2019)
- Muła, Kurz, Lemire, "Faster Population Counts Using AVX2 Instructions" (2016)
Related Work
- haskell-works - Haskell implementations of the same techniques (
hw-json,hw-json-simd,hw-rankselect,hw-balancedparens)