Crate reifydb_column

Columnar storage engine: the immutable, on-disk representation of materialized columns plus the read-time machinery (compute kernels, predicates, scans, selection vectors, snapshots) the engine uses to query them. This crate owns the bucket layout, the per-column compression and encoding schemes, and the registry that tracks which columns are present and at what version.

Read paths enter through this crate: they obtain a column reader and stream values through compute kernels that operate directly on the encoded bytes where possible, decoding only when a kernel cannot run on the encoded form. The snapshot type is what the subscription tier hands to consumers so they can iterate over a stable view of a column without racing against ongoing writes.
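To illustrate what running a kernel on the encoded form can look like, here is a minimal sketch of a sum kernel. The `Encoded` enum, its variants, and the `sum` and `decode` functions are hypothetical stand-ins invented for this example; they are not this crate's actual types or signatures.

```rust
/// Hypothetical encoded column, invented for illustration only.
pub enum Encoded {
    /// Dense, unencoded values.
    Canonical(Vec<i64>),
    /// One value repeated `len` times.
    Constant { value: i64, len: usize },
    /// (value, run length) pairs.
    RunLength(Vec<(i64, usize)>),
}

impl Encoded {
    /// Fallback path: expand the column into the dense layout.
    pub fn decode(&self) -> Vec<i64> {
        match self {
            Encoded::Canonical(values) => values.clone(),
            Encoded::Constant { value, len } => vec![*value; *len],
            Encoded::RunLength(runs) => runs
                .iter()
                .flat_map(|&(value, count)| std::iter::repeat(value).take(count))
                .collect(),
        }
    }
}

/// Sum kernel that never materializes the decoded values:
/// a run of `n` copies of `v` contributes `v * n` in one step.
pub fn sum(column: &Encoded) -> i64 {
    match column {
        Encoded::Canonical(values) => values.iter().sum(),
        Encoded::Constant { value, len } => *value * (*len as i64),
        Encoded::RunLength(runs) => runs
            .iter()
            .map(|&(value, count)| value * count as i64)
            .sum(),
    }
}
```

In this sketch the run-length arm does work proportional to the number of runs rather than the number of rows, which is the kind of saving that operating on the encoded form is meant to provide.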

Invariant: a column’s encoded bytes plus its stats and bitmap are produced together and never updated piecewise. Tearing those apart - rewriting just the values, just the bitmap, or just the stats - means readers can observe a column whose statistics no longer describe its contents, which silently corrupts every kernel that reads stats to skip work.
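One way to picture this invariant is a type whose values, bitmap, and stats can only be created in a single pass and are never exposed for mutation. The sketch below is an assumption-laden illustration: `Column`, `Stats`, `build`, `may_contain`, and `contains` are hypothetical names, and a plain `Vec<i64>` and `Vec<bool>` stand in for the encoded bytes and the bitmap.

```rust
/// Hypothetical per-column statistics (illustration only).
pub struct Stats {
    pub min: Option<i64>,
    pub max: Option<i64>,
    pub none_count: usize,
}

/// Hypothetical column: all fields are private and there are no setters,
/// so values, bitmap, and stats cannot drift apart after construction.
pub struct Column {
    values: Vec<i64>,   // stand-in for the encoded bytes
    defined: Vec<bool>, // stand-in for the bitmap
    stats: Stats,
}

impl Column {
    /// Produces values, bitmap, and stats together in one pass.
    pub fn build(input: &[Option<i64>]) -> Column {
        let mut values = Vec::with_capacity(input.len());
        let mut defined = Vec::with_capacity(input.len());
        let mut stats = Stats { min: None, max: None, none_count: 0 };
        for item in input {
            match *item {
                Some(v) => {
                    values.push(v);
                    defined.push(true);
                    stats.min = Some(stats.min.map_or(v, |m| m.min(v)));
                    stats.max = Some(stats.max.map_or(v, |m| m.max(v)));
                }
                None => {
                    values.push(0); // placeholder, masked out by the bitmap
                    defined.push(false);
                    stats.none_count += 1;
                }
            }
        }
        Column { values, defined, stats }
    }

    pub fn stats(&self) -> &Stats {
        &self.stats
    }

    /// Stats-only check: false means the scan can be skipped entirely.
    pub fn may_contain(&self, needle: i64) -> bool {
        match (self.stats.min, self.stats.max) {
            (Some(lo), Some(hi)) => lo <= needle && needle <= hi,
            _ => false, // no defined values at all
        }
    }

    /// Full check: consults stats first, then scans only defined values.
    pub fn contains(&self, needle: i64) -> bool {
        self.may_contain(needle)
            && self
                .values
                .iter()
                .zip(&self.defined)
                .any(|(&v, &is_defined)| is_defined && v == needle)
    }
}
```

If the values in such a column could be rewritten without recomputing `stats`, `may_contain` could return `false` for a value that is actually present, which is the silent corruption the invariant guards against.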

Modules

bucket
compress
compute
Compute kernels that operate on encoded columns. Compare, take, slice, filter, sum, search-sorted, min/max - the primitives the engine VM dispatches to when it executes the per-instruction work of a query. Kernels prefer to run directly on the encoded bytes (canonical layout, dictionary indices, run-length runs) and only decode when they cannot.
encoding
Per-column encoding implementations. Canonical is the dense unencoded layout; the compressed family covers all-none, bit-packed, constant, delta, delta-RLE, dictionary, frame-of-reference, run-length, and sparse forms. Each encoding produces and consumes the same encoded-bytes contract so compute kernels can be written once and work across encodings.
error
predicate
reader
registry
scan
selection
snapshot
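
The encoding module's description says each encoding honors one shared contract so that kernels can be written once. As a rough sketch of that idea, the example below uses a trait in place of the encoded-bytes contract and a single `take` kernel that works for two encodings. The `EncodedColumn` trait, the `Dictionary` and `FrameOfReference` structs, and the `take` signature are assumptions made for illustration, not the crate's real API.

```rust
/// Hypothetical shared contract that every encoding implements.
pub trait EncodedColumn {
    fn len(&self) -> usize;
    /// Returns the logical value at `index` without decoding the column.
    fn value_at(&self, index: usize) -> i64;
}

/// Dictionary encoding: each row stores an index into `dict`.
pub struct Dictionary {
    pub dict: Vec<i64>,
    pub indices: Vec<u32>,
}

/// Frame-of-reference encoding: each row stores a small offset from `base`.
pub struct FrameOfReference {
    pub base: i64,
    pub offsets: Vec<u16>,
}

impl EncodedColumn for Dictionary {
    fn len(&self) -> usize {
        self.indices.len()
    }
    fn value_at(&self, index: usize) -> i64 {
        self.dict[self.indices[index] as usize]
    }
}

impl EncodedColumn for FrameOfReference {
    fn len(&self) -> usize {
        self.offsets.len()
    }
    fn value_at(&self, index: usize) -> i64 {
        self.base + i64::from(self.offsets[index])
    }
}

/// A `take` kernel written once against the contract.
/// Panics if any position is out of bounds for the column.
pub fn take(column: &dyn EncodedColumn, positions: &[usize]) -> Vec<i64> {
    positions.iter().map(|&p| column.value_at(p)).collect()
}
```

For example, `take(&dictionary_column, &[0, 3])` and `take(&frame_of_reference_column, &[1, 3])` go through the same function body; only the `value_at` implementation differs per encoding.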