Columnar storage layer for frankenpandas — provides the
Column container that backs every DataFrame column and
Series value buffer in fp-frame.
A column is a typed value buffer (DType) plus a separate
ValidityMask tracking which cells are missing. This split
mirrors Apache Arrow’s storage layout and lets the type system
enforce correctness on the dense-value side while keeping
pandas-style missing-value semantics (NullKind::Null,
NullKind::NaN, NullKind::NaT) on the validity side.
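The value/validity split described above can be illustrated with a small sketch. This is not the crate's implementation: `ToyColumn`, `DenseValues`, and the `Vec<bool>` mask are hypothetical stand-ins chosen for readability (a real validity mask would more likely be bit-packed, as in Arrow).

```rust
/// Hypothetical stand-in for a dense, typed value buffer.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DenseValues {
    Int64(Vec<i64>),
    Float64(Vec<f64>),
}

/// Toy column: dense values plus one validity flag per cell.
/// Assumption: `true` means "present", `false` means "missing".
#[derive(Debug, Clone, PartialEq)]
struct ToyColumn {
    values: DenseValues,
    valid: Vec<bool>,
}

impl ToyColumn {
    /// Build an Int64 column; missing cells get a placeholder 0 in the
    /// dense buffer and `false` in the mask.
    fn from_optional_i64(cells: &[Option<i64>]) -> Self {
        let values = cells.iter().map(|c| c.unwrap_or(0)).collect();
        let valid = cells.iter().map(|c| c.is_some()).collect();
        ToyColumn { values: DenseValues::Int64(values), valid }
    }

    /// Number of non-missing cells.
    fn count(&self) -> usize {
        self.valid.iter().filter(|ok| **ok).count()
    }

    /// Sum of the non-missing cells; `None` if the buffer is not Int64.
    fn sum_i64(&self) -> Option<i64> {
        match &self.values {
            DenseValues::Int64(v) => Some(
                v.iter()
                    .zip(&self.valid)
                    .filter(|(_, ok)| **ok)
                    .map(|(x, _)| *x)
                    .sum(),
            ),
            DenseValues::Float64(_) => None,
        }
    }
}
```

Because the dense buffer never has to encode "missing" itself, reductions only consult the mask to decide which cells to skip; for example, a toy column built from `[Some(1), None, Some(3)]` reports a count of 2 and a sum of 4.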
§Public surface
- Column: the public columnar container. Built from a DType + a Vec<Scalar>. Exposes value access (Column::value, Column::values), reductions (Column::sum, Column::mean, Column::count, the nan-aware aggregations from fp-types), and typed binary operations dispatched through ArithmeticOp/ComparisonOp.
- ColumnData: the inner enum holding the dense buffer. Most callers go through Column rather than touching this directly.
- SparseColumn: opt-in sparse encoding (paired value buffer + index-of-non-fill positions). Stored alongside the dense Column for backwards compat when consumers only need Column.
- ValidityMask: per-cell missing-value bitmap. Stored on Column; exposed for users that want to compose masks directly (logical masking, conditional updates, etc.).
- ArithmeticOp/ComparisonOp: enum tags for typed binary-op dispatch (used by fp-frame’s expression engine and Series arithmetic).
- CrackIndex: an internal positional index used by the “cracking” optimisation for repeated boolean-mask filters.
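As a rough illustration of what "enum tags for typed binary-op dispatch" can look like, the sketch below matches on an operation tag once and then runs a tight element-wise loop. `ToyArithmeticOp` and `apply_op` are hypothetical names; the variants and signatures of the real ArithmeticOp are not shown on this page.

```rust
/// Hypothetical operation tag; the real ArithmeticOp may have other variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyArithmeticOp {
    Add,
    Sub,
    Mul,
}

/// Apply `op` element-wise to two f64 buffers.
/// The tag is resolved once, outside the per-element loop.
/// Note: `zip` stops at the shorter input; a real column API would
/// presumably report a length mismatch instead.
fn apply_op(op: ToyArithmeticOp, left: &[f64], right: &[f64]) -> Vec<f64> {
    match op {
        ToyArithmeticOp::Add => left.iter().zip(right).map(|(a, b)| a + b).collect(),
        ToyArithmeticOp::Sub => left.iter().zip(right).map(|(a, b)| a - b).collect(),
        ToyArithmeticOp::Mul => left.iter().zip(right).map(|(a, b)| a * b).collect(),
    }
}
```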
§Error reporting
ColumnError enumerates the failure modes (length mismatch,
dtype mismatch, missing-value-in-required-slot, etc.). All
Column-mutating functions return Result<_, ColumnError>, so
callers can match on an explicit error category.
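A minimal sketch of this error-reporting style follows. `ToyColumnError` and `add_columns` are hypothetical; the actual ColumnError variants and their fields are not listed on this page, so only a length-mismatch case is modelled here.

```rust
use std::fmt;

/// Hypothetical error enum with one explicit failure category.
#[derive(Debug, PartialEq, Eq)]
enum ToyColumnError {
    LengthMismatch { left: usize, right: usize },
}

impl fmt::Display for ToyColumnError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyColumnError::LengthMismatch { left, right } => {
                write!(f, "length mismatch: left has {left} cells, right has {right}")
            }
        }
    }
}

impl std::error::Error for ToyColumnError {}

/// Element-wise addition that reports a mismatch instead of panicking
/// or silently truncating. Overflow wraps, purely to keep the sketch small.
fn add_columns(left: &[i64], right: &[i64]) -> Result<Vec<i64>, ToyColumnError> {
    if left.len() != right.len() {
        return Err(ToyColumnError::LengthMismatch {
            left: left.len(),
            right: right.len(),
        });
    }
    Ok(left.iter().zip(right).map(|(a, b)| a.wrapping_add(*b)).collect())
}
```

Callers can then branch on the variant rather than parsing a message string.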
§Relationship to other crates
- fp-types supplies the DType/Scalar/NullKind/nan* reduction primitives this crate composes on top of.
- fp-frame stores a Vec<Column> per DataFrame (one column per data column) plus a separate Index from fp-index for the row labels.
- fp-index uses Column internally for some MultiIndex level storage.
Structs§
- Column
- CrackIndex: Adaptive crack index for progressive column partitioning.
- SparseColumn
- ValidityMask
Enums§
- ArithmeticOp
- ColumnData: AG-10: Typed array representation for vectorized batch execution.
- ColumnError
- ComparisonOp: Element-wise comparison operations that produce Bool-typed columns.
Functions§
- radix_argsort_i64: Stable LSD radix argsort of an i64 slice (br-frankenpandas-y5s15): the permutation that orders values ascending (or descending), equal values keeping their original order. Bit-identical to a stable sort_by(i64::cmp): i64_radix_key is order-preserving and the counting sort is stable; descending flips the key (!key) so equal values still keep original order (matching a reversed comparator whose Equal arm doesn’t reorder). Reusable for any all-Int64 ordering (index labels, single columns).
- radix_argsort_multi_u64: Stable LSD radix lexsort over several u64 key columns (br-frankenpandas-lnsu6). Returns the permutation that orders rows lexicographically by keys_by_col[0], then keys_by_col[1], …, with equal rows keeping their original order, exactly like a stable multi-key sort_by. The least-significant digit overall is the last column’s low byte, so the columns are processed in reverse (each an 8-pass stable counting sort that threads the running permutation), making the first column the most significant. O(n·k) and comparison-free. All key vectors must have the same length; callers bake per-column ascending/descending into the keys.
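The two radix argsorts described above can be sketched as follows. This is an illustrative reimplementation under assumptions, not the crate's code: the function names and signatures are hypothetical, and the order-preserving key is assumed to be the common sign-bit flip (the page does not show how i64_radix_key is defined).

```rust
/// Assumed order-preserving map from i64 to u64: flipping the sign bit
/// places negative values below non-negative ones in unsigned order.
fn toy_radix_key(v: i64) -> u64 {
    (v as u64) ^ (1u64 << 63)
}

/// Eight stable counting-sort passes (one per byte, low byte first) that
/// reorder `perm` by `keys[row]`, preserving the incoming order on ties.
fn radix_passes(keys: &[u64], mut perm: Vec<usize>) -> Vec<usize> {
    let mut scratch = vec![0usize; perm.len()];
    for pass in 0..8u32 {
        let shift = pass * 8;
        // counts[d + 1] holds how many rows have digit d in this pass.
        let mut counts = [0usize; 257];
        for &row in &perm {
            counts[((keys[row] >> shift) & 0xFF) as usize + 1] += 1;
        }
        // Prefix sum: counts[d] becomes the first output slot for digit d.
        for d in 0..256 {
            counts[d + 1] += counts[d];
        }
        // Scatter in current order, which is what keeps the pass stable.
        for &row in &perm {
            let d = ((keys[row] >> shift) & 0xFF) as usize;
            scratch[counts[d]] = row;
            counts[d] += 1;
        }
        std::mem::swap(&mut perm, &mut scratch);
    }
    perm
}

/// Stable argsort of an i64 slice. Descending order inverts the key bits,
/// so equal values still keep their original relative order.
fn toy_argsort_i64(values: &[i64], descending: bool) -> Vec<usize> {
    let keys: Vec<u64> = values
        .iter()
        .map(|&v| {
            let k = toy_radix_key(v);
            if descending { !k } else { k }
        })
        .collect();
    radix_passes(&keys, (0..values.len()).collect())
}

/// Stable lexsort over several u64 key columns: the last column is sorted
/// first, so the first column ends up as the most significant key.
fn toy_lexsort_u64(keys_by_col: &[Vec<u64>], n_rows: usize) -> Vec<usize> {
    let mut perm: Vec<usize> = (0..n_rows).collect();
    for col in keys_by_col.iter().rev() {
        assert_eq!(col.len(), n_rows, "all key columns must have n_rows entries");
        perm = radix_passes(col, perm);
    }
    perm
}
```

For `[3, -1, 3, 0, -1]`, the ascending sketch yields the permutation `[1, 4, 3, 0, 2]` and the descending one `[0, 2, 3, 1, 4]`; in both cases the duplicate values stay in their original relative order.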