trs-dataframe
Column-oriented dataframe for the Teiresias stack. Lightweight, typed, and
designed around per-row [DataValue] candidates rather than a full ndarray
back-end.
- Typed columns with nullable bitmap. Each column is a [TypedDataArray] wrapping a [TypedData] tagged union over native primitives (`bool`, `u8`/`u32`/`u64`, `i32`/`i64`, `f32`/`f64`, `String`, `Vec<TypedData>`) with a `Generic` fallback for heterogeneous data. Null positions are tracked via a packed `Vec<u64>` bitmap, one bit per element, so typed columns can represent missing values without falling back to `DataValue::Null`. Hot paths can take a zero-copy `&[i32]` slice (or any other primitive) without going through `DataValue`.
- Operational primitives. Append, push rows, extend, filter, join (incl. many-to-many by id and Cartesian product), sort + top-N, and column add/remove. See the inherent methods on [DataFrame] / [ColumnFrame].
- Materialized views. `select` returns row-major `Array2<DataValue>`; `select_view` gives a `(ncols, nrows)` stacked array; `select_vec_view` hands back zero-copy `&TypedData` borrows; `select_typed` coerces to a uniform primitive type via [Extract].
- Filter DSL. `FilterRules::try_from("a >= 1f64 && (b <= 5 || c <= 8)")`: parsed expressions over column values with type-aware comparison and a small set of column functions (`len`, `to_datetime_us`, …).
- Pluggable runtimes. Optional `python` (PyO3 + numpy bindings), `polars-df` (`polars::DataFrame` interop), `jmalloc`, and `tracing` features. The `python` feature is on by default.
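
To make the null-tracking idea above concrete, here is a minimal standalone sketch of a packed `Vec<u64>` bitmap with one bit per element. It is not the crate's implementation: the `NullBitmap` type, its method names, and the convention that a set bit means "null" are all assumptions made for illustration.

```rust
/// Hypothetical illustration of a packed null bitmap (not the crate's type).
/// Assumption: a set bit marks the element at that index as null.
struct NullBitmap {
    words: Vec<u64>,
    len: usize,
}

impl NullBitmap {
    /// Allocate enough 64-bit words to hold one bit per element, all cleared.
    fn new(len: usize) -> Self {
        Self { words: vec![0; (len + 63) / 64], len }
    }

    /// Mark the element at `idx` as null.
    fn set_null(&mut self, idx: usize) {
        assert!(idx < self.len, "index out of bounds");
        self.words[idx / 64] |= 1u64 << (idx % 64);
    }

    /// Report whether the element at `idx` is null.
    fn is_null(&self, idx: usize) -> bool {
        assert!(idx < self.len, "index out of bounds");
        (self.words[idx / 64] >> (idx % 64)) & 1 == 1
    }

    /// Count null elements with one popcount per word.
    fn null_count(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}

fn main() {
    // The values stay in a plain primitive buffer; nulls live beside it.
    let values: Vec<i32> = vec![10, 0, 30, 0, 50];
    let mut nulls = NullBitmap::new(values.len());
    nulls.set_null(1);
    nulls.set_null(3);

    // Borrow the buffer as a slice and skip positions flagged as null.
    let slice: &[i32] = &values;
    let sum: i32 = slice
        .iter()
        .enumerate()
        .filter(|(i, _)| !nulls.is_null(*i))
        .map(|(_, v)| *v)
        .sum();
    println!("non-null sum = {sum}, nulls = {}", nulls.null_count());
}
```

Keeping the mask separate from the values is what lets the value buffer be borrowed as an ordinary primitive slice, at a cost of one extra bit per element.
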
Install
[dependencies]
trs-dataframe = { version = "0.10", default-features = false }
# or with Python bindings + numpy + messagepack:
# trs-dataframe = "0.10"
Quick start
use ;
// Build a frame with the `df!` macro.
let mut frame: DataFrame = df! ;
assert_eq!;
assert_eq!;
// Materialize a row-major Array2 view.
let arr = frame.select.unwrap;
assert_eq!;
// Zero-copy access requires the column to be in its native primitive
// representation. The `df!` macro starts every column as `Generic`, so
// promote the score column to `F64` first and then take a typed slice.
frame.dataframe.get_column_mut
.unwrap
.try_convert_to_dtype
.unwrap;
let cols = frame.select_vec_view.unwrap;
let scores: & = cols.as_ref.unwrap.as_slice_f64.unwrap;
assert_eq!;
// Append more rows.
let extra: DataFrame = df! ;
frame.extend.unwrap;
assert_eq!;
Filtering
use ;
let frame = df! ;
let rules = try_from.unwrap;
let filtered = frame.filter.unwrap;
assert_eq!;
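
For readers unfamiliar with the expression syntax, the sketch below spells out the boolean logic of the example rule from the feature overview, `a >= 1f64 && (b <= 5 || c <= 8)`, as a hand-written predicate over plain Rust structs. It does not use the crate's parser or types; the `Row` struct, `keep`, and `filter_rows` are hypothetical names, and it assumes `&&`, `||`, and parentheses mean what they mean in Rust.

```rust
/// Hypothetical row type standing in for three frame columns.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    a: f64,
    b: i64,
    c: i64,
}

/// Hand-written equivalent of `a >= 1f64 && (b <= 5 || c <= 8)`.
fn keep(row: &Row) -> bool {
    row.a >= 1.0 && (row.b <= 5 || row.c <= 8)
}

/// Return a copy of every row that satisfies the predicate.
fn filter_rows(rows: &[Row]) -> Vec<Row> {
    rows.iter().filter(|r| keep(r)).cloned().collect()
}

fn main() {
    let rows = vec![
        Row { a: 0.5, b: 1, c: 1 },  // rejected: a is below 1
        Row { a: 1.0, b: 9, c: 8 },  // kept: c <= 8
        Row { a: 2.0, b: 5, c: 20 }, // kept: b <= 5
        Row { a: 3.0, b: 6, c: 9 },  // rejected: b > 5 and c > 8
    ];
    let kept = filter_rows(&rows);
    println!("{} of {} rows kept", kept.len(), rows.len());
}
```
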
Sorting and top-N
use ;
let frame = df! ;
let sorted = frame.sorted.unwrap;
let top3 = sorted.topn.unwrap;
assert_eq!;
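
As a rough mental model of sort followed by top-N, the same two steps can be written with the standard library alone. This is a sketch under assumptions, not the crate's code: `top_n` is a hypothetical helper, rows are modeled as `(name, score)` tuples, and descending order by score is chosen only for the example.

```rust
/// Hypothetical helper: sort rows by score, highest first, then keep `n`.
fn top_n(mut rows: Vec<(String, f64)>, n: usize) -> Vec<(String, f64)> {
    // `total_cmp` gives a total order over f64, so NaN cannot panic the sort.
    rows.sort_by(|x, y| y.1.total_cmp(&x.1));
    // Dropping the tail after sorting leaves the n best rows in order.
    rows.truncate(n);
    rows
}

fn main() {
    let rows = vec![
        ("a".to_string(), 1.5),
        ("b".to_string(), 9.0),
        ("c".to_string(), 4.0),
        ("d".to_string(), 7.25),
    ];
    for (name, score) in top_n(rows, 3) {
        println!("{name}: {score}");
    }
}
```
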
Feature flags
| Feature | Default | Purpose |
|---|---|---|
| `python` | yes | PyO3 bindings, numpy interop, messagepack roundtrip. |
| `polars-df` | no | `From`/`Into` between `polars::DataFrame` and this crate's types. |
| `jmalloc` | no | Use jemalloc as the global allocator. |
| `tracing` | no | Pull in `tracing-subscriber` for runtime tracing setup. |
| `utoipa` | no | Derive OpenAPI schema for serializable types. |
Development
The benchmark harness is in benches/bench_main.rs; sample data is fetched
into benches/downloaded-data/ on first run.
License
Apache-2.0. See LICENSE.