Crate virtual_frame

virtual-frame — Deterministic data pipeline toolkit for LLM training.

Bitmask-filtered virtual views, NFA regex, Kahan summation, NLP primitives, CSV ingestion, and a deterministic RNG. Python bindings via PyO3.
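As an illustration of the Kahan summation mentioned above, here is a minimal sketch of compensated summation over a slice of `f64`. The function name `kahan_sum` is hypothetical and is not taken from this crate's API; the crate's own `kahan` module may expose a different interface.

```rust
/// Sums `values` using Kahan compensated summation.
/// Hypothetical helper for illustration; not this crate's API.
pub fn kahan_sum(values: &[f64]) -> f64 {
    let mut sum = 0.0_f64;
    // Running estimate of the low-order bits lost in previous additions.
    let mut compensation = 0.0_f64;
    for &x in values {
        // Apply the correction carried over from the previous step.
        let y = x - compensation;
        // Low-order bits of `y` may be lost in this addition.
        let t = sum + y;
        // Recover what was lost: (t - sum) is the part of `y` that was
        // actually added, so subtracting `y` leaves the negated error.
        compensation = (t - sum) - y;
        sum = t;
    }
    sum
}
```

With an input such as `1.0` followed by ten copies of `1e-16`, a plain left-to-right sum stays at exactly `1.0` because each small term is below half a unit in the last place, while the compensated sum accumulates the small terms and ends up above `1.0`.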

Modules

bitmask
Packed bitmask — one bit per row, 64-bit words.
column
Columnar storage — typed vectors, one per column.
csv
CSV ingestion: CsvConfig, CsvReader, and StreamingCsvProcessor.
dataframe
DataFrame — columnar storage with named columns.
expr
Expression system — predicates for filter, computed columns for mutate.
kahan
Kahan compensated summation — bit-identical results regardless of platform.
nlp
NLP primitives — string distance, n-grams, tokenization.
regex_engine
NFA-based regex engine — zero-dependency, deterministic, linear-time.
rng
Deterministic RNG — SplitMix64.
tidyview
TidyView — the virtual frame engine.
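To make the idea of a packed bitmask and a bitmask-filtered view concrete, the following is a rough sketch under stated assumptions. The `Bitmask` type, its methods, and the `view` function are hypothetical names invented for this example; they are not the API of the `bitmask` or `tidyview` modules. The sketch stores one bit per row in 64-bit words and iterates over the selected rows without copying the column.

```rust
/// Hypothetical packed bitmask: one bit per row, stored in 64-bit words.
pub struct Bitmask {
    words: Vec<u64>,
    len: usize,
}

impl Bitmask {
    /// Creates a mask for `len` rows with every bit cleared.
    pub fn new(len: usize) -> Self {
        Bitmask { words: vec![0; (len + 63) / 64], len }
    }

    /// Marks `row` as selected.
    pub fn set(&mut self, row: usize) {
        assert!(row < self.len, "row out of bounds");
        self.words[row / 64] |= 1u64 << (row % 64);
    }

    /// Returns true if `row` is in range and selected.
    pub fn get(&self, row: usize) -> bool {
        row < self.len && (self.words[row / 64] >> (row % 64)) & 1 == 1
    }

    /// Number of selected rows.
    pub fn count_ones(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }

    /// Builds a mask by evaluating `pred` on every row of `column`.
    pub fn from_predicate<T>(column: &[T], pred: impl Fn(&T) -> bool) -> Self {
        let mut mask = Bitmask::new(column.len());
        for (row, value) in column.iter().enumerate() {
            if pred(value) {
                mask.set(row);
            }
        }
        mask
    }
}

/// Iterates over the rows of `column` selected by `mask`, in row order,
/// borrowing the data instead of copying it.
pub fn view<'a, T>(column: &'a [T], mask: &'a Bitmask) -> impl Iterator<Item = &'a T> + 'a {
    column
        .iter()
        .enumerate()
        .filter(move |(row, _)| mask.get(*row))
        .map(|(_, value)| value)
}
```

In this sketch a filter only changes the mask, so the underlying column is never rewritten, and several views can share the same data with different masks.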