Skip to main content

Crate matten_data

Crate matten_data 

Source
Expand description

matten-data — a tiny table-to-Tensor preparation companion for small PoC datasets.

§Status

Beta. This is a scope-locked companion (RFC-033) for the boring step between table-like input and a numeric matten::Tensor. The API is mostly stable but pre-1.0; pin the minor version. Under lock-step family versioning (RFC-030) the crate shares the workspace family version; maturity is the Status label, not the version number.

§The workflow

small CSV / table-like data
  -> inspect schema
  -> select columns by name
  -> clean missing values explicitly
  -> convert to numeric explicitly
  -> matten::Tensor
use matten_data::Table;

let csv = "sales,cost,note\n10,2,a\n20,,b\n30,4,c";
let table = Table::from_csv_str(csv)?;

// Inspect, select, clean, convert — every step explicit.
let tensor = table
    .select_columns(["sales", "cost"])?
    .fill_missing(0.0)?
    .try_numeric()?
    .to_tensor()?;

assert_eq!(tensor.shape(), &[3, 2]);
assert_eq!(tensor.as_slice(), &[10.0, 2.0, 20.0, 0.0, 30.0, 4.0]);

§What it is not

matten-data is not a dataframe library. It has no joins, group-by, pivot, query DSL, lazy execution, indexing/loc/iloc, rolling/window operations, datetime engine, categorical dtype system, or large-data streaming. For those workloads use Polars, DataFusion, Pandas, or another dataframe/query tool. It is a small conversion helper for application-validated or trusted data, not a CSV firewall or input sandbox.

§Relationship to core dynamic

Core matten’s dynamic feature is value-level ingestion (mixed values inside a Tensor, with explicit try_numeric()). matten-data is table-level preparation (headers, named columns, schema summary, table-shaped missing-value policy) whose end goal is a numeric Tensor. It does not expose a second computation engine.

§Conversion rules

Numeric conversion is strict and explicit (try_numeric then to_tensor): integers and floats become f64; booleans and non-numeric text are rejected; a remaining missing cell is rejected (fill it first). Missing values never silently become zero, and booleans never silently become 1/0.

Structs§

ColumnSummary
Per-column entry in a SchemaSummary.
NumericTable
A table whose cells have all been validated as numeric (RFC-034 §4.4).
SchemaSummary
A small, displayable description of a Table’s columns.
Table
A small, owned, rectangular table-like data set.

Enums§

CellValue
A single table cell value (RFC-034 §4.2).
ColumnKind
A simple, inferred kind for a column (RFC-035 §5).
MattenDataError
Errors produced by matten-data table ingestion and conversion.