1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
//! `matten-data` — a tiny table-to-Tensor preparation companion for small PoC
//! datasets.
//!
//! # Status
//!
//! **Beta.** This is a scope-locked companion (RFC-033) for the boring
//! step between table-like input and a numeric [`matten::Tensor`]. The API is
//! mostly stable but pre-1.0; pin the minor version. Under lock-step family versioning
//! (RFC-030) the crate shares the workspace family version; maturity is the Status
//! label, not the version number.
//!
//! # The workflow
//!
//! ```text
//! small CSV / table-like data
//! -> inspect schema
//! -> select columns by name
//! -> clean missing values explicitly
//! -> convert to numeric explicitly
//! -> matten::Tensor
//! ```
//!
//! ```
//! # #[cfg(not(feature = "csv"))] fn main() {}
//! # #[cfg(feature = "csv")] fn main() -> Result<(), matten_data::MattenDataError> {
//! use matten_data::Table;
//!
//! let csv = "sales,cost,note\n10,2,a\n20,,b\n30,4,c";
//! let table = Table::from_csv_str(csv)?;
//!
//! // Inspect, select, clean, convert — every step explicit.
//! let tensor = table
//! .select_columns(["sales", "cost"])?
//! .fill_missing(0.0)?
//! .try_numeric()?
//! .to_tensor()?;
//!
//! assert_eq!(tensor.shape(), &[3, 2]);
//! assert_eq!(tensor.as_slice(), &[10.0, 2.0, 20.0, 0.0, 30.0, 4.0]);
//! # Ok(())
//! # }
//! ```
//!
//! # What it is not
//!
//! `matten-data` is **not a dataframe library**. It has no joins, group-by, pivot,
//! query DSL, lazy execution, indexing/`loc`/`iloc`, rolling/window operations,
//! datetime engine, categorical dtype system, or large-data streaming. For those
//! workloads use [Polars](https://pola.rs), [DataFusion](https://datafusion.apache.org),
//! Pandas, or another dataframe/query tool. It is a small conversion helper for
//! application-validated or trusted data, not a CSV firewall or input sandbox.
//!
//! # Relationship to core `dynamic`
//!
//! Core `matten`'s `dynamic` feature is *value-level* ingestion (mixed values
//! inside a `Tensor`, with explicit `try_numeric()`). `matten-data` is *table-level*
//! preparation (headers, named columns, schema summary, table-shaped missing-value
//! policy) whose end goal is a numeric `Tensor`. It does not expose a second
//! computation engine.
//!
//! # Conversion rules
//!
//! Numeric conversion is strict and explicit (`try_numeric` then `to_tensor`):
//! integers and floats become `f64`; booleans and non-numeric text are rejected;
//! a remaining missing cell is rejected (fill it first). Missing values never
//! silently become zero, and booleans never silently become `1`/`0`.
pub use MattenDataError;
pub use NumericTable;
pub use ;
pub use ;