Module generic_csv

Generic CSV adapter — a single-domain worked example of applying dsfb-database’s motif grammar to a residual stream that was not captured from a SQL engine.

§What this adapter does

Reads an operator-supplied CSV with a timestamp column and a numeric value column (and optionally a channel column), constructs a residual stream using the same rolling-baseline rule as the PostgreSQL pg_stat_statements adapter (see crate::adapters::postgres), and hands the resulting stream to the motif grammar.

§What this adapter does NOT do

It does not validate that the operator-supplied grammar is appropriate for the input signal, nor does it claim the five-motif vocabulary has any universal meaning outside SQL telemetry. This adapter is a worked example that lets an operator exercise the deterministic machinery on their own residuals; it is not a generalisation claim. See the pinned non-claim in crate::non_claims that references this adapter by name.

§CSV contract

Minimum: one timestamp column, one value column. The adapter auto-detects both:

  • Timestamp column — first column whose header contains any of {"t", "time", "timestamp", "ts"} (case-insensitive) or whose first data row parses as an f64. Operators can override with --time-col <name>.
  • Value column — first numeric column that is not the timestamp and whose header is not recognisably a key (id, key, uuid, hash). Operators can override with --value-col <name>.
  • Channel column — optional. If any column is named channel, qclass, group, or series (case-insensitive), the adapter emits one residual per (timestamp, channel) row; otherwise the channel defaults to generic.
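The detection rules above can be sketched as three small functions. This is an illustration only, not the adapter's actual code: the function and constant names are hypothetical, and the sketch makes three assumptions the text does not settle. It treats the timestamp header test as whole-header, case-insensitive equality (a literal substring test on "t" would match most headers), it treats a header as a key when it equals a key token or ends in `_<token>`, and it judges "numeric" from the first data row only.

```rust
// Hypothetical names throughout; the real adapter's internals may differ.
const TIME_TOKENS: [&str; 4] = ["t", "time", "timestamp", "ts"];
const KEY_TOKENS: [&str; 4] = ["id", "key", "uuid", "hash"];
const CHANNEL_NAMES: [&str; 4] = ["channel", "qclass", "group", "series"];

/// True if a cell parses as an f64 after trimming whitespace.
fn is_numeric(cell: &str) -> bool {
    cell.trim().parse::<f64>().is_ok()
}

/// Assumption: a header is "recognisably a key" if it equals a key token
/// or ends with `_<token>` (for example `query_id`).
fn is_key_header(header: &str) -> bool {
    let h = header.trim().to_ascii_lowercase();
    KEY_TOKENS
        .iter()
        .any(|k| h == *k || h.ends_with(format!("_{k}").as_str()))
}

/// First column whose header matches a time token, or whose first data
/// row parses as an f64. Both tests are applied per column, left to right.
fn detect_time_col(headers: &[&str], first_row: &[&str]) -> Option<usize> {
    (0..headers.len()).find(|&i| {
        let h = headers[i].trim().to_ascii_lowercase();
        let name_match = TIME_TOKENS.iter().any(|t| h == *t);
        name_match || first_row.get(i).map_or(false, |c| is_numeric(c))
    })
}

/// First numeric column that is neither the timestamp nor a key column.
fn detect_value_col(headers: &[&str], first_row: &[&str], time_col: usize) -> Option<usize> {
    (0..headers.len()).find(|&i| {
        i != time_col
            && !is_key_header(headers[i])
            && first_row.get(i).map_or(false, |c| is_numeric(c))
    })
}

/// Optional channel column, matched by name (case-insensitive).
fn detect_channel_col(headers: &[&str]) -> Option<usize> {
    headers.iter().position(|h| {
        let h = h.trim().to_ascii_lowercase();
        CHANNEL_NAMES.iter().any(|n| h == *n)
    })
}
```

For a header row `ts,query_id,latency_ms,qclass` with first data row `0.0,17,12.5,q1`, this sketch picks column 0 as the timestamp, skips `query_id` as a key, picks `latency_ms` as the value, and picks `qclass` as the channel. When detection would be ambiguous in a real file, --time-col and --value-col remove the guesswork.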

§Residual construction

The adapter emits every row’s value as a ResidualClass::PlanRegression residual in the dimensionless form (value − baseline) / max(|baseline|, ε), where baseline is the mean of the first BASELINE_WINDOW = 3 values on that channel. This is the same normalisation the PostgreSQL adapter performs; it keeps the drift_threshold in spec/motifs.yaml interpretable as a dimensionless fraction.
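A minimal sketch of the normalisation for a single channel may make the arithmetic concrete. The function name and the value of `EPSILON` are assumptions (the text does not state ε), as is the choice to average whatever is available when a channel has fewer than `BASELINE_WINDOW` values.

```rust
const BASELINE_WINDOW: usize = 3;
// Assumed value; the documentation above does not specify epsilon.
const EPSILON: f64 = 1e-9;

/// Dimensionless residuals for one channel:
/// (value - baseline) / max(|baseline|, epsilon),
/// with baseline = mean of the first BASELINE_WINDOW values.
fn normalised_residuals(values: &[f64]) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    // Assumption: shorter channels use the mean of all available values.
    let n = values.len().min(BASELINE_WINDOW);
    let baseline = values[..n].iter().sum::<f64>() / n as f64;
    // The epsilon floor guards against division by a zero baseline.
    let denom = baseline.abs().max(EPSILON);
    values.iter().map(|v| (v - baseline) / denom).collect()
}
```

With input values 10, 10, 10, 15 the baseline is 10 and the last residual is (15 − 10) / 10 = 0.5, which reads as a 50% excursion above baseline and can be compared directly against a fractional drift_threshold.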

--pre-residualized skips the baseline subtraction and emits value verbatim. Use this when the operator’s pipeline already produces (actual − expected) residuals.
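The two modes can be contrasted in a short per-channel sketch. Everything here is hypothetical scaffolding: the `Row` tuple layout, the `emit` function, and the `EPSILON` value are assumptions made for illustration, and the real adapter produces typed residuals rather than bare tuples.

```rust
use std::collections::BTreeMap;

const BASELINE_WINDOW: usize = 3;
// Assumed value; not specified in the documentation above.
const EPSILON: f64 = 1e-9;

/// Hypothetical row layout: (timestamp, channel, value).
type Row = (f64, String, f64);

/// With `pre_residualized` set, values pass through verbatim.
/// Otherwise each value is normalised against its own channel's baseline.
fn emit(rows: &[Row], pre_residualized: bool) -> Vec<Row> {
    if pre_residualized {
        return rows.to_vec();
    }
    // Collect the first BASELINE_WINDOW values seen on each channel.
    let mut heads: BTreeMap<String, Vec<f64>> = BTreeMap::new();
    for (_, ch, v) in rows {
        let head = heads.entry(ch.clone()).or_default();
        if head.len() < BASELINE_WINDOW {
            head.push(*v);
        }
    }
    rows.iter()
        .map(|(t, ch, v)| {
            // Every channel present in `rows` has at least one value here.
            let head = &heads[ch.as_str()];
            let baseline = head.iter().sum::<f64>() / head.len() as f64;
            (*t, ch.clone(), (v - baseline) / baseline.abs().max(EPSILON))
        })
        .collect()
}
```

Feeding already-differenced data through the default mode would subtract a baseline a second time, which is the situation the flag is meant to avoid.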

Structs§

GenericCsvOptions
Options for the generic CSV loader. Leave empty to auto-detect everything.

Functions§

load_generic_csv
Load path as a generic CSV and produce a typed residual stream.