pub struct RecordSet {
pub source: String,
pub format: String,
pub columns: Vec<Column>,
}Expand description
A normalized corpus: named columns of equal length, plus provenance about where it came from. This is the universal input to every detector.
Fields§
§source: StringLogical source identifier (path, URL, or "-" for stdin).
format: StringThe format the normalizer recognized (e.g. "csv", "ndjson").
columns: Vec<Column>Implementations§
Source§impl RecordSet
impl RecordSet
Sourcepub fn new(
source: impl Into<String>,
format: impl Into<String>,
columns: Vec<Column>,
) -> Self
pub fn new( source: impl Into<String>, format: impl Into<String>, columns: Vec<Column>, ) -> Self
Creates a record set, panicking only via debug-assert if columns are ragged. Construction is the normalizer’s responsibility; detectors may rely on rectangularity.
pub fn width(&self) -> usize
pub fn column(&self, name: &str) -> Option<&Column>
Sourcepub fn select(&self, names: &[String]) -> RecordSet
pub fn select(&self, names: &[String]) -> RecordSet
A copy keeping only the columns named in names, in their original
column order (not the order given). Names with no matching column are
silently skipped — callers that must reject an unknown name should
validate with Self::column first. Provenance is preserved.
This is the column-scoping primitive behind scan --columns: it lets a
caller focus detection on a handful of meaningful columns in a wide
corpus (e.g. journald’s dozens of identifier/counter fields).
Sourcepub fn without(&self, names: &[String]) -> RecordSet
pub fn without(&self, names: &[String]) -> RecordSet
A copy dropping the columns named in names, preserving the order and
provenance of the rest. The complement of Self::select, behind
scan --exclude.