Module ingest

S-REAL.1 — Deterministic ingest of # residual-projection v2 TSV fixtures.

WHY: The S-REAL.1 audit gauntlet runs DSFB-GPU on real public datasets whose canonical released form is pre-projected residuals in a TAB-delimited text file with # key=value comment headers. DSFB-GPU’s production pipeline normally takes a Vec<TraceEvent> (trace-event stream) and projects events into residuals via the window-feature kernel. The TSV fixtures are already past that projection stage. To run the deterministic engine on this form without modifying the dispatcher, we deterministically lower each (window, signal) cell into a synthetic TraceEvent whose latency value carries the residual magnitude. The lowering rule is byte-replayable: same TSV bytes → same Vec<TraceEvent> → same CaseFile.
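
As an illustration of the input form described above, the following is a minimal sketch of reading a TAB-delimited text file with `# key=value` comment headers. It is not the crate's `load_residual_projection_tsv`: the names `SketchFixture` and `parse_sketch` are hypothetical, and the sketch assumes every non-comment line holds only numeric cells (no column-header row), which the real v2 format may not guarantee.

```rust
use std::collections::BTreeMap;

/// Hypothetical parsed form. The fields of the real
/// `ResidualProjectionFixture` are not shown on this page.
#[derive(Debug, PartialEq)]
pub struct SketchFixture {
    /// `# key=value` comment headers, in sorted key order.
    pub headers: BTreeMap<String, String>,
    /// One inner vector per data line, one f64 per TAB-separated cell.
    pub rows: Vec<Vec<f64>>,
}

/// Parse the text of a fixture. Assumption: all data cells are numeric.
pub fn parse_sketch(text: &str) -> Result<SketchFixture, String> {
    let mut headers = BTreeMap::new();
    let mut rows = Vec::new();
    for (idx, line) in text.lines().enumerate() {
        if line.trim().is_empty() {
            continue;
        }
        if let Some(rest) = line.strip_prefix('#') {
            // Comment lines without `=` (such as a format banner) are ignored.
            if let Some((key, value)) = rest.trim().split_once('=') {
                headers.insert(key.trim().to_string(), value.trim().to_string());
            }
            continue;
        }
        let mut row = Vec::new();
        for cell in line.split('\t') {
            // `f64::from_str` accepts "NaN", so NaN cells survive parsing
            // and can be counted later instead of being dropped silently.
            let value: f64 = cell
                .trim()
                .parse()
                .map_err(|_| format!("line {}: bad cell {:?}", idx + 1, cell))?;
            row.push(value);
        }
        rows.push(row);
    }
    Ok(SketchFixture { headers, rows })
}
```

A `BTreeMap` is used so that iterating the headers yields the same order on every run, in keeping with the byte-replayable goal.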

Discipline:

  • No probabilistic ingest: every cell maps to one event by a fixed rule.
  • No silent NaN handling: NaN cells are skipped (no event emitted for that cell), and the count of skipped cells is reported in the IngestReport so the audit report can disclose the projection loss.
  • SHA-256 byte-pin: the loader returns an error if the file bytes do not hash to the expected pin. The audit’s dataset_manifest.toml records the pin alongside the upstream DOI/URL.
  • No domain-truth claim: the lowering interprets cell values as “milliseconds-scaled signal magnitude” and reports that convention in the audit; we make no claim about what the cell value “is” in the upstream domain.
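
The first two rules above can be sketched as follows. This is an illustration only, not the panel-locked rule implemented by `lower_to_trace_events`: `SketchEvent` and `SketchReport` are hypothetical stand-ins for `TraceEvent` and `IngestReport`, and the sketch assumes that rows are windows, columns are signals, and cells are visited in row-major order.

```rust
/// Hypothetical stand-in for the crate's `TraceEvent`.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchEvent {
    pub window: usize,
    pub signal: usize,
    /// Cell value carried as a "milliseconds-scaled signal magnitude".
    pub latency_ms: f64,
}

/// Hypothetical stand-in for the crate's `IngestReport`.
#[derive(Debug, Default, PartialEq)]
pub struct SketchReport {
    pub cells_total: usize,
    pub cells_skipped_nan: usize,
    pub events_emitted: usize,
}

/// Lower each (window, signal) cell to at most one event by a fixed rule.
/// NaN cells emit no event, but every skip is counted in the report.
pub fn lower_sketch(rows: &[Vec<f64>]) -> (Vec<SketchEvent>, SketchReport) {
    let mut events = Vec::new();
    let mut report = SketchReport::default();
    for (window, row) in rows.iter().enumerate() {
        for (signal, &value) in row.iter().enumerate() {
            report.cells_total += 1;
            if value.is_nan() {
                report.cells_skipped_nan += 1;
                continue;
            }
            events.push(SketchEvent { window, signal, latency_ms: value });
        }
    }
    report.events_emitted = events.len();
    (events, report)
}
```

Because the function has no randomness, clock reads, or hash-map iteration, calling it twice on the same rows yields identical event vectors, which is the property the byte-replay claim depends on.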

Non-claims (preserved into the audit’s limitations.md):

  • This loader does NOT recover the upstream’s original trace events. It produces a deterministic event sequence whose post-projection residual matches the fixture’s residual at each cell, under the documented lowering rule.
  • This loader does NOT validate the upstream dataset’s labels, semantics, ground truth, or fitness for any downstream use.

License: Apache-2.0. Background IP: Invariant Forge LLC.

Structs§

IngestReport
What the loader did. Surfaced into the audit report so the human reader can see exactly how many cells were skipped (NaN), how many events were emitted, and what the input/output shapes were.
LoweringConfig
Parameters for the deterministic event-lowering rule.
ResidualProjectionFixture
Parsed # residual-projection v2 fixture.

Enums§

IngestError
Loader errors. Every variant has an explicit message so the CLI can surface a human-actionable diagnosis without inventing free-form text.
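
One way to give every variant an explicit message is a `Display` implementation with a fixed format string per variant. The sketch below is a guess at the pattern, not the crate's definition: the enum name `SketchIngestError`, its variants, and the message wording are all hypothetical.

```rust
use std::fmt;

/// Hypothetical error enum. The real `IngestError` variants are not
/// listed on this page.
#[derive(Debug, PartialEq)]
pub enum SketchIngestError {
    Sha256Mismatch { expected: String, actual: String },
    MissingHeader { key: &'static str },
    BadCell { line: usize, column: usize },
}

impl fmt::Display for SketchIngestError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each variant owns its message, so callers never compose
        // free-form text themselves.
        match self {
            SketchIngestError::Sha256Mismatch { expected, actual } => {
                write!(f, "SHA-256 mismatch: expected {}, got {}", expected, actual)
            }
            SketchIngestError::MissingHeader { key } => {
                write!(f, "missing required header `{}`", key)
            }
            SketchIngestError::BadCell { line, column } => {
                write!(f, "line {}, column {}: cell is not a number", line, column)
            }
        }
    }
}

impl std::error::Error for SketchIngestError {}
```

A CLI can then print the error with `{}` and surface the variant's own message unchanged.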

Functions§

build_ingest_report
Build the ingest report from a fixture + emitted events.
load_residual_projection_tsv
Parse a # residual-projection v2 TSV fixture.
lower_to_trace_events
Apply the panel-locked deterministic lowering rule.
sha256_to_hex_lower
Convert a SHA-256 byte array to lowercase hex.
verify_fixture_sha256
Verify file bytes hash to the expected pin.
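
The last two functions can be pictured with a standard-library-only sketch. It covers the hex conversion and the pin comparison but not the hashing itself: the 32-byte digest is assumed to come from a SHA-256 implementation elsewhere. The names `to_hex_lower` and `pin_matches` are hypothetical and are not the signatures of `sha256_to_hex_lower` or `verify_fixture_sha256`.

```rust
/// Convert a 32-byte digest to a 64-character lowercase hex string.
pub fn to_hex_lower(digest: &[u8; 32]) -> String {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(64);
    for &byte in digest {
        // High nibble first, then low nibble.
        out.push(HEX[(byte >> 4) as usize] as char);
        out.push(HEX[(byte & 0x0f) as usize] as char);
    }
    out
}

/// Compare a digest against a pinned lowercase hex string.
/// Assumption: the pin is stored in lowercase, so the comparison is
/// exact and an uppercase pin is treated as a mismatch.
pub fn pin_matches(digest: &[u8; 32], expected_hex: &str) -> bool {
    to_hex_lower(digest) == expected_hex
}
```

Normalizing to a single case before comparing keeps the pin check a plain string equality, at the cost of rejecting pins recorded in uppercase.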