Phase-C2: streaming residual construction.
The batch adapter path calls ResidualStream::push followed by
ResidualStream::sort; it produces the bytewise-identical stream
the four fingerprint locks pin. That path is unchanged and remains
the canonical construction for reproduction.
This module adds a parallel, additive API for a live-ingestion
deployment where samples arrive one at a time and a single terminal
.sort() call over a materialised 10⁶-sample buffer is not
acceptable. The streaming path preserves time-ordering via a
bounded reorder buffer: every incoming sample is staged in a
small heap, and any sample older than newest_t − reorder_window_s
is flushed to the underlying stream in sorted order. At
StreamingIngestor::finish the remaining buffer tail is drained.
The trade-off is explicit: if a sample arrives more than
reorder_window_s behind the current newest sample, it is dropped
and the drop is counted; dropped_out_of_window is
part of the closing summary. A production deployment sizes
reorder_window_s to be larger than the engine’s maximum
telemetry-pipeline jitter (we default to 10 s; PostgreSQL’s
pg_stat_statements polling cadence at 60 s makes 10 s a ~6×
safety margin).
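The reorder-buffer behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not this module's actual code: `Sample`, `ReorderSketch`, and `example_run` are hypothetical names, timestamps are integer milliseconds so that samples are totally ordered, and a plain `Vec` stands in for the underlying ResidualStream.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for the crate's sample type (assumption: integer ms).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Sample {
    pub t_ms: u64,
    pub value: i64,
}

/// Illustrative bounded reorder buffer; not the real StreamingIngestor.
pub struct ReorderSketch {
    window_ms: u64,
    newest_ms: Option<u64>,
    heap: BinaryHeap<Reverse<Sample>>, // min-heap ordered by (t_ms, value)
    out: Vec<Sample>,                  // stands in for the ResidualStream
    dropped_out_of_window: u64,
}

impl ReorderSketch {
    pub fn new(window_ms: u64) -> Self {
        Self {
            window_ms,
            newest_ms: None,
            heap: BinaryHeap::new(),
            out: Vec::new(),
            dropped_out_of_window: 0,
        }
    }

    pub fn push(&mut self, s: Sample) {
        let newest = self.newest_ms.map_or(s.t_ms, |n| n.max(s.t_ms));
        // A sample more than one window behind the newest is dropped and counted.
        if newest - s.t_ms > self.window_ms {
            self.dropped_out_of_window += 1;
            return;
        }
        self.newest_ms = Some(newest);
        self.heap.push(Reverse(s));
        // Flush everything strictly older than newest - window, oldest first.
        let cutoff = newest.saturating_sub(self.window_ms);
        while let Some(&Reverse(oldest)) = self.heap.peek() {
            if oldest.t_ms >= cutoff {
                break;
            }
            self.heap.pop();
            self.out.push(oldest);
        }
    }

    /// Drain the remaining tail and return (ordered samples, drop count).
    pub fn finish(mut self) -> (Vec<Sample>, u64) {
        while let Some(Reverse(s)) = self.heap.pop() {
            self.out.push(s);
        }
        (self.out, self.dropped_out_of_window)
    }
}

/// Feeds a jittery sequence through a 10 s window and returns
/// (flushed timestamps in ms, dropped count).
pub fn example_run() -> (Vec<u64>, u64) {
    let mut ing = ReorderSketch::new(10_000);
    for t_ms in [1_000, 3_000, 2_000, 20_000, 5_000, 12_000, 40_000] {
        ing.push(Sample { t_ms, value: 0 });
    }
    let (flushed, dropped) = ing.finish();
    (flushed.iter().map(|s| s.t_ms).collect(), dropped)
}
```

In this sketch the sample at 5 000 ms arrives after 20 000 ms has been seen, which puts it 15 s behind the newest sample, so it is dropped and counted. The sample at 12 000 ms is only 8 s behind and is reordered into place. Because accepted samples are never older than the flush cutoff, the output stays sorted without a terminal sort.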
Determinism: given the same input stream and reorder_window_s,
the flushed sample order and the dropped_out_of_window count are
deterministic. The streaming path is not expected to produce
the same fingerprint as the batch path for real-world jitter-bearing
inputs — that is the honest reason batch is pinned and streaming is
parallel, not a replacement.
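A small sketch of why the two paths can diverge on jittery input, under the assumption that the only difference is the out-of-window drop rule. The function names `batch_order` and `streaming_survivors` are hypothetical, and timestamps are plain integer seconds for illustration.

```rust
/// Batch path in miniature: keep every sample, sort once at the end.
pub fn batch_order(mut ts: Vec<u64>) -> Vec<u64> {
    ts.sort();
    ts
}

/// Streaming path in miniature: a sample survives only if it is at most
/// `window` behind the newest timestamp seen so far; survivors come out sorted.
pub fn streaming_survivors(ts: &[u64], window: u64) -> Vec<u64> {
    let mut newest: u64 = 0;
    let mut kept = Vec::new();
    for &t in ts {
        newest = newest.max(t);
        if newest - t <= window {
            kept.push(t);
        }
    }
    kept.sort();
    kept
}
```

For the arrival order 1, 3, 2, 20, 5, 12, 40 with a 10 s window, `batch_order` keeps the late sample at 5 while `streaming_survivors` does not, so any fingerprint over the two sequences would differ. Both functions are pure, so repeating the same input and window reproduces the same output.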
Structs§
- StreamingIngestor - Streaming ingestor that accepts one ResidualSample at a time and flushes a correctly-ordered prefix into an owned ResidualStream as the reorder window slides forward.
Constants§
- DEFAULT_REORDER_WINDOW_S - Default reorder-buffer window in seconds. Sized for pg_stat_statements-class telemetry jitter. Tune up for slower engines, tune down for well-behaved log tails.