pub struct RedexFile { /* private fields */ }Expand description
A handle to a RedEX file. Cheap to clone.
Created via super::Redex::open_file.
Implementations§
Source§impl RedexFile
impl RedexFile
Sourcepub fn name(&self) -> &ChannelName
pub fn name(&self) -> &ChannelName
The channel name this file is bound to.
Sourcepub fn config(&self) -> &RedexFileConfig
pub fn config(&self) -> &RedexFileConfig
The config this file was opened with.
Sourcepub fn retained_bytes(&self) -> u64
pub fn retained_bytes(&self) -> u64
Sum of payload_len across every currently-retained entry.
Reflects the substrate’s authoritative view of bytes-on-disk
after retention trim; callers that maintain their own
monotonic byte counters (e.g. greedy’s GreedyCacheRegistry)
use this to resync periodically. O(n) over the retained
index; cheap for the typical few-K-entry case but the caller
shouldn’t call it per-event.
Sourcepub fn next_seq(&self) -> u64
pub fn next_seq(&self) -> u64
Next sequence to be assigned (== total append count since open, including any evicted head).
Pre-fix, this read next_seq outside the state
lock. append / append_batch etc. allocate a seq via
fetch_add before the disk write and fetch_sub-rollback
on failure — both within the state-lock critical section.
A concurrent reader without the lock could observe the
temporarily-bumped value: external observers (metrics,
snapshot logic, an IndexStart::FromSeq(next_seq())
re-tail) believed a seq existed that was never durably
appended. Taking the state lock here serializes the read
with the append’s commit-or-rollback, so callers only
observe values that have been durably committed (or
never assigned).
Sourcepub fn len_and_next_seq(&self) -> (usize, u64)
pub fn len_and_next_seq(&self) -> (usize, u64)
Atomic snapshot of (len, next_seq). Observers that need
both values consistently with each other (e.g. metrics
dashboards comparing “retained count” to “total seqs
assigned”) should use this rather than len() followed
by next_seq().
Pre-fix observers called the two methods in sequence.
Each method takes the state lock individually, so the
per-call view is consistent — but two appends could
commit between the two reads, and the resulting pair
could satisfy len + 1 > next_seq_seen (reader saw
post-append len but pre-append next_seq). Observers
downstream of a metrics tick would then double-account
the in-flight seqs. This single-lock accessor returns
both values from one critical section so the snapshot is
strictly consistent.
Sourcepub fn is_closed(&self) -> bool
pub fn is_closed(&self) -> bool
Whether Self::close has run. After close, tail streams
terminate with Err(Closed) and append/append_* reject
with the same error.
Sourcepub fn lowest_retained_seq(&self) -> Option<u64>
pub fn lowest_retained_seq(&self) -> Option<u64>
Lowest sequence number currently retained in the in-memory
index, or None when the file holds nothing.
next_seq() - lowest_retained_seq() is not the count —
retention is by event-or-byte-or-age, not by contiguous range.
Use this for “where can I safely tail from without triggering
Lagged?”: passing the returned seq (or higher) to
Self::tail guarantees the backfill does not signal
retention-induced history loss.
Sourcepub fn append(&self, payload: &[u8]) -> Result<u64, RedexError>
pub fn append(&self, payload: &[u8]) -> Result<u64, RedexError>
Append one event. Returns the assigned sequence.
Failure modes: PayloadTooLarge if the segment is full or the
offset would overflow u32; Io(_) if the disk mirror fails
under redex-disk. Failure atomicity: memory is committed
only after the disk write succeeds (for persistent files);
next_seq is rolled back on disk failure so no seq number is
burnt and no in-memory entry diverges from disk.
Notify-vs-fsync ordering (durability contract). Under
every FsyncPolicy except a hypothetical “always fsync on
append” (which does not exist — the per-write latency
penalty would dominate), notify_watchers fires while the
new event is still only in the kernel page cache. A crash
between notify and the next scheduled fsync (per
FsyncPolicy::EveryN, Interval, or IntervalOrBytes)
loses the event from disk after subscribers have already
observed it.
Watchers persisting derived state past the live event must therefore:
- Key durable records idempotently on
(channel, seq). - On restart, consult the on-disk file’s
Self::next_seqand tail fromlast_persisted_local_seq + 1, treating any seq above the local watermark and at-or-below the file’s current head as a re-delivery — re-apply only if not already reflected in derived state.
FsyncPolicy::Never (heap-only or fire-and-forget disk)
makes the same contract apply to the in-memory state on
process restart; persistent files with a non-Never policy
narrow the window but do not close it. The cortex adapter’s
applied_through_seq watermark + snapshot/restore pair
already implements the idempotent-reconcile recipe; ad-hoc
subscribers (a future netdb-watcher persisting elsewhere)
must mirror it.
Sourcepub fn append_inline(&self, payload: &[u8; 8]) -> Result<u64, RedexError>
pub fn append_inline(&self, payload: &[u8; 8]) -> Result<u64, RedexError>
Append a fixed-length 8-byte inline payload. Skips the segment
indirection. Returns the assigned sequence. Same failure-
atomicity contract as Self::append.
Sourcepub fn append_batch(
&self,
payloads: &[Bytes],
) -> Result<Option<u64>, RedexError>
pub fn append_batch( &self, payloads: &[Bytes], ) -> Result<Option<u64>, RedexError>
Append many payloads. Returns Some(seq) of the FIRST event
in the batch, or None for an empty input. All entries
land contiguously in the index.
§Empty input
Pre-fix the signature was Result<u64, RedexError> and an
empty payloads returned Ok(next_seq) — the seq value
the next append would receive. Callers couldn’t
distinguish “wrote zero, seq N would be next” from “wrote
one event with seq N” via the return value alone.
Breaking change: signature is now
Result<Option<u64>, RedexError>. None ⇒ empty input,
no events appended; Some(seq) ⇒ first seq of the batch.
Callers iterating over optionally-empty batches no longer
need an is_empty pre-check.
Failure atomicity:
- seq numbers are allocated after the batch is validated to fit (both segment capacity and u32 offset width);
- for persistent files, the batch is written to disk before any in-memory commit — on disk failure the seq allocation rolls back and neither memory nor subscribers observe the batch.
Sourcepub fn append_ordered(&self, payload: &[u8]) -> Result<u64, RedexError>
pub fn append_ordered(&self, payload: &[u8]) -> Result<u64, RedexError>
Strictly-ordered variant of Self::append.
Both Self::append and this method now take the state lock
before allocating a sequence number (the failure-atomicity
fix required moving fetch_add inside the lock so rollback
on disk-write failure is safe). That means append already
produces in-seq-order index insertions under contention, and
the two paths are functionally equivalent for single writes.
The real distinction is at the wrapper level: this method
pairs with Self::append_batch_ordered, which holds ONE
lock across an entire batch, whereas Self::append_batch
also holds one lock per batch today. In v1 the non-ordered
and ordered paths are nearly identical. The distinction is
kept so that a future optimization of Self::append (e.g.
moving the seq allocation back outside the lock with a
different rollback scheme) doesn’t affect callers who need
guaranteed-ordered appends.
Used by super::OrderedAppender for replay determinism.
Same failure-atomicity contract as Self::append.
Sourcepub fn append_inline_ordered(
&self,
payload: &[u8; 8],
) -> Result<u64, RedexError>
pub fn append_inline_ordered( &self, payload: &[u8; 8], ) -> Result<u64, RedexError>
Ordered variant of Self::append_inline. See
Self::append_ordered. Same failure-atomicity contract.
Sourcepub fn append_batch_ordered(
&self,
payloads: &[Bytes],
) -> Result<Option<u64>, RedexError>
pub fn append_batch_ordered( &self, payloads: &[Bytes], ) -> Result<Option<u64>, RedexError>
Ordered variant of Self::append_batch. The whole batch is
appended under one state-lock acquisition, so it’s both
atomic (all-or-nothing within the batch) and strictly
seq-ordered relative to any other ordered writers. Same
failure-atomicity contract as Self::append_batch.
Returns Some(first_seq) on a non-empty batch and None
on empty input — same convention as append_batch.
Sourcepub fn append_and_fold<T, F, S>(
&self,
value: &T,
state: &mut S,
fold_fn: F,
) -> Result<u64, RedexError>
pub fn append_and_fold<T, F, S>( &self, value: &T, state: &mut S, fold_fn: F, ) -> Result<u64, RedexError>
Append value (postcard-serialized) AND run fold_fn against
caller-supplied state in the same call. Returns the
assigned seq.
This is the common “log-an-event and update a materialized
view in one step” pattern, without spinning up a full CortEX
adapter. Callers maintain state themselves; the fold
closure sees the just-appended value and mutates state in
place.
Note: the fold runs AFTER the RedEX append. If the append
succeeds and the fold panics, the log advances but state is
out of sync — callers who need crash-consistency should use
the CortEX adapter’s durable snapshot + open_from_snapshot
instead.
Sourcepub fn append_postcard<T: Serialize>(
&self,
value: &T,
) -> Result<u64, RedexError>
pub fn append_postcard<T: Serialize>( &self, value: &T, ) -> Result<u64, RedexError>
Convenience: serialize value with postcard and append.
Sourcepub fn tail(
&self,
from_seq: u64,
) -> impl Stream<Item = Result<RedexEvent, RedexError>> + Send + 'static
pub fn tail( &self, from_seq: u64, ) -> impl Stream<Item = Result<RedexEvent, RedexError>> + Send + 'static
Subscribe to all events with seq >= from_seq, including those
already in the index at call time.
Backfill and live registration happen atomically under the state lock: no event can interleave between backfill delivery and live subscription.
Delivery is backed by a per-subscription bounded channel of
depth RedexFileConfig::tail_buffer_size.
- Backfill overflow (requested
from_seqproduces more retained events than the buffer can hold): pre-flighted under the state lock; the subscriber observesRedexError::Laggedas the first stream item and no truncated history. Guaranteed deliverable because the channel is empty at that point. - Live overflow (subscriber falls behind during live
delivery): disconnected with a best-effort
RedexError::Lagged. Under sustained saturation the signal itself may be dropped (the channel is full when we try to enqueue it), in which case the subscriber sees a clean stream end.
Sourcepub fn read_range(&self, start: u64, end: u64) -> Vec<RedexEvent>
pub fn read_range(&self, start: u64, end: u64) -> Vec<RedexEvent>
One-shot read of the half-open range [start, end) from the
in-memory index. Returns only entries currently retained;
silently skips any seqs that have been evicted.
state.index is sorted by seq by construction, so the
lookup uses slice::partition_point — O(log N) — to
locate both range bounds. Pre-fix perf #52 in
docs/performance/net-perf-analysis.md this iterated the
whole index linearly to find start, then walked the tail
to find end — O(N) per read regardless of range size.
Sourcepub fn read_one(&self, seq: u64) -> Option<RedexEvent>
pub fn read_one(&self, seq: u64) -> Option<RedexEvent>
Read a single event by sequence. Returns None when seq
has been evicted or was never appended. Convenience over
[Self::read_range(seq, seq + 1)]; saves the Vec
allocation when the caller only wants one event.
O(log N) via slice::partition_point — pre-fix this was
O(N), scanning the entire index even for trivially-absent
seqs (perf #52).
Sourcepub async fn resolve_one(&self, seq: u64) -> Result<Option<Bytes>, BlobError>
pub async fn resolve_one(&self, seq: u64) -> Result<Option<Bytes>, BlobError>
Read a single event and resolve its payload — inline events
return their bytes as-is, blob-ref events go through this
file’s configured super::super::dataforts::blob::BlobAdapter
(looked up by config.blob_adapter_id in the global
registry) and have their BLAKE3 hash verified.
Returns:
Ok(None)—seqis not retained / never appended.Ok(Some(bytes))— resolved payload.Err(BlobError::*)— payload was a malformed BlobRef, the configured adapter id is unset / missing in the registry, the adapter’s fetch failed, or the hash check failed.
Sourcepub fn sweep_retention(&self)
pub fn sweep_retention(&self)
Run the retention policy synchronously. Exposed so a background task (heartbeat loop) can drive it; no hot-path cost.
§Disk I/O under parking_lot Mutex
sweep_retention and append_batch call disk operations
(disk.compact_to, disk.append_entries_at) while
holding state.lock(), a non-yielding parking_lot Mutex.
All concurrent appenders, tail registrations,
read_range, len, is_empty, and close block on the
same lock for the duration of the I/O. This is a
latency / starvation concern, not a correctness one.
Throughput-critical deployments should drive
sweep_retention from a low-priority background task
scheduled outside the hot path.
A proper fix would snapshot the state needed for I/O, release the lock, perform I/O, then re-acquire to commit — that’s a substantial restructure (transactional staging area, conflict resolution against concurrent appends) and out of scope here. Documented as a known performance trade-off so future profilers don’t rediscover it as a “bug”.
Sourcepub fn skip_to(&self, target_seq: u64) -> Result<(), RedexError>
pub fn skip_to(&self, target_seq: u64) -> Result<(), RedexError>
Skip-ahead: drop every retained event with seq < target_seq
and advance next_seq to target_seq. The next append
allocates target_seq; the local sequence has a permanent
gap in [old_next_seq, target_seq) (those seqs were never
assigned on this file).
Used by the replication runtime’s apply_sync_response
failure path (ApplyError::GapBeforeChunk): when the
leader’s response carries a first_seq strictly above the
replica’s local tail (the leader trimmed past the replica’s
retained range), the replica skips ahead instead of
replaying the missing gap. Plan §8.
Idempotent on target_seq <= current_next_seq — returns
Ok(()) without mutation.
Persistent files: returns
RedexError::Channel("skip_to not supported on persistent files in v1"). The persistent-tier skip-ahead path needs a
disk-segment truncate plus idx/ts rebuild that doesn’t ship
in v1; replicas on persistent files fall back to NACK
BadRange and heartbeat-cycle recovery (the leader’s next
heartbeat reports the new tail, the replica re-issues a
request from its old tail, and accepts the gap-before-chunk
error as a recoverable retry-with-backoff signal).
Sourcepub fn close(&self) -> Result<(), RedexError>
pub fn close(&self) -> Result<(), RedexError>
Close the file. Outstanding tail streams receive RedexError::Closed.
For persistent files, fsyncs the disk segment before returning
and signals any background fsync task (Interval or EveryN) to
exit. close() always fsyncs regardless of the per-file
FsyncPolicy — this is the caller’s explicit durability
barrier.
Sourcepub fn sync(&self) -> Result<(), RedexError>
pub fn sync(&self) -> Result<(), RedexError>
Fsync the disk segment (no-op for heap-only files).