pub struct RecordLog { /* private fields */ }
Append-only framed record log.
Failure model
Mutating operations (append, append_record, fsync, rotate) may
fail in the middle of an I/O operation, after the kernel has accepted
part of a frame but before the whole frame is on disk (ENOSPC,
a broken disk, a torn write). When that happens the log handle enters
a poisoned state:
- Every subsequent mutating call returns a deterministic error whose message starts with "datawal: writer poisoned:" and ends with "; drop handle and reopen". The error is intentionally a plain anyhow::Error in 0.1.x; promotion to a typed error variant is tracked for a future minor release.
- Read-only operations (scan, scan_iter, recovery_report, active_segment, dir) remain available so the caller can inspect state before dropping the handle.
- The caller must drop the handle and re-open the directory with RecordLog::open. Reopen uses the standard longest-valid-prefix recovery (see invariant 2 in AGENTS.md) and will discard any partial tail bytes left behind by the failed write.
The crate intentionally does not try to truncate the partial tail or
resync active_size on the live handle. Both are forms of mutating
state after a write failure, which expands rather than contains the
blast radius.
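The poisoning behaviour described above can be sketched with the standard library alone. This is an illustration of the pattern, not the crate's implementation: PoisoningWriter and FailAfter are hypothetical names, std::io::Error stands in for anyhow::Error, and the middle of the error message ("earlier write failed") is invented for the example. Only the prefix and suffix come from the text above.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper that illustrates the poisoning pattern.
pub struct PoisoningWriter<W: Write> {
    inner: W,
    poisoned: bool,
}

impl<W: Write> PoisoningWriter<W> {
    pub fn new(inner: W) -> Self {
        PoisoningWriter { inner, poisoned: false }
    }

    /// Read-only query: stays available after a failure.
    pub fn is_poisoned(&self) -> bool {
        self.poisoned
    }

    /// Mutating operation: refuses to run once an earlier write has failed.
    pub fn append(&mut self, frame: &[u8]) -> io::Result<()> {
        if self.poisoned {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                "datawal: writer poisoned: earlier write failed; drop handle and reopen",
            ));
        }
        if let Err(e) = self.inner.write_all(frame) {
            // Part of the frame may already be written. No repair is
            // attempted on the live handle; the flag is set and the
            // original error is returned.
            self.poisoned = true;
            return Err(e);
        }
        Ok(())
    }
}

/// Test double: accepts `budget` bytes in total, then fails every write.
pub struct FailAfter {
    pub budget: usize,
    pub written: Vec<u8>,
}

impl Write for FailAfter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if self.budget == 0 {
            return Err(io::Error::new(io::ErrorKind::Other, "simulated ENOSPC"));
        }
        let n = buf.len().min(self.budget);
        self.written.extend_from_slice(&buf[..n]);
        self.budget -= n;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

With a FailAfter budget of 3 bytes, appending a 5-byte frame leaves a 3-byte partial tail and poisons the writer; every later append then fails with the deterministic message without touching the underlying file.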
Implementations
impl RecordLog
pub fn open(dir: &Path) -> Result<Self>
Open (or create) a record log rooted at dir.
Steps:
- mkdir -p dir.
- Acquire an exclusive OS-level advisory lock on <dir>/.lock (held by a file descriptor; released automatically when this RecordLog is dropped or when the holding process exits).
- Discover segments; if none, create segment id 1.
- Pick the highest id as the active segment.
- Scan all segments to discover next_txid and store the recovery report.
- Open the active segment for append.
Fails fast if another RecordLog is already open on the same
directory (the kernel-level lock acquisition does not block).
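The directory and segment-discovery steps can be sketched as below. This is a minimal illustration under stated assumptions: the function name discover_active_segment and the "<id>.seg" file naming are hypothetical (the documentation above does not specify the on-disk naming), and the advisory lock, the recovery scan and the append handle are left out.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the id of the active segment in `dir`. Creates the directory,
/// and an empty segment with id 1, when nothing exists yet.
pub fn discover_active_segment(dir: &Path) -> io::Result<u32> {
    // Equivalent of `mkdir -p dir`.
    fs::create_dir_all(dir)?;

    let mut highest: Option<u32> = None;
    for entry in fs::read_dir(dir)? {
        let file_name = entry?.file_name();
        let name = match file_name.to_str() {
            Some(n) => n,
            None => continue, // skip names that are not valid UTF-8
        };
        // Hypothetical naming scheme: "<id>.seg", for example "7.seg".
        let id = match name.strip_suffix(".seg").and_then(|s| s.parse::<u32>().ok()) {
            Some(id) => id,
            None => continue, // not a segment file (for example ".lock")
        };
        highest = Some(match highest {
            Some(h) => h.max(id),
            None => id,
        });
    }

    match highest {
        // The highest id found is the active segment.
        Some(id) => Ok(id),
        // No segments yet: create segment id 1.
        None => {
            fs::File::create(dir.join("1.seg"))?;
            Ok(1)
        }
    }
}
```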
pub fn is_poisoned(&self) -> bool
Returns true if the writer is poisoned by a prior I/O failure.
A poisoned log refuses all further mutating operations. Read-only operations remain available so the caller can inspect state before dropping the handle. See the type-level “Failure model” docs.
pub fn active_segment(&self) -> u32
Active segment id.
pub fn recovery_report(&self) -> Result<RecoveryReport>
Last recovery report computed by open() or scan().
pub fn append(&mut self, payload: &[u8]) -> Result<RecordRef>
Append an opaque payload as a Raw record.
Durability boundary. This call writes a framed, CRC-protected
record to the active segment’s file via write_all. It does not
fsync the file or the directory. The record is recoverable (a
subsequent scan() will return it) as long as the OS does not lose
the buffered write, but it is not yet durable across a power
failure or hard crash of the host until fsync() returns
successfully.
Pattern for “this must survive a crash”:
    log.append(payload)?;
    log.fsync()?;
pub fn append_record(&mut self, record_type: RecordType, key: &[u8], payload: &[u8]) -> Result<RecordRef>
Append a typed record with a key and a payload.
Used by crate::DataWal for Put / Delete. Length limits are
validated by the encoder before allocation.
Same durability semantics as RecordLog::append: framed and
recoverable on return, but only durable after a successful
RecordLog::fsync.
pub fn scan(&mut self) -> Result<Vec<Record>>
Scan every segment in order and return every valid record.
Materialises every record into a Vec<Record>. For logs with many
records or large payloads, prefer RecordLog::scan_iter which
yields one record at a time without materialising the whole log.
Also refreshes recovery_report() and the internal next_txid.
pub fn scan_iter(&self) -> Result<RecordIter<'_>>
Returns an iterator over records.
This is lazy at the record level: callers can pull one record at a
time without materialising the whole log into a Vec<Record>. It
is not a chunked or zero-copy scanner — v0.1 loads one segment
at a time into memory before yielding records from it. Peak memory
is therefore bounded by the size of the largest segment, not by
the total log size.
Recovery semantics match RecordLog::scan:
- A truncated or CRC-bad tail on the last segment is tolerated and ends iteration cleanly. The amount of trailing garbage discarded is reflected in RecordIter::recovery_report.
- Any structural decode error, or any CRC/truncation problem in a sealed (non-last) segment, is yielded as an Err item; iteration ends after that error, and the underlying error is the same anyhow error that RecordLog::scan would have returned.
This method takes &self. It does not refresh the log’s own
recovery_report() or next_txid — only RecordLog::scan does
that.
Aborting iteration early (by dropping the iterator before exhaustion) is supported and has no on-disk side effects.
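The tolerated-tail rule for the last segment amounts to keeping the longest valid prefix of frames. The sketch below shows that idea on an in-memory buffer. Everything in it is an assumption made for illustration: the frame layout (4-byte little-endian length, 4-byte little-endian checksum, payload), the FNV-1a checksum standing in for the real CRC, and the function names. It does not model sealed segments, where the same problems would be reported as errors instead.

```rust
/// FNV-1a, used here only as a stand-in for the log's real CRC.
fn checksum(data: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in data {
        h ^= u32::from(b);
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

/// Reads a little-endian u32 at `at`; the caller guarantees 4 bytes exist.
fn read_u32_le(buf: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([buf[at], buf[at + 1], buf[at + 2], buf[at + 3]])
}

/// Encodes one hypothetical frame: length, checksum, payload.
pub fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&checksum(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns the payloads in the longest valid prefix of `segment`, plus the
/// number of trailing bytes that were discarded.
pub fn scan_last_segment(segment: &[u8]) -> (Vec<Vec<u8>>, usize) {
    let mut records = Vec::new();
    let mut pos = 0;
    // Stop as soon as a complete 8-byte header no longer fits.
    while segment.len() - pos >= 8 {
        let len = read_u32_le(segment, pos) as usize;
        let expected = read_u32_le(segment, pos + 4);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(e) if e <= segment.len() => e,
            _ => break, // truncated payload: end of the valid prefix
        };
        if checksum(&segment[start..end]) != expected {
            break; // checksum mismatch: end of the valid prefix
        }
        records.push(segment[start..end].to_vec());
        pos = end;
    }
    (records, segment.len() - pos)
}
```

Feeding it two complete frames followed by the first five bytes of a third returns the two payloads and reports five discarded bytes, which mirrors what the recovery report is described as recording.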
pub fn fsync(&mut self) -> Result<()>
Force durability of all records appended so far.
On successful return, every record passed to append /
append_record since this RecordLog was opened (or since the last
fsync returned) is durable: it will survive a process crash,
kernel panic or power loss on the underlying disk, modulo the
usual filesystem caveats (working fsync syscall, no lying disk
cache).
Internally this calls File::sync_all on the active segment and
fsync on the containing directory, so that segment creations and
rotations are also durable.
fsync may be called as often as desired; on a log with no new
appends since the last fsync it is effectively a no-op at the
kernel level, but it is always safe.
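The file-plus-directory sync described above can be sketched with std::fs. This is not the crate's code: append_durably is a hypothetical helper, and opening a directory with File::open in order to sync it is assumed to work only on Unix-like systems.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Appends `bytes` to `dir/file_name` and returns only after both the file
/// and its containing directory have been synced.
pub fn append_durably(dir: &Path, file_name: &str, bytes: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .open(dir.join(file_name))?;

    // After write_all the bytes sit in OS buffers: readable, not yet durable.
    file.write_all(bytes)?;

    // Flush file contents and metadata to the storage device.
    file.sync_all()?;

    // Sync the directory so that a newly created file's directory entry is
    // durable as well (Unix-like systems; assumption of this sketch).
    File::open(dir)?.sync_all()?;
    Ok(())
}
```

Syncing after every append is the simplest way to make the durability point explicit; a caller that batches several appends before one sync trades a wider window of possible loss for fewer sync calls.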