pub struct Database<'a> { /* private fields */ }
Top-level handle to a buffered NSF file.
Holds a borrowed slice of the full file bytes. Cheap to construct - no copies are made. The parser walks the file lazily; consumers pay for what they enumerate.
Implementations§
impl<'a> Database<'a>
pub fn open(bytes: &'a [u8]) -> Result<Self, NsfError>
Open an NSF from a full-file byte buffer. Validates the file header and DBINFO; everything else is parsed lazily.
pub fn has_data_rrv(&self) -> bool
True when the database carries a populated data RRV bucket. A fresh / never-instantiated template will return false here - it has design notes via the non-data RRV but no data notes.
pub fn data_rrv_iter(&self) -> Result<Option<(RrvBucketHeader, RrvIter<'a>)>, NsfError>
Parse and iterate the data RRV bucket if present. Returns the bucket header for diagnostics plus an iterator over the non-empty RRV entries.
The data RRV bucket’s file position is reported in 256-byte
units in DBINFO; this method converts to a byte offset and
reads rrv_bucket_size bytes from that point.
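The unit conversion described above can be sketched as follows. This is a minimal illustration, not the crate's code: `rrv_bucket_range` and its parameter names are hypothetical, and the bounds check against the file length is an assumption about how an out-of-range position would be rejected.

```rust
use std::ops::Range;

/// Hypothetical helper: turn a DBINFO position given in 256-byte units
/// into the byte range covering `bucket_size` bytes from that point.
/// Returns `None` when the range would run past the end of the file.
fn rrv_bucket_range(position_units: u32, bucket_size: u32, file_len: usize) -> Option<Range<usize>> {
    // Widen to u64 first so the multiplication cannot overflow.
    let start = u64::from(position_units) * 256;
    let end = start + u64::from(bucket_size);
    if end <= file_len as u64 {
        Some((start as usize)..(end as usize))
    } else {
        None
    }
}
```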
pub fn data_note_count(&self) -> Result<u64, NsfError>
Convenience: count non-empty entries in the data RRV. Walks the bucket but does not retain the per-entry state.
pub fn has_non_data_rrv(&self) -> bool
True when the database carries a populated non-data RRV bucket.
Design notes (forms, views) and, in databases like fakenames.nsf,
the bulk of document notes are reached through the non-data RRV
rather than the data RRV.
pub fn non_data_rrv_iter(&self) -> Result<Option<(RrvBucketHeader, RrvIter<'a>)>, NsfError>
Parse and iterate the non-data RRV bucket if present. Mirrors
Self::data_rrv_iter but reads from
non_data_rrv_bucket_position. Most bucket-slot RRV entries (the
ones Self::resolve_bucket_slot resolves) live here.
pub fn data_rrv_take(&self, limit: usize) -> Result<Vec<RrvEntry>, NsfError>
Collect at most limit RRV entries from the data RRV for
preview / list rendering. Useful for “show the first 200 notes
in the viewer” without walking 40,000 entries up front.
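A bounded preview of this kind can be sketched with a standard iterator adaptor. The `Entry` type and `take_entries` function below are hypothetical stand-ins, and the assumption that the underlying iterator yields `Result` items is not confirmed by this page.

```rust
/// Hypothetical stand-in for an RRV entry; the real `RrvEntry` is not shown here.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    rrv_identifier: u32,
}

/// Collect at most `limit` entries from a fallible iterator.
/// `take` stops pulling from the source once `limit` items were seen,
/// and collecting into `Result` stops at the first error.
fn take_entries<I, E>(iter: I, limit: usize) -> Result<Vec<Entry>, E>
where
    I: Iterator<Item = Result<Entry, E>>,
{
    iter.take(limit).collect()
}
```

Because the adaptor is lazy, the cost is proportional to `limit` rather than to the number of entries in the bucket.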
pub fn information2(&self) -> Result<Information2, NsfError>
Parse the database information extension block 2 (file offset 520, 124 bytes). Carries the 4 superblock positions + 2 BDB positions plus bucket-size knobs.
pub fn superblocks(&self) -> Result<Vec<(usize, Superblock)>, NsfError>
Parse every populated superblock copy (skipping uninitialized
slots). Each entry is (slot_index, Superblock) so callers can
report which copy was loaded. Domino allocates 4 slots and rotates
commits across them; instantiated databases typically have 3
populated and 1 empty, with the freshest by modification_time
authoritative (use Self::freshest_superblock).
Forensic-tool-grade resilience: slots are skipped silently when any of these conditions holds, rather than crashing the load:
- Slot is empty (position or size zero).
- Slot’s declared byte offset extends past the file end.
- Slot’s body does not start with the superblock signature 0E 00. This catches fresh-template uninitialized regions that Domino allocates with allocation_granularity but never commits to (empirically these are filled with AA AA AA AA, e.g. SB3 of comparedbs.ntf).
Other parse failures (e.g. unexpected short read mid-header) are not expected in practice with a fully-buffered NSF and would surface as errors. The 3-redundant-copy WAL guarantees that silently dropping an unreadable slot leaves at least one valid copy.
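The three skip conditions listed above can be sketched as a single predicate. This is an illustration under assumptions: `slot_is_loadable` is a hypothetical name, and `position` and `size` are taken to be byte values already converted from whatever units the file stores.

```rust
/// Superblock signature bytes named in the text above.
const SUPERBLOCK_SIGNATURE: [u8; 2] = [0x0E, 0x00];

/// Hypothetical check: should the slot at `position` / `size` (assumed to
/// be in bytes) be parsed, or skipped silently?
fn slot_is_loadable(file: &[u8], position: u64, size: u64) -> bool {
    // Condition 1: empty slot.
    if position == 0 || size == 0 {
        return false;
    }
    // Condition 2: slot extends past the end of the file.
    let end = match position.checked_add(size) {
        Some(end) => end,
        None => return false,
    };
    if end > file.len() as u64 {
        return false;
    }
    // Condition 3: body does not start with the signature
    // (for example an uninitialized AA AA AA AA region).
    file[position as usize..].starts_with(&SUPERBLOCK_SIGNATURE)
}
```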
pub fn freshest_superblock(&self) -> Result<Option<(usize, Superblock)>, NsfError>
Convenience: parse all populated superblocks and return the
freshest one by modification_time. The remaining copies are
write-ahead-log redundancy and should be ignored once this one
is loaded. Returns None if no superblock slots are populated
(extremely rare; would indicate a partially-initialized NSF).
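Selecting the freshest copy amounts to a maximum over the parsed slots. The sketch below uses a reduced, hypothetical `Sb` struct; representing modification_time as a `u64` is an assumption made only so the values can be compared.

```rust
/// Hypothetical reduced superblock carrying only the selection key.
#[derive(Debug, Clone, PartialEq)]
struct Sb {
    modification_time: u64, // assumed comparable integer form
}

/// Return the (slot_index, superblock) pair with the largest
/// modification_time, or `None` when no slot was populated.
fn freshest(slots: Vec<(usize, Sb)>) -> Option<(usize, Sb)> {
    slots.into_iter().max_by_key(|slot| slot.1.modification_time)
}
```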
pub fn decompressed_superblock_body(&self) -> Result<Option<Vec<u8>>, NsfError>
Decompress the freshest superblock’s body (the CX-compressed region
that carries the bucket-descriptor array). Returns None when the
database has no superblock.
Body layout from the superblock byte offset, per the reference:
[0,100) header, then the compressed region of length
size - 112 (100-byte header + 12-byte footer removed), of which
the first 4 bytes are a prefix the decompressor skips. The
decompressed length is the header’s uncompressed_size field.
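The layout arithmetic above can be written out as a range computation. This is a sketch only: the function and constant names are hypothetical, `size` is assumed to be the superblock's total declared size in bytes, and the decompression step itself is not shown.

```rust
use std::ops::Range;

const HEADER_LEN: usize = 100; // [0,100) header
const FOOTER_LEN: usize = 12; // trailing footer
const PREFIX_LEN: usize = 4; // prefix the decompressor skips

/// Byte range of the compressed payload (with the 4-byte prefix already
/// skipped) for a superblock at `sb_offset` with declared `size`.
fn compressed_payload(sb_offset: usize, size: usize) -> Option<Range<usize>> {
    // Compressed region length is size - 112.
    let region_len = size.checked_sub(HEADER_LEN + FOOTER_LEN)?;
    if region_len < PREFIX_LEN {
        return None;
    }
    let region_start = sb_offset.checked_add(HEADER_LEN)?;
    let region_end = region_start.checked_add(region_len)?;
    Some(region_start + PREFIX_LEN..region_end)
}
```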
pub fn summary_bucket_offsets(&self) -> Result<Vec<u64>, NsfError>
Build the global summary-bucket descriptor map: a 0-based vector of
file byte offsets where offsets[bucket_index - 1] is the byte
offset of the summary bucket an RRV bucket-slot entry’s
bucket_index refers to (bucket_index is 1-based on disk).
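The 1-based to 0-based lookup can be sketched as below. `summary_bucket_offset` is a hypothetical helper; returning `None` for index 0 or an out-of-range index is an assumption about how a caller might handle invalid input.

```rust
/// Look up the file byte offset for an on-disk (1-based) bucket_index
/// in a 0-based offsets vector.
fn summary_bucket_offset(offsets: &[u64], bucket_index: u32) -> Option<u64> {
    // bucket_index 0 is not a valid 1-based index.
    let zero_based = (bucket_index as usize).checked_sub(1)?;
    offsets.get(zero_based).copied()
}
```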
§Multi-page geometry
On modern ODS the summary bucket descriptors are spread across
number_of_summary_bucket_descriptor_pages pages. The decompressed
superblock body begins with a page index of (pages - 1) stride-14
records (the page’s file_position is the first 4 bytes of each
record); those point to the out-of-body pages. The final (resident)
page’s descriptor array is inline in the body at
SUMMARY_RESIDENT_PREFIX + (pages - 1) * SUMMARY_DESCRIPTOR_BYTES.
Single-page databases (pages <= 1) have only the resident page at
the libnsfdb-documented offset 224.
libnsfdb itself only handles a single descriptor page (it errors on
> 1), so the multi-page geometry here was reverse-engineered and
validated against the rrv_identifier identity oracle (see
Self::enumerate_notes). The out-of-body page header size
([OUT_OF_BODY_PAGE_HEADER]) and per-page descriptor count
([PER_OUT_OF_BODY_PAGE]) are empirical constants; mis-fits surface
as identity-gate failures in Self::enumerate_notes rather than as
silently wrong records.
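Reading the page index at the start of the decompressed body can be sketched as follows. Only the stride of 14 and the 4-byte file_position at the start of each record come from the description above; the little-endian byte order and the function name are assumptions, and the units of the returned positions are left uninterpreted.

```rust
const PAGE_INDEX_STRIDE: usize = 14;

/// Extract the file_position field of each of the (pages - 1) page-index
/// records at the start of the decompressed superblock body.
/// Returns `None` if the body is too short to hold the index.
fn out_of_body_page_positions(body: &[u8], pages: usize) -> Option<Vec<u32>> {
    let count = pages.saturating_sub(1);
    let index = body.get(..count.checked_mul(PAGE_INDEX_STRIDE)?)?;
    Some(
        index
            .chunks_exact(PAGE_INDEX_STRIDE)
            // Assumption: the 4-byte field is little-endian.
            .map(|rec| u32::from_le_bytes([rec[0], rec[1], rec[2], rec[3]]))
            .collect(),
    )
}
```

A single-page database yields an empty vector here, since it has only the resident page.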
pub fn resolve_bucket_slot(&self, bucket_index: u32, slot_index: u16) -> Result<&'a [u8], NsfError>
Resolve a single RRV bucket-slot pair to the raw bytes of the slot’s record, using the summary-bucket descriptor map.
This is the physical resolution step: it does not identity-check the
result. For verified note enumeration (where each resolved record is
confirmed to carry the requested rrv_identifier), use
Self::enumerate_notes. Rebuilds the descriptor map on each call;
callers resolving many entries should prefer enumerate_notes, which
builds the map once.
pub fn bucket_descriptor_block(&self) -> Result<Option<BucketDescriptorBlock>, NsfError>
Parse the freshest Bucket Descriptor Block (BDB) - the master index
of every RRV bucket in the database. Returns None when no BDB slot
is populated (a fresh / never-instantiated shell). Of the two BDB
copies in Information2 (primary + write-ahead-log redundancy) the
one with the higher write_count is authoritative.
pub fn enumerate_notes(&self) -> Result<NoteEnumeration, NsfError>
Enumerate every note in the database by walking the BDB -> all RRV buckets -> each RRV entry, resolving each to a note record.
Every resolution is identity-gated: a note is only accepted if
the resolved record’s rrv_identifier (note header offset 6) equals
the RRV entry’s identifier. This is the chain-of-custody guarantee -
a record is never returned unless it provably is the note the RRV
entry points to. Entries that no candidate resolves under the gate
are counted in unresolved rather than returned as possibly-wrong
evidence.
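The identity gate itself is a small comparison. The sketch below assumes the identifier is a little-endian u32 at note header offset 6; the offset comes from the text above, while the byte order and the name `passes_identity_gate` are assumptions.

```rust
const RRV_IDENTIFIER_OFFSET: usize = 6;

/// Accept a resolved record only if the u32 at header offset 6 equals
/// the identifier carried by the RRV entry. Records too short to hold
/// the field are rejected.
fn passes_identity_gate(record: &[u8], expected: u32) -> bool {
    match record.get(RRV_IDENTIFIER_OFFSET..RRV_IDENTIFIER_OFFSET + 4) {
        // Assumption: the field is stored little-endian.
        Some(b) => u32::from_le_bytes([b[0], b[1], b[2], b[3]]) == expected,
        None => false,
    }
}
```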
§Group-marker recovery
A small set of summary-descriptor slots (the page’s group-boundary
slots) carry group-marker flag bits inside the file_position field:
the low nibble, or bits 16-19 (in which case the true high nibble
matches the locally-sequential neighbours). For each bucket-slot
entry the resolver tries the raw descriptor first, then these
marker-corrected candidates, accepting the first that passes the
identity gate. Because acceptance requires an exact 32-bit
rrv_identifier match, a wrong candidate cannot be accepted - the
recovery is heuristic in what it tries but never in what it
returns.
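The "try candidates in order, accept the first that passes the gate" step might look like the sketch below. How the marker-corrected candidates are derived (in particular the bits 16-19 case, which depends on neighbouring descriptors) is not modeled; the candidate list is simply an input. `first_verified` and the `fetch` callback are hypothetical, and the little-endian identifier read is an assumption.

```rust
/// Read the u32 identifier at note header offset 6 (assumed little-endian).
fn rrv_identifier(record: &[u8]) -> Option<u32> {
    let b = record.get(6..10)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Try each candidate position in order (raw descriptor first) and return
/// the first record whose identifier matches `expected`. `fetch` is a
/// hypothetical lookup from a candidate position to the bytes stored there.
fn first_verified<'a, F>(candidates: &[u32], expected: u32, fetch: F) -> Option<(u32, &'a [u8])>
where
    F: Fn(u32) -> Option<&'a [u8]>,
{
    candidates.iter().copied().find_map(|pos| {
        let record = fetch(pos)?;
        if rrv_identifier(record) == Some(expected) {
            Some((pos, record))
        } else {
            None
        }
    })
}
```

A caller would pass something like `[raw, raw & !0xF]` for the low-nibble case; an entry for which this returns `None` would be counted as unresolved.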
pub fn non_summary_data(&self, note: &ResolvedNote) -> Option<&'a [u8]>
Return a note’s non-summary data object - the separately-stored
large payload that holds rich-text ($Body / mail bodies), file
attachments (OBJECT items), and other items too big for the inline
summary. None when the note has no non-summary data.
Location: non_summary_data_identifier << 8 is the byte offset of
the object, which opens with a header - signature 0x0010, then a
u32 size and the owning note’s u32 rrv_identifier (both validated
here) - followed by the payload (a CD-record stream for rich text, or
object segments for attachments). The returned slice is the whole
object including that header; record-level decoding (CD records,
attachment extraction) is not performed by this method.
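The location and header checks can be sketched as below. Several details are assumptions rather than facts from this page: that the header fields are packed at offsets 0, 2 and 6, that they are little-endian, and that the size field counts the whole object including its header. The function name is hypothetical.

```rust
const OBJECT_SIGNATURE: u16 = 0x0010;
const OBJECT_HEADER_LEN: usize = 10; // assumed: u16 + u32 + u32, packed

/// Locate a non-summary data object and validate its header against the
/// owning note's rrv_identifier. Returns the whole object, header included.
fn non_summary_object(file: &[u8], identifier: u32, owner_rrv: u32) -> Option<&[u8]> {
    // identifier << 8 is the byte offset of the object.
    let start = (identifier as usize).checked_mul(256)?;
    let header = file.get(start..start.checked_add(OBJECT_HEADER_LEN)?)?;
    let signature = u16::from_le_bytes([header[0], header[1]]);
    let size = u32::from_le_bytes([header[2], header[3], header[4], header[5]]) as usize;
    let owner = u32::from_le_bytes([header[6], header[7], header[8], header[9]]);
    if signature != OBJECT_SIGNATURE || owner != owner_rrv || size < OBJECT_HEADER_LEN {
        return None;
    }
    // Assumption: `size` covers the header plus the payload.
    file.get(start..start.checked_add(size)?)
}
```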
pub fn note_content(&self, note: &ResolvedNote) -> Option<NoteContent>
Decode a note’s rich-text body and attachments from its non-summary
data (CD-record stream). Returns None when the note has no
non-summary data or it decodes to nothing. See crate::cd.
pub fn note_items(&self, note: &ResolvedNote) -> Vec<NoteItem<'a>>
Parse the items (fields) of a resolved note: each item’s name id,
type/flags, and raw value bytes. See crate::item for the layout
and what is / isn’t decoded (field-name resolution is a later slice).
The record window is bounded to the note’s declared size so item
values cannot read into a neighbouring record.
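Bounding the record window can be sketched as a clamped sub-slice. `record_window` is a hypothetical helper, and clamping to the available bytes (rather than returning an error) when the declared size is too large is an assumption.

```rust
/// Restrict parsing to the note's declared size so that item values
/// cannot extend into the bytes of a neighbouring record.
fn record_window(from_note_start: &[u8], declared_size: usize) -> &[u8] {
    let end = declared_size.min(from_note_start.len());
    &from_note_start[..end]
}
```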