Skip to main content

Module node

Module node 

Source
Expand description

B+tree node encode / decode.

See docs/format.md § B+tree for the on-disk layout. This module is the reference implementation; the tests at the bottom validate round-trip equality and rejection of every documented invariant violation.

§Power-of-ten posture

  • Rule 2. Every loop iterates over key_count, the slot directory length, or the page-byte range — all bounded by PAGE_SIZE.

  • Rule 3. The in-memory DecodedNode uses Vecs, not heapless::Vecs of the worst-case slot count: the worst-case slot count times the worst-case per-key buffer (~1 MiB) would overflow a 2 MiB stack. The B+tree’s traversal stack uses heapless::Vec to satisfy Rule 3 on the hot read path; the decoded representation of a single node is a transient per-page artifact whose heap allocation is unavoidable and is not on a real-time path.

    Phase 7A perf-audit note (extended). The initial Power-of-Ten audit flagged the three Vecs in DecodedNode (children, leaves, internals) as a hot-path allocation cost. Inspection shows the actual cost is far smaller than it looks and a heapless-conversion is not worth shipping. Four reasons: (1) Vec::new() does not allocate — the first reserve does — so a leaf decode allocates exactly ONE outer Vec (the leaves spine) and an internal decode allocates TWO (children + internals); the unused vecs stay at zero capacity. (2) Each leaf entry’s key and value are themselves Vec<u8>; for an N-entry leaf that’s 2N inner allocations vs. the 1 outer, so converting the outer saves at best 1/(2N+1) on a real workload (~0.5 % on a 100-entry leaf). (3) LEAF_SLOT_CAP ≈ 1017 and sizeof(LeafEntry) = 48 — a stack-allocated heapless::Vec<LeafEntry, LEAF_SLOT_CAP> adds ~48 KiB to every decode_node frame, a stack-overflow hazard on small-stack targets (embedded, fiber runtimes) and a 48 KiB zero-write per call under debug + stack-protectors. The conversion makes the hot path worse in latency-sensitive deployments. (4) index_lookup (warm-cache point read) spends most of its time in pager.read_page (page-cache hash lookup), CRC32C verification of the page trailer, and the descend stack — not in Vec::reserve. A 100k-leaf collection_scan does ~100 000 outer allocs vs. ~50 000 000 inner allocs (~500 bytes/doc with 2 inner Vecs per entry); the outer count is a noise floor.

    The conclusion: the outer Vecs stay. A real perf win for decode_node requires attacking the inner-vec allocations (e.g. decode-in-place with &[u8] slices into the page buffer, borrowing the pager’s page-cache buffer for the iterator’s lifetime), which is a format-aware refactor of the node API and is out of scope for Phase 7 polish.

  • Rule 5. validate_node is the central invariant check; it is debug_assert!-ed at every mutation site and surfaces as Error::Corruption on decode of a malformed page.

  • Rule 7. No unwrap / expect on any error-bearing path; length-prefixed varint reads return Error::Corruption on underflow.

Structs§

DecodedNode
Decoded in-memory representation of a B+tree node.
InternalEntry
One internal-node pivot entry in the decoded in-memory representation. The corresponding right child is stored in DecodedNode::children at index i + 1.
LeafEntry
One leaf slot in the decoded in-memory representation.

Enums§

NodeKind
Whether a node is an internal node or a leaf.

Constants§

INTERNAL_LEFTMOST_CHILD_BYTES
Bytes the leftmost-child page-id occupies in an internal node, immediately after the node header.
INTERNAL_SLOT_BYTES
Bytes per slot in an internal slot directory. The entry is a u32 LE offset followed by a u64 LE right-child page-id.
INTERNAL_SLOT_CAP
Maximum number of slots an internal node can hold at the worst case. Used to size the in-memory slot vector.
LEAF_SLOT_BYTES
Bytes per slot in a leaf slot directory. The entry is a single u32 LE giving the byte offset of the corresponding heap entry.
LEAF_SLOT_CAP
Maximum number of slots a leaf can hold at the worst-case empty key/value sizes. Used to size the in-memory slot vector.
NODE_HEADER_SIZE
Fixed-size B+tree node header. See docs/format.md § Node header.
PAGE_TYPE_BTREE_INTERNAL
Page-type tag for an internal B+tree node. See docs/format.md § Page types.
PAGE_TYPE_BTREE_LEAF
Page-type tag for a leaf B+tree node.

Functions§

decode_node
Decode a B+tree page into a DecodedNode.
decode_node_find
Find-only leaf decode: scan a leaf page for target and copy out ONLY the matched value as an owned Vec<u8> — never the whole 2N-entry node.
encode_node
Encode node into page. Stamps the page trailer.
max_inline_value
Maximum value length that still fits inline in a leaf alongside at least one slot. See docs/format.md § Key and value encoding.
max_key_len
Maximum key length the format permits. See docs/format.md § Key and value encoding.
peek_node_kind
Read only the node-kind tag from a B+tree page header.
validate_node
Validate node against the format-version-0 invariants.
varint_len
Number of bytes an unsigned LEB128 varint would use for v.

Type Aliases§

ChildEntry
Alias for “one of the right-child page-ids in an internal node”. We store children as raw u64s; the on-disk format permits 0 only as the leaf next_sibling sentinel, never as an internal child pointer.