Expand description
B+tree node encode / decode.
See docs/format.md § B+tree for the on-disk layout. This module
is the reference implementation; the tests at the bottom validate
round-trip equality and rejection of every documented invariant
violation.
§Power-of-ten posture
-
Rule 2. Every loop iterates over
key_count, the slot directory length, or the page-byte range — all bounded byPAGE_SIZE. -
Rule 3. The in-memory
DecodedNodeusesVecs, notheapless::Vecs of the worst-case slot count: the worst-case slot count times the worst-case per-key buffer (~1 MiB) would overflow a 2 MiB stack. The B+tree’s traversal stack usesheapless::Vecto satisfy Rule 3 on the hot read path; the decoded representation of a single node is a transient per-page artifact whose heap allocation is unavoidable and is not on a real-time path.Phase 7A perf-audit note (extended). The initial Power-of-Ten audit flagged the three
Vecs inDecodedNode(children,leaves,internals) as a hot-path allocation cost. Inspection shows the actual cost is far smaller than it looks and aheapless-conversion is not worth shipping. Four reasons: (1)Vec::new()does not allocate — the firstreservedoes — so a leaf decode allocates exactly ONE outerVec(theleavesspine) and an internal decode allocates TWO (children+internals); the unused vecs stay at zero capacity. (2) Each leaf entry’skeyandvalueare themselvesVec<u8>; for an N-entry leaf that’s 2N inner allocations vs. the 1 outer, so converting the outer saves at best 1/(2N+1) on a real workload (~0.5 % on a 100-entry leaf). (3)LEAF_SLOT_CAP ≈ 1017andsizeof(LeafEntry) = 48— a stack-allocatedheapless::Vec<LeafEntry, LEAF_SLOT_CAP>adds ~48 KiB to everydecode_nodeframe, a stack-overflow hazard on small-stack targets (embedded, fiber runtimes) and a 48 KiB zero-write per call under debug + stack-protectors. The conversion makes the hot path worse in latency-sensitive deployments. (4)index_lookup(warm-cache point read) spends most of its time inpager.read_page(page-cache hash lookup), CRC32C verification of the page trailer, and the descend stack — not inVec::reserve. A 100k-leafcollection_scandoes ~100 000 outer allocs vs. ~50 000 000 inner allocs (~500 bytes/doc with 2 innerVecs per entry); the outer count is a noise floor.The conclusion: the outer
Vecs stay. A real perf win fordecode_noderequires attacking the inner-vec allocations (e.g. decode-in-place with&[u8]slices into the page buffer, borrowing the pager’s page-cache buffer for the iterator’s lifetime), which is a format-aware refactor of the node API and is out of scope for Phase 7 polish. -
Rule 5.
validate_nodeis the central invariant check; it isdebug_assert!-ed at every mutation site and surfaces asError::Corruptionon decode of a malformed page. -
Rule 7. No
unwrap/expecton any error-bearing path; length-prefixed varint reads returnError::Corruptionon underflow.
Structs§
- Decoded
Node - Decoded in-memory representation of a B+tree node.
- Internal
Entry - One internal-node pivot entry in the decoded in-memory
representation. The corresponding right child is stored in
DecodedNode::childrenat indexi + 1. - Leaf
Entry - One leaf slot in the decoded in-memory representation.
Enums§
- Node
Kind - Whether a node is an internal node or a leaf.
Constants§
- INTERNAL_
LEFTMOST_ CHILD_ BYTES - Bytes the leftmost-child page-id occupies in an internal node, immediately after the node header.
- INTERNAL_
SLOT_ BYTES - Bytes per slot in an internal slot directory. The entry is a u32 LE offset followed by a u64 LE right-child page-id.
- INTERNAL_
SLOT_ CAP - Maximum number of slots an internal node can hold at the worst case. Used to size the in-memory slot vector.
- LEAF_
SLOT_ BYTES - Bytes per slot in a leaf slot directory. The entry is a single u32 LE giving the byte offset of the corresponding heap entry.
- LEAF_
SLOT_ CAP - Maximum number of slots a leaf can hold at the worst-case empty key/value sizes. Used to size the in-memory slot vector.
- NODE_
HEADER_ SIZE - Fixed-size B+tree node header. See
docs/format.md§ Node header. - PAGE_
TYPE_ BTREE_ INTERNAL - Page-type tag for an internal B+tree node. See
docs/format.md§ Page types. - PAGE_
TYPE_ BTREE_ LEAF - Page-type tag for a leaf B+tree node.
Functions§
- decode_
node - Decode a B+tree page into a
DecodedNode. - decode_
node_ find - Find-only leaf decode: scan a leaf page for
targetand copy out ONLY the matched value as an ownedVec<u8>— never the whole 2N-entry node. - encode_
node - Encode
nodeintopage. Stamps the page trailer. - max_
inline_ value - Maximum value length that still fits inline in a leaf alongside
at least one slot. See
docs/format.md§ Key and value encoding. - max_
key_ len - Maximum key length the format permits. See
docs/format.md§ Key and value encoding. - peek_
node_ kind - Read only the node-kind tag from a B+tree page header.
- validate_
node - Validate
nodeagainst the format-version-0 invariants. - varint_
len - Number of bytes an unsigned LEB128 varint would use for
v.
Type Aliases§
- Child
Entry - Alias for “one of the right-child page-ids in an internal node”.
We store children as raw
u64s; the on-disk format permits0only as the leafnext_siblingsentinel, never as an internal child pointer.