Struct Tree

Source

pub struct Tree {
    pub key_comparator: Option<KeyComparatorFn>,
    pub memory_counter: Option<Arc<AtomicI64>>,
    pub in_list_listener: Option<Arc<dyn InListListener>>,
    pub log_manager: Option<Arc<LogManager>>,
    pub key_prefixing: bool,
    pub compact_max_key_length: i32,
    /* private fields */
}

Expand description

The B+tree.

This is the main tree structure that manages the B+tree nodes and provides operations for search, insert, delete, and tree maintenance.

Fields§

§key_comparator: Option<KeyComparatorFn>

Optional custom key comparator for sorted-duplicate databases.

When Some, all key comparisons in tree traversal (upper IN routing and BIN entry search/insert/delete) use this comparator instead of lexicographic byte comparison.

/ dupComparator stored on the database and consulted at every IN.findEntry() call.

§memory_counter: Option<Arc<AtomicI64>>

Shared memory counter for the evictor / MemoryBudget.

Updated on every BIN entry insert (+key+data+overhead) and delete (-key+overhead) so the evictor sees real cache pressure.

env.getMemoryBudget().updateTreeMemoryUsage(delta) call in the equivalent IN.updateMemorySize(). In Noxu the counter is an Arc<AtomicI64> shared with the Arbiter (and later MemoryBudget) to avoid a circular crate dependency (noxu-tree → noxu-dbi).

§in_list_listener: Option<Arc<dyn InListListener>>

Optional listener fed on node add/access/remove, mirroring JE’s INList feeding the evictor’s LRULists.

When None (the default — used by unit tests with no environment), the notifications are no-ops. EnvironmentImpl installs the Evictor here so production inserts/accesses populate the LRU lists the evictor drains.

JE reference: IN.fetchTarget/split/rebuildINList → addBack, access → moveBack, removal → remove.

§log_manager: Option<Arc<LogManager>>

Optional log manager so an evicted root IN can be re-materialized from its persisted root_log_lsn on the next access (EV-14, piece B).

JE’s Tree reaches the log via database.getEnv().getLogManager(); Tree.getRootINRootAlreadyLatched calls root.fetchTarget(...) which reads the root IN back from its ChildReference LSN when the in-memory target is null (Tree.java:477-516, ChildReference.fetchTarget). Noxu has no env back-reference here, so the log manager is installed directly (the same one-way wiring as in_list_listener). When None (unit tests with no environment), an evicted root cannot be re-fetched — but evict_root refuses to evict without a log manager, so the root is never made non-resident in that configuration.

§key_prefixing: bool

Whether key-prefix compression is enabled for this tree’s BINs.

JE DatabaseImpl.getKeyPrefixing() / DatabaseConfig.setKeyPrefixing(). When false, IN.computeKeyPrefix returns null in JE — no prefix is ever set. Noxu mirrors this: insert_with_prefix is skipped in favour of insert_raw, and recompute_key_prefix is not called on BIN halves after a split.

Default: false (matches JE’s DatabaseConfig.KEY_PREFIXING_DEFAULT).

Ref: IN.java computeKeyPrefix ~line 2456.

§compact_max_key_length: i32

T-5: maximum post-prefix key length (bytes) for the compact key rep (INKeyRep.MaxKeySize). A node packs all its keys into one fixed-width byte array when every post-prefix key is <= this length; a longer key inflates the node to the Default rep. <= 0 disables the compact rep entirely.

Default 16 (TREE_COMPACT_MAX_KEY_LENGTH / INKeyRep.MaxKeySize.DEFAULT_MAX_KEY_LENGTH). Wired from EnvironmentConfig via Tree::set_compact_max_key_length (IN.getCompactMaxKeyLength, IN.java:4929).

Implementations§

Source §

impl Tree

Source

pub fn new(database_id: u64, max_entries_per_node: usize) -> Self

Creates a new empty tree.

Constructor.

Source

pub fn set_memory_counter(&mut self, counter: Arc<AtomicI64>)

Installs a shared memory counter for evictor / MemoryBudget feedback.

→ env.getMemoryBudget().updateTreeMemoryUsage(delta) . The counter is updated on every BIN entry insert/delete.

Source

pub fn set_in_list_listener(&mut self, listener: Arc<dyn InListListener>)

Installs the InListListener (the evictor) so node add/access/remove feed the LRU lists. JE: INList registration that feeds Evictor.addBack/moveBack/remove.

Source

pub fn set_log_manager(&mut self, lm: Arc<LogManager>)

Installs the noxu_log::LogManager so an evicted root IN can be re-materialized from its persisted LSN on the next access (EV-14).

JE: the tree reaches the log through database.getEnv().getLogManager() for ChildReference.fetchTarget. Noxu installs it directly.

Source

pub fn clear_log_manager(&mut self)

Drops this tree’s Arc<LogManager> reference (EV-14 teardown).

The env’s Drop calls this on every tree it owns so the Tree -> Arc<LogManager> -> Arc<FileManager> chain cannot keep the FileManager (and its on-disk exclusive lock) alive past environment close. After this the tree can no longer re-fetch an evicted root from the log — which is correct, because the environment is shutting down and the tree is about to be dropped.

Source

pub fn set_compact_max_key_length(&mut self, len: i32)

T-5: set the compact-key threshold (TREE_COMPACT_MAX_KEY_LENGTH / IN.getCompactMaxKeyLength). New BINs created by this tree inherit it; <= 0 disables the compact key rep. Default 16.

Source

pub fn new_with_comparator( database_id: u64, max_entries_per_node: usize, comparator: KeyComparatorFn, ) -> Self

Creates a new empty tree with a custom key comparator.

Used for sorted-duplicate databases where keys are two-part composite keys that require a custom ordering function.

Constructor with btreeComparator parameter.

Source

pub fn set_key_prefixing(&mut self, enabled: bool)

Sets the key-prefixing flag.

When true, BIN key-prefix compression is enabled: shared leading bytes are factored out of each slot’s key. When false (the default), keys are stored verbatim — matching JE DatabaseConfig.setKeyPrefixing(false) / IN.computeKeyPrefix returning null.

Ref: IN.java computeKeyPrefix ~line 2456.

Source

pub fn set_comparator(&mut self, comparator: KeyComparatorFn)

Sets the key comparator, replacing any existing one.

Source

pub fn hint_redo_capacity(&mut self, capacity: usize)

Store a capacity hint used by redo_insert when it creates the first BIN for this tree (the first-key path).

The first BIN’s entries Vec is pre-allocated with capacity.min(max_entries_per_node) slots, eliminating the Vec-resize doubling cycle (1 → 2 → 4 → … → cap) that would otherwise occur during the redo loop.

Call once before the redo loop. Has no effect on insert (the normal, non-recovery path).

Wave 11-K optimisation (Fix 3).

Source

pub fn get_redo_capacity_hint(&self) -> usize

Returns the current redo capacity hint (0 = no hint set).

Source

pub fn take_comparator(&mut self) -> Option<KeyComparatorFn>

Takes the key comparator out of this tree (leaving None).

Source

pub fn get_comparator(&self) -> Option<&KeyComparatorFn>

Returns a reference to the key comparator, if configured.

Used by CursorImpl::find_bin_for_key (R4 fix) so the cursor’s own IN-level descent uses the same comparator-aware floor slot as the tree’s own search paths. Mirrors JE DatabaseImpl.getKeyComparator().

Source

pub fn is_empty(&self) -> bool

Returns true if the tree has no root (is empty).

Source

pub fn set_root(&self, node: TreeNode)

Sets the root of the tree.

Must hold root_latch exclusively before calling.

Source

pub fn get_root(&self) -> Option<Arc<RwLock<TreeNode>>>

Returns the root Arc, if any.

Returns a cloned Arc rather than a reference so the caller does not hold the inner RwLock guard.

EV-14: when the in-memory root has been evicted (evict_root) but a persisted version exists (root_log_lsn set), this re-materializes it from the log before returning — the faithful equivalent of JE Tree.getRootIN always calling root.fetchTarget(...). Returns None only for a genuinely empty tree (no resident root and no persisted root LSN).

Source

pub fn get_database_id(&self) -> u64

Returns the database ID.

Source

pub fn count_entries(&self) -> u64

Count the total number of live (non-deleted) entries across all BINs.

Used by DatabaseImpl::set_recovered_tree() to initialise the per-database entry_count AtomicU64 after recovery replays the log.

Source

pub fn resort_under_comparator(&self)

DBI-14: rebuild this tree so that its on-disk byte-ordered slot layout is re-sorted under the currently-configured key comparator.

Recovery redo (redo_insert) has no access to the application’s comparator function — only the persisted identity — so it lays keys out in unsigned-byte order. After set_recovered_tree attaches the real comparator, the slots must be re-sorted, or comparator-driven searches would binary-search a tree ordered by the wrong relation.

No-op when no comparator is configured (byte order already matches the recovered layout) or when the tree is empty. Mirrors the effect of JE reconstructing the comparator at open and the tree always having been built under it.

Source

pub fn total_budgeted_memory(&self) -> u64

Sum the real in-memory heap footprint of every resident node in the tree (DBI-23 oracle / reconciliation), in bytes.

Walks all resident IN/BIN nodes and adds each node’s budgeted_memory_size (JE IN.getBudgetedMemorySize). This is the authoritative “real heap” figure the incrementally-maintained memory_counter is meant to approximate; an engine can call it to reconcile counter drift, and the DBI-23 test uses it as the oracle the live counter must stay within tolerance of.

Source

pub fn search(&self, key: &[u8]) -> Option<SearchResult>

Search for a BIN that should contain the given key.

This is the core tree traversal operation. It walks from root to BIN using latch-coupling (acquire child latch, then release parent latch).

. Descends the tree until a BIN is reached, following the child pointer at the slot whose key is the largest key <= the search key (the “LTE” rule). Slot 0 in every upper IN carries a virtual key (-infinity) so any search key routes through it when all real keys are larger.

Returns a SearchResult indicating where the key is or should be. Returns None if tree is empty.

Source

pub fn search_with_data(&self, key: &[u8]) -> Option<SlotFetch>

Combined search-and-fetch: descend once to the BIN and return the slot’s data together with a reference to the BIN arc.

Replaces the previous three-descent sequence on the Database::get hot path:

Tree::search — existence check only.
CursorImpl::get_data_from_tree — re-descended to fetch data.
CursorImpl::find_bin_for_key — re-descended for BIN pinning.

One descent now does all three jobs. At the BIN level it uses the existing binary-search helper find_entry_compressed instead of the O(n) iter().find() used by get_data_from_tree.

Returns None only when the tree is empty. Otherwise returns Some(SlotFetch) — callers must inspect SlotFetch::found to determine whether the key was present. The BIN read-guard is released before this method returns so callers may safely call lock_ln (which may block) without holding any tree latch.

Wave-11-I — see the 2026 review.

Source

pub fn update_key_expiration(&self, key: &[u8], expiration_hours: u32) -> bool

Sets the expiration time (in absolute hours since Unix epoch) for an existing key’s BIN slot.

Returns true if the key was found and updated, false otherwise.

Used by Database::put_with_options() to apply per-record TTL. IN.entryExpiration / BIN.expirationInHours path.

Source

pub fn first_entry_at_or_after( &self, key: &[u8], ) -> Option<(Vec<u8>, Vec<u8>, u64)>

Returns the key and data of the first BIN entry at or after key.

Descends with the tree’s key comparator (same path as search()), then within the BIN finds the first slot whose stored key >= key using the comparator. Returns None if every entry in the tree is < key.

Used by sorted-duplicate cursor search(Set) to position at the first (key, data) pair whose two-part key >= lower_bound(primary_key).

→ BIN scan path.

Source

pub fn first_entry_at_or_after_with_index( &self, key: &[u8], ) -> Option<(Vec<u8>, Vec<u8>, usize, u64, Arc<NodeRwLock<TreeNode>>)>

Like Tree::first_entry_at_or_after but also returns the BIN node (so callers may pin it) and the entry’s slot index inside that BIN.

Wave 11-N (Bug 2): CursorImpl::search_dup previously stored current_index = 0 after a sorted-dup Search, which broke the fast-path of retrieve_next (and the slow path’s next_index = current_index + 1 arithmetic) for any primary that was not the first slot of its BIN. This helper hands back the real index so the cursor can be positioned correctly.

CC-2 fix: uses the same read_arc() hand-over-hand latch coupling as every other descent method (search, first_entry_at_or_after, get_first_node, get_adjacent_bin_attempt). The original implementation did arc.read().is_bin() (lock acquired and released) then a SECOND arc.read() on the next line — a gap in which a concurrent split can promote the node (BIN→upper IN) or move the sought key to a new sibling, yielding a false “not found” for an existing key. Mirrors JE Tree.searchSubTree / Tree.search which hold the latch across the is_bin() test and the subsequent entry lookup.

Source

pub fn insert( &self, key: Vec<u8>, data: Vec<u8>, lsn: Lsn, ) -> Result<bool, TreeError>

Insert a key/data pair into the tree.

. Handles the root-is-null case by creating a two-level tree (upper IN + BIN) per initialisation path, then delegates to insert_recursive which performs preemptive splitting as it descends.

Returns Ok(true) if this was a new insert, Ok(false) if it was an update.

Source

pub fn redo_insert( &self, key: &[u8], data: &[u8], lsn: Lsn, ) -> Result<bool, TreeError>

Recovery-redo variant of Tree::insert that accepts &[u8] slices.

Eliminates the two intermediate Vec<u8> allocations that the normal insert path requires at the redo_ln call site (one for the key, one for the data). The compressed key suffix and the data bytes are each materialised into their BinEntry slots exactly once.

Semantics are identical to insert:

Updates the existing slot when the key is already present.
Inserts a new sorted entry when the key is absent.
Triggers the same root-split and proactive-split logic.

data should be the raw value bytes, or an empty slice for a deletion (which should not normally arrive here during redo, but is handled gracefully).

Wave 11-K optimisation (Fix 1).

Source

pub fn reserve_redo_capacity(&self, n: usize)

Pre-warm the tree’s internal Vec<BinEntry> capacity before a redo pass that will insert approximately n records.

If the tree is empty, this is a no-op (there is no BIN yet to reserve capacity on). If the tree already has a root BIN (from a previous checkpoint), reserves n.min(max_entries_per_node) additional slots in that BIN’s entries vector, eliminating the resize-double cycle during the redo loop.

Wave 11-K optimisation (Fix 3).

Source

pub fn get_first_node(&self) -> Option<SearchResult>

Get the first (leftmost) BIN in the tree.

Descends to the leftmost BIN by always following the first child slot at each upper IN level.

Source

pub fn get_last_node(&self) -> Option<SearchResult>

Get the last (rightmost) BIN in the tree.

Descends to the rightmost BIN by always following the last child slot at each upper IN level.

Source

pub fn get_root_splits(&self) -> u64

Returns the number of root splits that have occurred.

Source

pub fn get_relatches_required(&self) -> u64

Returns the number of relatches required.

Source

pub fn delete(&self, key: &[u8]) -> bool

Delete a key from the tree.

Traverses the tree to find the BIN that should contain the key, then removes the entry. Returns true if the key was found and removed.

Delete path in Tree from the.

In-memory removal only — WAL logging for deletes is handled by the cursor layer (cursor_impl.rs::log_ln_write) before this is called, matching separation between LN logging and tree mutation.

Source

pub fn compress(&self)

Merge under-full sibling BIN pairs and remove empty subtrees.

INCompressor / Tree.compressInternal() logic.

merges two adjacent siblings when their combined entry count is ≤ max_entries_per_node (the merge threshold equal to the node capacity). The left sibling’s entries are prepended into the right sibling; the parent key slot pointing at the left sibling is then removed from the parent IN with deleteEntry. If the parent IN becomes empty after the removal the process repeats recursively up the tree.

This implementation performs a single post-order walk so that each level is compressed after all its children have been compressed.

Source

pub fn compress_bin(&self, bin_arc: &Arc<RwLock<TreeNode>>) -> bool

Compress deleted slots from a BIN node, then prune it from its parent IN when it becomes empty.

(the in-place slot-removal path, NOT the sibling-merge path handled by compress()).

§Algorithm

If the BIN is a delta, skip — deltas cannot be compressed.
Remove all slots where entry.known_deleted is true. This mirrors bin.compress(!bin.shouldLogDelta(), localTracker).
If the BIN is now empty, remove it from its parent IN. This mirrors pruneBIN(db, binRef, idKey) → tree.delete(idKey).

§Arguments

bin_arc — the BIN to compress (must be a TreeNode::Bottom).

§Returns

true if compression made progress (slots were removed or the BIN was pruned), false if the BIN was skipped (delta, no cursors issue, etc.).

Source

pub fn compress_bin_with_lock_check( &self, bin_arc: &Arc<RwLock<TreeNode>>, is_locked: Option<&dyn Fn(u64) -> bool>, ) -> bool

Like compress_bin, but consults a caller-supplied is_locked predicate before physically removing each known_deleted slot. If is_locked(slot_lsn) returns true, the slot is SKIPPED (left for a later compression pass after the locking txn resolves).

This is the faithful port of JE BIN.compress (BIN.java:1141-1172):

We have to be able to lock the LN before we can compress the entry. If we can’t, then skip over it. … it is more efficient to call isLockUncontended than to actually lock the LN, since we would release the lock immediately.

if (lsn != DbLsn.NULL_LSN &&
    !lockManager.isLockUncontended(lsn)) {
    anyLocked = true;
    continue;
}

JE’s isLockUncontended(lsn) (LockManager.java:692) returns nWaiters() == 0 && nOwners() == 0. Our is_locked(lsn) is its inverse: the dbi layer supplies a closure over the LockManager that returns true iff the slot’s LSN has any owner or waiter (LockManager::get_lock_info(lsn) != (0, 0)). A NULL_LSN slot is always discardable without locking (JE: “Can discard a NULL_LSN entry without locking”), so we never invoke the predicate for it.

§Layering (noxu-tree -/-> noxu-txn)

The predicate is a &dyn Fn(u64) -> bool, NOT a LockManager reference, so noxu-tree never depends on noxu-txn. The lock knowledge lives entirely in the dbi-supplied closure.

§Lock ordering (no deadlock)

is_locked is invoked while this method holds the BIN write latch. The dbi closure calls LockManager::get_lock_info, which takes a lock-table shard mutex for a single, non-blocking critical section and releases it before returning — it never waits and never re-enters the tree. The LockManager has no edge back into a BIN latch (lock acquisition descends the tree from the dbi/cursor layer, never the reverse). The only ordering is therefore BIN-latch -> shard-mutex, which is acyclic; no lock cycle exists, so the predicate cannot deadlock against the latch.

When is_locked is None (recovery, BIN-delta replay, unit tests with no lock manager) behavior is identical to the historical compress_bin: every known_deleted slot is removed.

Source

pub fn prune_empty_bin(&self, id_key: &[u8]) -> bool

Re-descend to the leaf BIN that should contain id_key and remove its parent-IN child slot ONLY IF the BIN is still safe to prune.

This is the faithful port of JE Tree.delete(idKey) / Tree.searchDeletableSubTree (Tree.java ~line 755-800) as invoked by INCompressor.pruneBIN (INCompressor.java ~line 502-510). JE takes the branch-parent latch, re-descends to the specific empty BIN, and aborts the prune (removing NOTHING) if any of the following changed since the compressor observed the BIN as empty:

bin.getNEntries() != 0 → NodeNotEmptyException (a concurrent insert repopulated the BIN — IC-1: we must NOT delete a live entry).
bin.isBINDelta() → unexpectedState (deltas are never empty).
bin.nCursors() > 0 → CursorsExistException (a cursor is parked on the empty BIN; requeue rather than orphan the cursor).

The re-check and the slot removal both happen while holding the parent IN write latch. Holding the parent write latch blocks every descender (insert / delete take parent.read() hand-over-hand), so a concurrent insert cannot reach the BIN between our re-check and the slot removal — the TOCTOU window IC-1 describes is closed.

Returns true iff a parent-IN slot was removed, false otherwise (BIN repopulated, has a cursor, is a delta, vanished, or is the root — in every false case NOTHING is removed).

Source

pub fn detach_node_by_id(&self, node_id: u64) -> u64

Detach the resident child node node_id from its parent IN, dropping the strong Arc so the node is actually freed from memory, and return the heap bytes reclaimed (0 if not found / not detachable).

This is the faithful port of JE IN.detachNode(idx, updateLsn, newLsn) (IN.java ~4019) as called from Evictor.evict (Evictor.java ~3035): evict measures target.getBudgetedMemorySize() and then parent.detachNode(index, ...) does setTarget(idx, null) to drop the child reference and getInMemoryINs().remove(child) to drop it from the INList.

EV-13: before this method existed, the evictor credited node_size_fn(node_id) bytes back to the budget and removed the node from the LRU lists, but the parent’s InEntry.child still held a strong Arc — so the node was never dropped from the heap. The budget over-credited (claimed bytes freed that were not), cache_usage drifted below reality, and the evictor under-fired. Detaching here drops the Arc for real and credits exactly the measured size.

The detach happens under the parent IN write latch (JE detaches under the parent’s latch), so no concurrent descender can re-cache the child between measurement and detach. The slot (key + LSN) is kept — only the in-memory child target is cleared — matching JE’s setTarget(idx, null) which leaves the ChildReference LSN intact so the node can be re-fetched from the log later.

Returns 0 if the node is not a resident child of any IN (e.g. it is the root, already detached, or was pinned and could not be latched).

Source

pub fn evict_root(&self, db_id: u64) -> Option<(u64, bool)>

Evict the root IN of this tree (EV-14).

Faithful port of JE Evictor.evictRoot (Evictor.java:3050-3110) plus the RootEvictor.doWork + Tree.withRootLatchedExclusive framing (Evictor.java:2529-2576, Tree.java:508-517). Unlike a normal IN, the root has no parent slot to detach from; instead the tree’s root reference is the equivalent of the RootChildReference, so eviction:

Latches the root reference exclusively (rootLatch.acquireExclusive via withRootLatchedExclusive).
Re-checks that the root is still resident and still evictable (no resident children, no pinned BIN — JE RootEvictor.doWork re-latches and re-checks rootIN == target && rootIN.isRoot()).
If the root is dirty, LOGS it first so the on-disk version is current and updates root_log_lsn to the new LSN (JE evictRoot: long newLsn = target.log(...); rootRef.setLsn(newLsn)).
Clears the in-memory root (rootRef.clearTarget() — JE leaves the ChildReference LSN intact; here root_log_lsn is that LSN) and note_removeds it from the evictor LRU (JE inList.remove(target)).

On the next access fetch_root_from_log re-materializes the root from root_log_lsn (JE Tree.getRootINRootAlreadyLatched → root.fetchTarget).

§Conditions (eviction is REFUSED, returning `None`, when)

there is no log manager wired (the root could never be re-fetched),
the tree has no resident root (already evicted),
the root has any resident child (JE only evicts a childless root — the hasCachedChildren skip in processTarget; a root with cached children would orphan them, the EV-6 invariant),
the root is a BIN pinned by a cursor (cursor_count > 0),
the root is dirty but we have no clean persisted version AND logging it fails, or
the root is clean but root_log_lsn is null (never logged — cannot be re-fetched; happens only for a brand-new unlogged tree).

Returns Some((freed_bytes, was_dirty)) on success, where freed_bytes is the root’s measured heap footprint (JE target.getBudgetedMemorySize()) and was_dirty reports whether the root had to be logged (JE rootEvictor.flushed, which drives nDirtyNodesEvicted and modifyDbRoot).

Source

pub fn fetch_root_from_log(&self) -> Option<Arc<RwLock<TreeNode>>>

Re-materialize an evicted root IN from its persisted root_log_lsn (EV-14, piece B). Faithful to JE Tree.getRootINRootAlreadyLatched (Tree.java:477-516) which calls root.fetchTarget(database, null) when the in-memory target is null. Idempotent and cheap when the root is already resident: returns the resident root without touching the log.

Returns None only when the tree is genuinely empty (no resident root AND root_log_lsn is null) or when the re-fetch fails (no log manager, log read error, deserialize failure) — callers then see an empty tree, never wrong data.

Source

pub fn maybe_compress_bin_and_parent( &self, bin_arc: &Arc<RwLock<TreeNode>>, ) -> bool

Check whether a BIN node is a candidate for slot compression and, if so, trigger compress_bin.

from (the opportunistic / lazy compression path).

§Algorithm

Skip the BIN if it is a delta or has no defunct (known-deleted) slots.
If compression succeeds and the BIN becomes empty, it is pruned.

§Returns

true if compression was triggered (regardless of whether any slots were actually removed), false if the BIN does not need compression.

Source

pub fn validate_parent_child( parent: &Arc<RwLock<TreeNode>>, child_index: usize, child_arc: &Arc<RwLock<TreeNode>>, ) -> bool

Validate that parent.entries[child_index].child still points at child_arc after acquiring the child’s latch.

Re-latch validation step inside the Tree.searchSplitsAllowed: after a concurrent split the parent slot that previously held the child may have changed. Callers that plan to mutate the child must verify the parent-child link is still intact before proceeding.

Returns true if the parent-child link is intact.

Source

pub fn search_with_coupling(&self, key: &[u8]) -> Option<SearchResult>

Search for the BIN that should contain key, with latch-coupling validation at every level of descent.

.

The difference from search() is that after obtaining the child arc we call validate_parent_child to confirm the parent still holds the expected Arc. If the link has been broken (e.g. by a concurrent split that relocated the child) the traversal restarts from the root.

Returns a SearchResult if the key is (or should be) in the tree, None if the tree is empty.

Same as Tree::search but exposes the hand-over-hand latch coupling explicitly. Kept as a public, equivalent API for callers (today only tests) that want to verify the latch-coupling behaviour against search() itself.

Both search() and this method use the same read_arc() hand-over-hand: take the child read guard before dropping the parent guard, so a concurrent split_child(parent, ..) (which takes parent.write()) cannot run between when we captured the child Arc and when we entered the child. There is no validate-and-restart loop because the coupling makes the race unreachable.

Source

pub fn pin_bin(bin_arc: &Arc<RwLock<TreeNode>>)

Increments the cursor-pin count on a BIN node.

Called by CursorImpl when it positions on (or enters) a BIN. The evictor will not select a BIN with cursor_count > 0 for eviction (RealNodeInfo.pin_count), matching BIN.incrementCursorCount().

Source

pub fn unpin_bin(bin_arc: &Arc<RwLock<TreeNode>>)

Decrements the cursor-pin count on a BIN node.

Called by CursorImpl when it moves away from or closes on a BIN. Uses saturating_sub to guard against an accidental double-unpin. Matching BIN.decrementCursorCount().

Source

pub fn bin_is_delta(bin: &BinStub) -> bool

Returns true if the given BinStub is a BIN-delta (not a full BIN).

IN.isBINDelta().

Source

pub fn apply_delta_to_bin( bin: &mut BinStub, delta_entries: Vec<(Vec<u8>, Lsn, Option<Vec<u8>>)>, )

Merge delta entries into a full BIN’s entry list.

For each delta entry: if a matching key already exists in bin, replace it (delta is authoritative).
Otherwise insert the delta entry in sorted position.

Delta entries carry full keys (prefix already prepended by the caller). After applying all delta entries the BIN’s prefix is recomputed so the final state is consistent.

All delta entries are considered to be the most-recently-dirtied state, exactly as in where delta slots supersede full-BIN slots.

Source

pub fn mutate_to_full_bin(delta: &mut BinStub, base: BinStub)

Reconstitute a BIN-delta into a full BIN.

from the:

Extract the delta entries from self (this BIN-delta), decompressing them to full keys.
Apply them onto base (the previously logged full BIN) via apply_delta_to_bin.
Copy base’s merged entries and prefix back into self.
Clear the is_delta flag so subsequent code treats self as a full BIN.

After this call self is a full BIN; base should be discarded.

Source

pub fn mutate_to_full_bin_from_log( delta: &mut BinStub, log_manager: &LogManager, )

Reconstitute a BIN-delta into a full BIN by reading the base from log.

— the single-argument overload that calls fetchFullBIN(databaseImpl) to read the last full BIN from the log manager automatically.

Algorithm:

If delta.last_full_lsn == NULL_LSN, the BIN was never written as a full entry; there is no base to merge so the delta IS the full BIN. Clear is_delta and return.
Read the full-BIN log entry at delta.last_full_lsn using log_manager.read_entry(lsn).
Deserialize the payload with BinStub::deserialize_full().
Delegate to Self::mutate_to_full_bin(delta, base) to merge and replace delta’s contents.

On any read / parse failure the function falls back to clearing the is_delta flag without merging, so the caller always gets a non-delta BIN (possibly missing some old slots). This mirrors the EnvironmentFailureException path but gracefully degrades instead of panicking.

BIN.fetchFullBIN(dbImpl) + BIN.mutateToFullBIN(boolean).

Source

pub fn get_next_bin( &self, current_key: &[u8], ) -> Option<Vec<(BinEntry, Lsn, Vec<u8>)>>

Return the entries of the BIN immediately to the right of the BIN that contains (or would contain) current_key.

→ Tree.getNextIN(forward=true).

§Algorithm

Build a root-to-BIN path for current_key.
Walk the path back up looking for a parent that has a slot to the right of the slot we descended through.
When found, descend to the leftmost BIN of that sibling subtree.
If no such parent exists, return None (no next BIN).

Source

pub fn get_prev_bin( &self, current_key: &[u8], ) -> Option<Vec<(BinEntry, Lsn, Vec<u8>)>>

Return the entries of the BIN immediately to the left of the BIN that contains (or would contain) current_key.

→ Tree.getNextIN(forward=false).

Source §

impl Tree

Source

pub fn collect_stats(&self) -> TreeStats

Walks the entire tree and collects structural statistics.

TreeWalkerStatsAccumulator pattern — performs a simple recursive DFS and counts INs, BINs, entries, and tree height.

Source

pub fn collect_dirty_bins( &self, db_id: u64, ) -> Vec<(u64, Arc<RwLock<TreeNode>>)>

Collects all dirty BINs as (Arc to node, db_id) pairs.

The checkpoint path calls this to enumerate BINs that need to be logged. For each dirty BIN the checkpoint decides — based on the BIN-delta threshold — whether to write a full BIN entry or a BINDelta entry.

Checkpointer.processINList() which iterates the dirty IN list accumulated during normal operation.

Source

pub fn collect_bins_with_known_deleted(&self) -> Vec<Arc<RwLock<TreeNode>>>

Collect all BINs that have at least one known_deleted slot.

INCompressor queue-drain scan in the: the daemon iterates the in-memory IN list and identifies BINs that still hold zombie deleted slots. Each returned Arc can be passed directly to compress_bin().

Source

pub fn serialize_upper_in(&self, node_id: u64) -> Option<Vec<u8>>

Collect all dirty upper (non-BIN) internal nodes, sorted ascending by level (bottom-up order, BIN level excluded).

Serialise an upper-IN node (level > 1) by node_id for off-heap storage.

Traverses the tree to find the internal node whose matches, then calls to produce a compact byte representation. Returns if the node is not found or is a BIN (BINs are not upper INs).

Mirrors OffHeapAllocator serialises the same bytes that would be written to the log, allowing the evictor to store upper-INs off-heap and avoid log-file reads on the next traversal.

Source

pub fn collect_dirty_upper_ins( &self, _db_id: u64, ) -> Vec<(i32, Arc<RwLock<TreeNode>>)>

Upper-IN traversal in Checkpointer.processINList() from — visits all TreeNode::Internal nodes whose dirty flag is set and returns them together with their level, sorted lowest-level-first so the checkpointer can log them bottom-up. The root is always the last entry (highest level), which must be logged Provisional::No.

Source

pub fn is_root_resident(&self) -> bool

Returns true if the root node is currently loaded in memory.

.

Source

pub fn get_resident_root_in(&self) -> Option<Arc<RwLock<TreeNode>>>

Returns the root node Arc if present, or None.

.

Source

pub fn get_parent_bin_for_child_ln( &self, key: &[u8], ) -> Option<Arc<RwLock<TreeNode>>>

Returns the BIN that should contain a slot for key (the “parent” of LN slots).

. Descends the tree exactly like search() and returns the leaf-level BIN arc, or None if the tree is empty.

Uses read_arc() hand-over-hand on the descent — the child guard is taken before the parent guard is dropped, matching search(). Returns the BIN Arc with no read lock held; the caller must take whatever lock it needs to operate on the returned BIN.

Source

pub fn find_bin_for_insert(&self, key: &[u8]) -> Option<Arc<RwLock<TreeNode>>>

Returns the BIN where key should be inserted.

. Semantically identical to get_parent_bin_for_child_ln — expressed as a separate method to match API surface.

Implemented as a delegation to get_parent_bin_for_child_ln, which uses read_arc() hand-over-hand on the descent.

Source

pub fn search_splits_allowed(&self, key: &[u8]) -> Option<SearchResult>

Search for a BIN, allowing splits during descent (preemptive splitting).

. This thin wrapper delegates to search() and returns the result wrapped in Some. The full split-allowed descent is performed by insert() internally; this method exposes the same result type for callers that only need to locate the BIN.

Returns None if the tree is empty.

Source

pub fn rebuild_in_list(&self) -> Vec<Arc<RwLock<TreeNode>>>

Traverses the entire tree and returns every IN and BIN node as a flat list.

. Used by recovery to rebuild the in-memory IN list after log replay. The walk is a BFS from the root; every Arc<RwLock<TreeNode>> encountered (both Internal and Bottom variants) is included in the result.

Source

pub fn validate_in_list(&self) -> bool

Validates internal tree consistency.

. Primarily a debug/test tool.

Rules checked:

An empty tree (no root) is trivially valid → returns true.
A non-empty tree must have a non-null root.
Every Internal node must have at least one entry.
Every child pointer that is Some must be readable (lock must be acquirable — i.e., no poisoned locks).

Returns true if no inconsistencies are detected, false otherwise.

Source

pub fn get_parent_in_for_child_in( &self, child_node_id: u64, ) -> Option<(Arc<RwLock<TreeNode>>, usize)>

Traverses the tree to find the parent IN that contains child_node_id as one of its child slots.

. Used by the cleaner migration path to re-insert migrated INs after eviction/fetch.

Returns (parent_arc, slot_index) where slot_index is the position in the parent’s entries vector whose child matches child_node_id, or None if no such parent is found.

Source

pub fn reconstitute_bin_delta( base_bytes: &[u8], delta_bytes: &[u8], ) -> Option<BinStub>

Propagates the dirty flag upward from node_arc to the root.

Implicit dirty propagation: after modifying any node, all ancestors on the path to the root must also be marked dirty so the checkpointer logs them.

In this happens through IN.setDirty(true) calls at each level during split/insert callbacks. Here we walk the weak parent chain. Reconstitute a BIN-delta by merging it onto a base full BIN.

Implements JE BINDelta.reconstituteBIN(databaseImpl) for the recovery path where the log manager is not available as a LogManager but as raw serialized bytes.

Algorithm:

Deserialise base_bytes as a full BinStub.
Apply delta_bytes slots onto the base using BinStub::apply_delta (raw slot overlay).
Recompute key prefix so prefix-compressed entries are consistent.

Returns None if either byte slice is malformed.

JE BINDelta.reconstituteBIN / BINDelta.applyDelta (DRIFT-10 / Stage 3).

Source

pub fn propagate_dirty_to_root(node_arc: &Arc<RwLock<TreeNode>>)

Source

pub fn deserialize_upper_in(bytes: &[u8]) -> Option<InNodeStub>

Deserialise an upper-IN node from bytes produced by TreeNode::write_to_bytes() / flush_one_tree_upper_ins.

JE INFileReader.getIN(db) / IN.readFromLog.

Source

pub fn deserialize_bin(bytes: &[u8]) -> Option<BinStub>

Deserialise a BIN from bytes produced by BinStub::serialize_full().

Thin wrapper so the recovery path does not need to import BinStub directly from callers that only have the raw bytes.

JE INFileReader.getIN(db) for a BIN entry.

Source

pub fn recover_in_redo( &self, log_lsn: Lsn, is_root: bool, is_bin: bool, node_data: &[u8], ) -> InRedoResult

Apply a logged IN/BIN to the in-memory tree during the recovery redo pass.

Implements JE RecoveryManager.recoverIN:

is_root nodes are handled by recover_root_in.
non-root nodes are handled by recover_child_in.

log_lsn is the LSN at which this IN/BIN was logged. The currency check in recover_child_in uses this to decide whether to replace the in-memory slot (tree slot LSN < log_lsn → replace; equal → noop; greater → skip).

JE RecoveryManager.recoverIN / replayOneIN (RecoveryManager.java ~lines 1200–1280).

Auto Trait Implementations§

§

impl UnsafeUnpin for Tree

Blanket Implementations§

Source §

impl<T> Any for T
where T: 'static + ?Sized,

Source §

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more

Source §

impl<T> Borrow<T> for T
where T: ?Sized,

Source §

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more

Source §

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source §

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more

Source §

impl<T> From<T> for T

Source §

fn from(t: T) -> T

Returns the argument unchanged.

Source §

impl<T, U> Into for T
where U: From<T>,

Source §

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source §

impl<T, U> TryFrom for T
where U: Into<T>,

Source §

type Error = Infallible

The type returned in the event of a conversion error.

Source §

fn try_from(value: U) -> Result<T, <T as TryFrom>::Error>

Performs the conversion.

Source §

impl<T, U> TryInto for T
where U: TryFrom<T>,

Source §

type Error = >::Error

The type returned in the event of a conversion error.

Source §

fn try_into(self) -> Result<U, >::Error>

Performs the conversion.

Struct Tree Copy item path

Fields§

Implementations§

impl Tree

pub fn new(database_id: u64, max_entries_per_node: usize) -> Self

pub fn set_memory_counter(&mut self, counter: Arc<AtomicI64>)

pub fn set_in_list_listener(&mut self, listener: Arc<dyn InListListener>)

pub fn set_log_manager(&mut self, lm: Arc<LogManager>)

pub fn clear_log_manager(&mut self)

pub fn set_compact_max_key_length(&mut self, len: i32)

pub fn new_with_comparator( database_id: u64, max_entries_per_node: usize, comparator: KeyComparatorFn, ) -> Self

pub fn set_key_prefixing(&mut self, enabled: bool)

pub fn set_comparator(&mut self, comparator: KeyComparatorFn)

pub fn hint_redo_capacity(&mut self, capacity: usize)

pub fn get_redo_capacity_hint(&self) -> usize

pub fn take_comparator(&mut self) -> Option<KeyComparatorFn>

pub fn get_comparator(&self) -> Option<&KeyComparatorFn>

pub fn is_empty(&self) -> bool

pub fn set_root(&self, node: TreeNode)

pub fn get_root(&self) -> Option<Arc<RwLock<TreeNode>>>

pub fn get_database_id(&self) -> u64

pub fn count_entries(&self) -> u64

pub fn resort_under_comparator(&self)

pub fn total_budgeted_memory(&self) -> u64

pub fn search(&self, key: &[u8]) -> Option<SearchResult>

pub fn search_with_data(&self, key: &[u8]) -> Option<SlotFetch>

pub fn update_key_expiration(&self, key: &[u8], expiration_hours: u32) -> bool

pub fn first_entry_at_or_after( &self, key: &[u8], ) -> Option<(Vec<u8>, Vec<u8>, u64)>

pub fn first_entry_at_or_after_with_index( &self, key: &[u8], ) -> Option<(Vec<u8>, Vec<u8>, usize, u64, Arc<NodeRwLock<TreeNode>>)>

pub fn insert( &self, key: Vec<u8>, data: Vec<u8>, lsn: Lsn, ) -> Result<bool, TreeError>

pub fn redo_insert( &self, key: &[u8], data: &[u8], lsn: Lsn, ) -> Result<bool, TreeError>

pub fn reserve_redo_capacity(&self, n: usize)

pub fn get_first_node(&self) -> Option<SearchResult>

pub fn get_last_node(&self) -> Option<SearchResult>

pub fn get_root_splits(&self) -> u64

pub fn get_relatches_required(&self) -> u64

pub fn delete(&self, key: &[u8]) -> bool

pub fn compress(&self)

pub fn compress_bin(&self, bin_arc: &Arc<RwLock<TreeNode>>) -> bool

§Algorithm

§Arguments

§Returns

pub fn compress_bin_with_lock_check( &self, bin_arc: &Arc<RwLock<TreeNode>>, is_locked: Option<&dyn Fn(u64) -> bool>, ) -> bool

§Layering (noxu-tree -/-> noxu-txn)

§Lock ordering (no deadlock)

pub fn prune_empty_bin(&self, id_key: &[u8]) -> bool

pub fn detach_node_by_id(&self, node_id: u64) -> u64

pub fn evict_root(&self, db_id: u64) -> Option<(u64, bool)>

§Conditions (eviction is REFUSED, returning None, when)

pub fn fetch_root_from_log(&self) -> Option<Arc<RwLock<TreeNode>>>

pub fn maybe_compress_bin_and_parent( &self, bin_arc: &Arc<RwLock<TreeNode>>, ) -> bool

§Algorithm

§Returns

pub fn validate_parent_child( parent: &Arc<RwLock<TreeNode>>, child_index: usize, child_arc: &Arc<RwLock<TreeNode>>, ) -> bool

pub fn search_with_coupling(&self, key: &[u8]) -> Option<SearchResult>

pub fn pin_bin(bin_arc: &Arc<RwLock<TreeNode>>)

pub fn unpin_bin(bin_arc: &Arc<RwLock<TreeNode>>)

pub fn bin_is_delta(bin: &BinStub) -> bool

pub fn apply_delta_to_bin( bin: &mut BinStub, delta_entries: Vec<(Vec<u8>, Lsn, Option<Vec<u8>>)>, )

pub fn mutate_to_full_bin(delta: &mut BinStub, base: BinStub)

pub fn mutate_to_full_bin_from_log( delta: &mut BinStub, log_manager: &LogManager, )

pub fn get_next_bin( &self, current_key: &[u8], ) -> Option<Vec<(BinEntry, Lsn, Vec<u8>)>>

§Algorithm

pub fn get_prev_bin( &self, current_key: &[u8], ) -> Option<Vec<(BinEntry, Lsn, Vec<u8>)>>

impl Tree

pub fn collect_stats(&self) -> TreeStats

pub fn collect_dirty_bins( &self, db_id: u64, ) -> Vec<(u64, Arc<RwLock<TreeNode>>)>

pub fn collect_bins_with_known_deleted(&self) -> Vec<Arc<RwLock<TreeNode>>>

pub fn serialize_upper_in(&self, node_id: u64) -> Option<Vec<u8>>

pub fn collect_dirty_upper_ins( &self, _db_id: u64, ) -> Vec<(i32, Arc<RwLock<TreeNode>>)>

pub fn is_root_resident(&self) -> bool

pub fn get_resident_root_in(&self) -> Option<Arc<RwLock<TreeNode>>>

pub fn get_parent_bin_for_child_ln( &self, key: &[u8], ) -> Option<Arc<RwLock<TreeNode>>>

pub fn find_bin_for_insert(&self, key: &[u8]) -> Option<Arc<RwLock<TreeNode>>>

pub fn search_splits_allowed(&self, key: &[u8]) -> Option<SearchResult>

pub fn rebuild_in_list(&self) -> Vec<Arc<RwLock<TreeNode>>>

pub fn validate_in_list(&self) -> bool

pub fn get_parent_in_for_child_in( &self, child_node_id: u64, ) -> Option<(Arc<RwLock<TreeNode>>, usize)>

pub fn reconstitute_bin_delta( base_bytes: &[u8], delta_bytes: &[u8], ) -> Option<BinStub>

pub fn propagate_dirty_to_root(node_arc: &Arc<RwLock<TreeNode>>)

Struct Tree

§Conditions (eviction is REFUSED, returning `None`, when)

impl<T> Any for T
where T: 'static + ?Sized,

impl<T> Borrow<T> for T
where T: ?Sized,

impl<T> BorrowMut<T> for T
where T: ?Sized,

impl<T, U> Into<U> for T
where U: From<T>,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,