Expand description
Per-subgraph union-find registry (Slice X5, 2026-05-08).
Direct port of graphrefly-py’s
subgraph_locks.py
(locked design via SESSION-rust-port-d3-per-subgraph-parallelism.md
— Q1 = (c-uf split-eager); see decision log D086).
§Concept
Each registered node is a member of exactly one connected component (a “subgraph”) at any moment. Two nodes are in the same subgraph iff they’re connected via dep edges (transitively).
Connected-component membership is tracked via union-find with
union-by-rank + path compression. Each component’s root carries
an Arc<SubgraphLockBox> that owns the partition’s
wave_owner re-entrant mutex — nodes in the same component share
one lock, disjoint components run truly parallel.
§Lifecycle hooks
SubgraphRegistry::ensure_registered— called when a node is first registered viacrate::Core::register. Allocates a fresh singleton component for the new node.SubgraphRegistry::union_nodes— called for each new dep edge (inregisterandset_deps‘s add-edge path). Merges the two endpoints’ components.SubgraphRegistry::cleanup_node— wired but not yet called in X5. In Y1 the wave engine will invoke it from sites that actually remove aNodeRecordfromCoreState.nodes(todayterminate_nodeonly marks the node terminal —NodeRecords persist for the life ofCoreState). The registry’s HashMap entries drop together withCoreStatewhen the lastCoreclone goes, so X5 doesn’t leak partitions in practice; the per-node cleanup hook becomes load-bearing once Y1 lands aDrop-via-removal lifecycle forNodeRecord(post-Y1 work).SubgraphRegistry::on_edge_removed— called for each removed dep edge inset_deps’s remove path. Slice X5 commit-1 just notes the removal; Y1 (split-eager) adds the reachability walk that splits disconnected components.
§Lock-acquisition discipline (Y1)
SubgraphRegistry::lock_for returns the component’s
Arc<SubgraphLockBox> — caller acquires box.wave_owner and
re-validates that the resolved root hasn’t been redirected by a
concurrent union (mirrors py’s MAX_LOCK_RETRIES retry loop).
§Slice X5 scope
Substrate types + registry tracking are wired to Core::register
and Core::set_deps’s edge add path so the union-find state is
maintained as nodes register and topology changes. The wave engine
itself still uses the legacy Core-level wave_owner — Y1 (the
wave-engine migration to per-partition wave_owner) is carried
forward as an explicit follow-on slice given its scope (every
begin_batch / run_wave call site + Core::subscribe lock
acquisition + BatchGuard lock retention + retry-validate
semantics for held-Arc-vs-current-root divergence on union).
Several substrate methods (cleanup_node, lock_for,
lock_for_validate) and the SubgraphLockBox::wave_owner field
are wired but unused in X5; they’re annotated per-item with
#[allow(dead_code)] until Y1 activates the wave engine through
them. (Per-item rather than module-level shotgun so any genuinely
unused new item still flags as dead code during X5 follow-up
review.)
Structs§
- Subgraph
Id - Newtype identifier for a connected-component partition.
- Subgraph
Lock Box - Per-component lock box. Holds the partition’s
wave_ownerre-entrant mutex. Onunion, the box of the smaller-rank root is replaced by the larger-rank root’s box — the lock identity is preserved across merges via theArcreference (mirrors py’s_LockBox.lockredirect on union). - Subgraph
Registry - Union-find registry tracking each node’s connected-component
membership. Wrapped by
crate::CoreinArc<Mutex<...>>so the wave engine can resolve a node’s partition before acquiring that partition’swave_owner.