Struct CsrIndex

Source

pub struct CsrIndex { /* private fields */ }

Expand description

Dense integer CSR adjacency index with interned node IDs and labels.

Implementations§

Source §

impl CsrIndex

Source

pub fn compact(&mut self) -> Result<(), GraphError>

Merge the mutable buffer into the dense CSR arrays.

Called during idle periods. Rebuilds the contiguous offset/target/label (and weight) arrays from scratch (buffer + surviving dense edges). The old arrays are dropped, freeing memory. O(E) where E = total edges.

§Errors

Returns GraphError::MemoryBudget if a memory governor is installed and the dense-array allocation would exceed the Graph engine budget.

Source §

impl CsrIndex

Source

pub fn set_node_surrogate(&mut self, node: &str, surrogate: Surrogate)

Set the surrogate for a node, identified by its user-visible name.

Call once per node after add_edge / add_edge_weighted has allocated a CSR-local id for it. If the node is not yet interned this is a no-op (the edge insertion that follows will intern it and a subsequent call with the correct surrogate will set it). Calling with Surrogate::ZERO is a no-op — the zero sentinel is the initial state and has no meaning.

Source

pub fn node_surrogate_raw(&self, local_id: u32) -> u32

Return the raw surrogate u32 for a CSR-local node id.

Returns 0 (the ZERO sentinel) when the node has no surrogate set.

Source

pub fn node_id_for_surrogate(&self, surrogate: Surrogate) -> Option<&str>

Look up the user-visible node name bound to a Surrogate.

Returns None for Surrogate::ZERO and for surrogates that have never been bound to a node in this partition. Used by cross-engine fusion (graph RAG) to translate a vector-side surrogate into a graph BFS seed name.

Source

pub fn add_node_label( &mut self, node: &str, label: &str, ) -> Result<bool, GraphError>

Add a label to a node.

Returns Ok(false) if the 64 distinct node-label limit is hit (the label is silently ignored). Returns Err(GraphError::NodeOverflow) if the node is new and the partition’s node-id space is exhausted.

Source

pub fn remove_node_label(&mut self, node: &str, label: &str)

Remove a label from a node.

Source

pub fn node_has_label(&self, node_id: u32, label: &str) -> bool

Check if a node has a specific label.

Source

pub fn node_labels(&self, node_id: u32) -> Vec<&str>

Get all label names for a node.

Source §

impl CsrIndex

Source

pub fn partition_tag(&self) -> u32

Partition tag assigned at construction. Embedded in every LocalNodeId this index produces.

Source

pub fn local(&self, id: u32) -> LocalNodeId

Mint a LocalNodeId for this partition from a raw dense index. Used by algorithm code that iterates 0..node_count and needs to call LocalNodeId-taking APIs.

Source

pub fn neighbors( &self, node: &str, label_filter: Option<&str>, direction: Direction, ) -> Vec<(String, String)>

Get immediate neighbors by string name.

Source

pub fn neighbors_multi( &self, node: &str, label_filters: &[&str], direction: Direction, ) -> Vec<(String, String)>

Get neighbors with multi-label filter. Empty labels = all edges.

Source

pub fn add_node(&mut self, name: &str) -> Result<LocalNodeId, GraphError>

Add a node without any edges. Idempotent — returns the existing tagged id if the name is already present.

Returns Err(GraphError::NodeOverflow) when the partition’s node-id space is exhausted (more than MAX_NODES_PER_CSR distinct nodes).

Source

pub fn node_count(&self) -> usize

Source

pub fn contains_node(&self, node: &str) -> bool

Source

pub fn node_name(&self, id: LocalNodeId) -> &str

Get the string name for a tagged node id.

Source

pub fn node_id(&self, name: &str) -> Option<LocalNodeId>

Look up the tagged node id for a string name.

Source

pub fn label_name(&self, label_id: u32) -> &str

Get the string label for a dense label index.

Source

pub fn label_id(&self, name: &str) -> Option<u32>

Look up the dense label id for a string label.

Source

pub fn out_degree(&self, id: LocalNodeId) -> usize

Out-degree of a node (including buffer, excluding deleted).

Source

pub fn in_degree(&self, id: LocalNodeId) -> usize

In-degree of a node.

Source

pub fn edge_count(&self) -> usize

Total edge count (dense + buffer - deleted). O(V).

Source

pub fn iter_out_edges( &self, node: LocalNodeId, ) -> impl Iterator<Item = (u32, LocalNodeId)> + '_

Iterate all outbound edges for a tagged node. Yields (label_id, dst) with dst tagged to this partition.

Source

pub fn iter_in_edges( &self, node: LocalNodeId, ) -> impl Iterator<Item = (u32, LocalNodeId)> + '_

Iterate all inbound edges for a tagged node.

Source

pub fn iter_out_edges_raw( &self, node: u32, ) -> impl Iterator<Item = (u32, u32)> + '_

Raw dense out-edges iteration. In-partition algorithm use only.

Source

pub fn iter_in_edges_raw( &self, node: u32, ) -> impl Iterator<Item = (u32, u32)> + '_

Raw dense in-edges iteration. In-partition algorithm use only.

Source

pub fn out_degree_raw(&self, node: u32) -> usize

Raw out-degree by dense index.

Source

pub fn in_degree_raw(&self, node: u32) -> usize

Raw in-degree by dense index.

Source

pub fn node_name_raw(&self, id: u32) -> &str

String name for a raw dense index. In-partition algorithm use only.

Source

pub fn node_id_raw(&self, name: &str) -> Option<u32>

Raw dense index lookup by name. In-partition algorithm use only.

Source §

impl CsrIndex

Source

pub fn add_edge( &mut self, src: &str, label: &str, dst: &str, ) -> Result<(), GraphError>

Incrementally add an unweighted edge (goes into mutable buffer). Uses weight 1.0 if the graph already has weighted edges.

Returns Err(GraphError::LabelOverflow) if the label id space is exhausted. Production callers should always surface this to the client; silently ignoring it reproduces the silent-wrap bug the u32 widening was meant to fix.

Source

pub fn add_edge_weighted( &mut self, src: &str, label: &str, dst: &str, weight: f64, ) -> Result<(), GraphError>

Incrementally add a weighted edge (goes into mutable buffer).

If this is the first weighted edge (weight != 1.0), initializes the weight tracking infrastructure (backfills existing buffer entries with 1.0).

Source

pub fn remove_edge(&mut self, src: &str, label: &str, dst: &str)

Incrementally remove an edge.

Source

pub fn remove_node_edges(&mut self, node: &str) -> usize

Remove ALL edges touching a node. Returns the number of edges removed.

Source

pub fn remove_nodes_with_prefix(&mut self, prefix: &str)

Remove all edges touching any node whose ID starts with prefix.

Used for tenant purge: prefix = "{tenant_id}:" removes all edges belonging to that tenant.

Source §

impl CsrIndex

Source

pub fn new() -> Self

Source

pub fn with_governor(governor: Arc<MemoryGovernor>) -> Self

Create a new CsrIndex wired to a memory governor.

Subsequent calls to compact(), checkpoint_to_bytes(), and compute_statistics() will reserve bytes against EngineId::Graph before allocating and return Err(GraphError::MemoryBudget(_)) if the budget is exhausted.

Use CsrIndex::new() when deploying without a governor (NodeDB-Lite, WASM, or tests that do not need budget enforcement).

Source

pub fn with_governor_attached(self, governor: Arc<MemoryGovernor>) -> Self

Attach a memory governor to an existing CsrIndex.

Used by the REINDEX path: a partition is rebuilt on a background thread (without a governor since MemoryGovernor is Arc<...> but the thread is independent), and on cutover the Data Plane installs the governor.

Source

pub fn has_weighted_edges(&self) -> bool

Whether any edge in this index has a non-default (1.0) weight.

Source §

impl CsrIndex

Source

pub fn record_access(&self, node_id: u32)

Record a node access (called during traversals for hot/cold tracking).

Source

pub fn cold_nodes(&self, threshold: u32) -> Vec<u32>

Get nodes with access count below threshold (cold nodes).

Source

pub fn hot_node_count(&self) -> usize

Number of hot nodes (access count > 0).

Source

pub fn query_epoch(&self) -> u64

Current query epoch (incremented per traversal call).

Source

pub fn reset_access_counts(&mut self)

Reset access counters (called during compaction or periodically).

Source

pub fn prefetch_node(&self, node_id: u32)

Predictive prefetch: hint the OS to load a node’s adjacency data into the page cache before the traversal touches it.

For in-memory dense CSR, this prefetches the cache line containing the node’s offset/target entries. For future mmap’d cold segments, this would call madvise(MADV_WILLNEED) on the relevant pages.

Called during BFS planning: when adding nodes to the next frontier, prefetch their neighbors’ data so it’s resident when the BFS loop reaches them on the next iteration.

Source

pub fn prefetch_batch(&self, node_ids: &[u32])

Prefetch a batch of nodes (called during BFS frontier expansion).

Source

pub fn evaluate_memory_pressure( &self, utilization: u8, spill_threshold: u8, restore_threshold: u8, ) -> (usize, usize)

Evaluate graph memory pressure and return promotion/demotion hints.

Uses SpillController-compatible thresholds (90%/75% hysteresis):

Above spill threshold: demote cold nodes (spill to potential mmap)
Below restore threshold: promote warm nodes back to hot RAM

utilization = estimated memory usage as percentage (0-100). Returns (nodes_to_demote, nodes_to_promote) counts.

Source

pub fn estimated_memory_bytes(&self) -> usize

Estimated memory usage of the dense CSR in bytes.

Used for SpillController utilization calculation.

Source §

impl CsrIndex

Source

pub fn checkpoint_to_bytes(&self) -> Result<Vec<u8>, GraphError>

Serialize the index to rkyv bytes (with magic header) for storage.

§Errors

Returns GraphError::MemoryBudget if a memory governor is installed and the serialization buffer would exceed the Graph engine budget.

Source

pub fn from_checkpoint(bytes: &[u8]) -> Result<Option<Self>, CsrCheckpointError>

Restore an index from a checkpoint snapshot.

Returns:

Ok(Some(index)) — successfully decoded.
Ok(None) — buffer does not start with the magic header (no legacy format exists for CSR; callers should treat this as an invalid buffer).
Err(CsrCheckpointError::UnsupportedVersion) — magic matches but the version byte is not CSR_FORMAT_VERSION.

Source §

impl CsrIndex

Source

pub fn node_to_id_map(&self) -> &HashMap<String, u32>

Node-to-ID mapping (for snapshot cloning).

Source

pub fn id_to_node_list(&self) -> &[String]

ID-to-node list (for snapshot cloning).

Source

pub fn label_to_id_map(&self) -> &HashMap<String, u32>

Label-to-ID mapping (for snapshot cloning).

Source

pub fn id_to_label_list(&self) -> &[String]

ID-to-label list (for snapshot cloning).

Source

pub fn out_offsets_slice(&self) -> &[u32]

Outbound offset array slice.

Source

pub fn out_targets_slice(&self) -> &[u32]

Outbound target array slice.

Source

pub fn out_labels_slice(&self) -> &[u32]

Outbound label array slice.

Source

pub fn out_weights_slice(&self) -> Option<&[f64]>

Outbound weight array slice (None if unweighted).

Source

pub fn in_offsets_slice(&self) -> &[u32]

Inbound offset array slice.

Source

pub fn in_targets_slice(&self) -> &[u32]

Inbound target array slice.

Source

pub fn in_labels_slice(&self) -> &[u32]

Inbound label array slice.

Source

pub fn in_weights_slice(&self) -> Option<&[f64]>

Inbound weight array slice (None if unweighted).

Source §

impl CsrIndex

Source

pub fn compute_statistics(&self) -> Result<GraphStatistics, GraphError>

Compute full graph statistics from the current CSR state.

O(V + E) — iterates all nodes and edges once. Intended for query planning, not hot-path execution.

§Errors

Returns GraphError::MemoryBudget if a memory governor is installed and the degree-array working set would exceed the Graph engine budget.

Source

pub fn label_edge_count(&self, label: &str) -> usize

Get the edge count for a specific label. O(E) unless cached.

Returns 0 if the label doesn’t exist.

Source

pub fn label_selectivity(&self, label: &str) -> f64

Estimate the selectivity of a label: edge_count / total_edges.

Returns 1.0 for unknown labels (conservative — assume all edges). Returns 0.0 for graphs with no edges.

Source §

impl CsrIndex

Source

pub fn has_weights(&self) -> bool

Whether this CSR has any weighted edges.

Source

pub fn out_edge_weight(&self, edge_idx: usize) -> f64

Get the weight of the i-th outbound edge from a node (dense CSR only).

edge_idx is the absolute index into out_targets/out_weights. Returns 1.0 for unweighted graphs.

Source

pub fn in_edge_weight(&self, edge_idx: usize) -> f64

Get the weight of the i-th inbound edge to a node (dense CSR only).

Source

pub fn edge_weight(&self, src: &str, label: &str, dst: &str) -> Option<f64>

Get the weight of a specific outbound edge from src to dst via label.

Checks both dense and buffer. Returns 1.0 if the edge exists but has no weight, or None if the edge doesn’t exist.

Source

pub fn iter_out_edges_weighted( &self, node: LocalNodeId, ) -> impl Iterator<Item = (u32, LocalNodeId, f64)> + '_

Iterate outbound edges of a node with weights: (label_id, dst_id, weight).

Yields from both dense CSR and buffer, excluding deleted edges. Weights are 1.0 for unweighted graphs.

Source

pub fn iter_out_edges_weighted_raw( &self, node: u32, ) -> impl Iterator<Item = (u32, u32, f64)> + '_

Raw u32 variant of iter_out_edges_weighted. In-partition algorithm use only — see Self::iter_out_edges_raw for the safety rationale.

Source §

impl CsrIndex

Source

pub fn traverse_bfs( &self, start_nodes: &[&str], label_filter: Option<&str>, direction: Direction, max_depth: usize, max_visited: usize, frontier_bitmap: Option<&SurrogateBitmap>, ) -> Vec<String>

BFS traversal. Returns all reachable node IDs within max_depth hops.

max_visited caps the number of nodes visited to prevent supernode fan-out explosion. Pass DEFAULT_MAX_VISITED for the standard limit.

frontier_bitmap: when Some, only nodes whose surrogate is present in the bitmap are eligible as traversal targets. Start nodes are not gated — only newly discovered frontier nodes are checked.

Source

pub fn traverse_bfs_with_depth( &self, start_nodes: &[&str], label_filter: Option<&str>, direction: Direction, max_depth: usize, max_visited: usize, ) -> Vec<(String, u8)>

BFS traversal returning nodes with depth information.

max_visited caps the number of nodes visited to prevent supernode fan-out explosion. Pass DEFAULT_MAX_VISITED for the standard limit.

Source

pub fn traverse_bfs_with_depth_multi( &self, start_nodes: &[&str], label_filters: &[&str], direction: Direction, max_depth: usize, max_visited: usize, ) -> Vec<(String, u8)>

BFS traversal with multi-label filter. Empty labels = all edges.

max_visited caps the number of nodes visited to prevent supernode fan-out explosion. Pass DEFAULT_MAX_VISITED for the standard limit.

Source

pub fn shortest_path( &self, src: &str, dst: &str, label_filter: Option<&str>, max_depth: usize, max_visited: usize, frontier_bitmap: Option<&SurrogateBitmap>, ) -> Option<Vec<String>>

Shortest path via bidirectional BFS.

max_visited caps the combined forward+backward visited set to prevent supernode fan-out explosion. Pass DEFAULT_MAX_VISITED for the standard limit.

frontier_bitmap: when Some, only nodes whose surrogate is present in the bitmap are eligible for expansion. Start and end nodes are not gated.

Source