Skip to main content

GraphEngine

Struct GraphEngine 

Source
pub struct GraphEngine { /* private fields */ }
Expand description

In-memory graph engine backed by petgraph, synced to SQLite via codemem-storage.

§Design: intentional in-memory architecture

All graph data (nodes, edges, adjacency) is held entirely in memory using HashMap-based structures. This is deliberate: graph traversals, centrality algorithms, and multi-hop expansions benefit enormously from avoiding disk I/O on every edge follow. The trade-off is higher memory usage, which is acceptable for the typical code-graph sizes this engine targets.

§Memory characteristics

  • nodes: HashMap<String, GraphNode> — ~200 bytes per node (ID, kind, label, namespace, metadata, centrality).
  • edges: HashMap<String, Edge> — ~150 bytes per edge (ID, src, dst, relationship, weight, properties, timestamps).
  • edge_adj: HashMap<String, Vec<String>> — adjacency index mapping node IDs to incident edge IDs for O(degree) lookups.
  • id_to_index: maps string IDs to petgraph NodeIndex values.
  • cached_pagerank / cached_betweenness: centrality caches populated by recompute_centrality().

Use CodememEngine::graph_memory_estimate() for a byte-level sizing estimate based on current node and edge counts.

§Thread safety

GraphEngine is not Sync — it stores mutable graph state without internal locking. Callers in codemem-engine wrap it in Mutex<GraphEngine> (via lock_graph()) to ensure exclusive access. All public &mut self methods (e.g., add_node, recompute_centrality) require the caller to hold the lock.

Implementations§

Source§

impl GraphEngine

Source

pub fn pagerank( &self, damping: f64, iterations: usize, tolerance: f64, ) -> HashMap<String, f64>

Compute PageRank scores for all nodes using power iteration.

  • damping: probability of following an edge (default 0.85)
  • iterations: max number of power iterations (default 100)
  • tolerance: convergence threshold (default 1e-6)

Returns a map from node ID to PageRank score.

Source

pub fn louvain_communities(&self, resolution: f64) -> Vec<Vec<String>>

Detect communities using the Louvain algorithm.

Treats the directed graph as undirected for modularity computation. resolution controls community granularity (1.0 = standard modularity). Returns groups of node IDs, one group per community.

Source

pub fn betweenness_centrality(&self) -> HashMap<String, f64>

Compute betweenness centrality for all nodes using Brandes’ algorithm.

For graphs with more than 1000 nodes, samples sqrt(n) source nodes for approximate computation.

Returns a map from node ID to betweenness centrality score (normalized by 1/((n-1)(n-2)) for directed graphs).

Source

pub fn topological_layers(&self) -> Vec<Vec<String>>

Compute topological layers using Kahn’s algorithm.

Returns layers where all nodes in layer i have no dependencies on nodes in layer i or later. For cyclic graphs, SCCs are condensed into single super-nodes first, then the resulting DAG is topologically sorted.

Each inner Vec contains the node IDs at that layer.

Source

pub fn subgraph_top_n( &self, n: usize, namespace: Option<&str>, kinds: Option<&[NodeKind]>, ) -> (Vec<GraphNode>, Vec<Edge>)

Return top-N nodes by centrality and edges between them. Optionally filter by namespace and/or node kinds.

Source

pub fn louvain_with_assignment(&self, resolution: f64) -> HashMap<String, usize>

Return node-to-community-ID mapping for Louvain.

Source§

impl GraphEngine

Source

pub fn new() -> Self

Create a new empty graph.

Source

pub fn from_storage(storage: &dyn StorageBackend) -> Result<Self, CodememError>

Load graph from storage.

Source

pub fn node_count(&self) -> usize

Get the number of nodes.

Source

pub fn edge_count(&self) -> usize

Get the number of edges.

Source

pub fn expand( &self, start_ids: &[String], max_hops: usize, ) -> Result<Vec<GraphNode>, CodememError>

Multi-hop expansion: given a set of node IDs, expand N hops to find related nodes.

Source

pub fn neighbors(&self, node_id: &str) -> Result<Vec<GraphNode>, CodememError>

Get neighbors of a node (1-hop).

Source

pub fn connected_components(&self) -> Vec<Vec<String>>

Return groups of connected node IDs.

Treats the directed graph as undirected: two nodes are in the same component if there is a path between them in either direction. Each inner Vec<String> is one connected component.

Source

pub fn compute_centrality(&mut self)

Compute degree centrality for every node and update their centrality field.

Degree centrality for node v is defined as: (in_degree(v) + out_degree(v)) / (N - 1) where N is the total number of nodes. When N <= 1, centrality is 0.

Source

pub fn get_all_nodes(&self) -> Vec<GraphNode>

Return all nodes currently in the graph.

Source

pub fn get_node_ref(&self, id: &str) -> Option<&GraphNode>

Return a reference to a node without cloning. Returns None if not found.

Source

pub fn get_edges_ref(&self, node_id: &str) -> Vec<&Edge>

Return references to edges incident on a node without cloning.

This is the zero-copy variant of GraphBackend::get_edges() — same lookup logic via edge_adj, but returns &Edge instead of owned Edge.

Source

pub fn recompute_centrality(&mut self)

Recompute and cache PageRank and betweenness centrality scores.

This should be called after loading the graph (e.g., on server start) and periodically when the graph changes significantly.

Source

pub fn recompute_centrality_with_options(&mut self, include_betweenness: bool)

Recompute centrality caches with control over which algorithms run.

PageRank is always computed. Betweenness centrality is only computed when include_betweenness is true, since it is O(V * E) and can be expensive on large graphs.

Source

pub fn ensure_betweenness_computed(&mut self)

Lazily ensure betweenness centrality has been computed.

If cached_betweenness is empty (e.g., after recompute_centrality_with_options(false)), this method computes and caches betweenness centrality on demand. If the cache is already populated, this is a no-op.

Source

pub fn get_pagerank(&self, node_id: &str) -> f64

Get the cached PageRank score for a node. Returns 0.0 if not found.

Source

pub fn get_betweenness(&self, node_id: &str) -> f64

Get the cached betweenness centrality score for a node. Returns 0.0 if not found.

Source

pub fn raw_graph_metrics_for_memory( &self, memory_id: &str, ) -> Option<RawGraphMetrics>

Collect raw graph metrics for a memory node from both code-graph and memory-graph neighbors.

Code neighbors (sym:, file:, chunk:, pkg:) contribute centrality data (PageRank, betweenness). Memory neighbors (UUID-based) contribute connectivity data (count + edge weights). A function with many linked memories is more important; a memory with many linked memories has richer context.

Returns None if the memory node is not in the graph or has no neighbors.

Trait Implementations§

Source§

impl Default for GraphEngine

Source§

fn default() -> Self

Returns the “default value” for a type. Read more
Source§

impl GraphBackend for GraphEngine

Source§

fn add_node(&mut self, node: GraphNode) -> Result<(), CodememError>

Add a node to the graph.
Source§

fn get_node(&self, id: &str) -> Result<Option<GraphNode>, CodememError>

Get a node by ID.
Source§

fn remove_node(&mut self, id: &str) -> Result<bool, CodememError>

Remove a node by ID.
Source§

fn add_edge(&mut self, edge: Edge) -> Result<(), CodememError>

Add an edge between two nodes.
Source§

fn get_edges(&self, node_id: &str) -> Result<Vec<Edge>, CodememError>

Get edges from a node.
Source§

fn remove_edge(&mut self, id: &str) -> Result<bool, CodememError>

Remove an edge by ID.
Source§

fn bfs( &self, start_id: &str, max_depth: usize, ) -> Result<Vec<GraphNode>, CodememError>

BFS traversal from a start node up to max_depth.
Source§

fn dfs( &self, start_id: &str, max_depth: usize, ) -> Result<Vec<GraphNode>, CodememError>

DFS traversal from a start node up to max_depth.
Source§

fn bfs_filtered( &self, start_id: &str, max_depth: usize, exclude_kinds: &[NodeKind], include_relationships: Option<&[RelationshipType]>, ) -> Result<Vec<GraphNode>, CodememError>

BFS traversal with filtering: exclude certain node kinds and optionally restrict to specific relationship types.
Source§

fn dfs_filtered( &self, start_id: &str, max_depth: usize, exclude_kinds: &[NodeKind], include_relationships: Option<&[RelationshipType]>, ) -> Result<Vec<GraphNode>, CodememError>

DFS traversal with filtering: exclude certain node kinds and optionally restrict to specific relationship types.
Source§

fn shortest_path( &self, from: &str, to: &str, ) -> Result<Vec<String>, CodememError>

Shortest path between two nodes.
Source§

fn stats(&self) -> GraphStats

Get graph statistics.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more