Expand description
Graph algorithms for centrality, community detection, and structure analysis.
This module provides a collection of graph algorithms for analyzing graph topology, identifying important nodes, and discovering community structure. All algorithms are designed for unweighted, directed graphs and work with both SQLite and Native backends.
§Available Algorithms
§Centrality Algorithms
pagerank- PageRank centrality for identifying influential nodesbetweenness_centrality- Betweenness centrality for finding bridge nodes
§Community Detection
label_propagation- Fast label propagation for community discoverylouvain_communities- Louvain method for modularity optimization
§Structural Analysis
connected_components- Find all connected components in the graphweakly_connected_components- Find weakly connected components (undirected connectivity)strongly_connected_components- Find strongly connected components using Tarjan’s algorithmcycle_basis- Minimal cycle basis for cycle explanation (Paton’s algorithm)cycle_basis_bounded- Cycle basis with bounded enumerationcycle_basis_with_progress- Cycle basis with progress trackingCycleBasisBounds- Configuration for bounds (max_cycles, max_cycle_length, max_per_scc)CycleBasisResult- Result with cycles, SCC decomposition, helpersfind_cycles_limited- Enumerate cycles up to a limitnodes_by_degree- Rank nodes by degree (hub detection)topological_sort- Compute topological ordering of nodes in DAGs
§Dependency Analysis
critical_path- Longest weighted path in DAG for bottleneck identificationcritical_path_with_progress- Critical path with progress trackingCriticalPathResult- Result with path, distance, bottlenecks, slackCriticalPathError- Error for non-DAG graphs
§Reachability Analysis
transitive_closure- Compute all-pairs reachability (can reach relationships)transitive_reduction- Remove redundant edges while preserving reachabilityTransitiveClosureBounds- Bounds for limiting transitive closure computationreachable_from- Forward reachability (what does this affect?)reverse_reachable_from- Backward reachability (what affects this?)can_reach- Point-to-point reachability checkunreachable_from- Find unreachable nodes from entry
§Control Flow Analysis
dominators- Compute dominators and immediate dominator treedominators_with_progress- Dominator computation with progress trackingDominatorResult- Dominance sets and immediate dominator treepost_dominators- Compute post-dominators and immediate post-dominator treepost_dominators_with_progress- Post-dominator computation with progress trackingpost_dominators_auto_exit- Post-dominators with automatic exit detectionPostDominatorResult- Post-dominance sets and immediate post-dominator treedominance_frontiers- Compute dominance frontiers for SSA constructiondominance_frontiers_with_progress- Dominance frontier computation with progress trackingiterated_dominance_frontiers- Compute iterated dominance frontiers for SSA φ-placementDominanceFrontierResult- Dominance frontier setsIteratedDominanceFrontierResult- Iterated dominance frontier result with φ-node placementsnatural_loops- Detect natural loops using back-edge dominance analysisnatural_loops_with_progress- Natural loop detection with progress trackingNaturalLoop- Natural loop with header, back-edges, and bodyNaturalLoopsResult- Natural loop detection result with nesting analysiscontrol_dependence_graph- Compute Control Dependence Graph from post-dominatorscontrol_dependence_from_exit- Compute CDG with automatic exit detectionControlDependenceResult- Control dependence edges and reverse mapping
§Program Analysis
backward_slice- Backward program slicing (what affects this node?)backward_slice_with_progress- Backward slicing with progress trackingforward_slice- Forward program slicing (what does this affect?)forward_slice_with_progress- Forward slicing with progress trackingSliceResult- Result with control_nodes, data_nodes, and slice_nodes
§Call Graph Analysis
collapse_sccs- Collapse SCCs to form condensation DAG for call graph analysiscollapse_sccs_with_progress- SCC collapse with progress trackingSccCollapseResult- Result with node_to_supernode, supernode_members, supernode_edges
§Cut and Partitioning
min_st_cut- Minimum s-t edge cut for fault tolerance analysismin_st_cut_with_progress- Minimum s-t edge cut with progress trackingmin_vertex_cut- Minimum vertex cut for critical node identificationmin_vertex_cut_with_progress- Minimum vertex cut with progress trackingMinCutResult- Result of minimum edge cut computationMinVertexCutResult- Result of minimum vertex cut computationpartition_bfs_level- BFS-level graph partitioning for shardingpartition_greedy- Greedy partitioning with boundary improvementpartition_kway- Size-bounded k-way partitioningpartition_kway_with_progress- K-way partitioning with progress trackingPartitionConfig- Configuration for k-way partitioningPartitionResult- Result of graph partitioning computation
§Path Analysis
enumerate_paths- Enumerate all execution paths using DFS with boundsenumerate_paths_with_progress- Path enumeration with progress trackingenumerate_paths_with_dominance- Path enumeration with dominance-based pruningenumerate_paths_with_dominance_progress- Dominance-constrained enumeration with progress trackingPathEnumerationConfig- Configuration for bounds (max_depth, max_paths, revisit_cap)PathEnumerationDominanceConfig- Configuration for constraint-based pruningPathClassification- Path classification (Normal, Error, Degenerate, Infinite)EnumeratedPath- Single execution path with nodes and classificationPathEnumerationResult- Enumeration result with categorized paths and statisticsPathEnumerationPruningStats- Statistics for dominance-based pruning effectiveness
§Observability Algorithms
happens_before_analysis- Event ordering for concurrent trace analysisimpact_radius- Blast zone computation using bounded reachabilityimpact_radius_with_progress- Impact radius with progress trackingVectorClock- Partial order data structure for happens-before analysisHappensBeforeResult- Result with concurrent pairs and race statisticsImpactRadiusConfig- Configuration for impact radius (max_distance, max_hops, weight_fn)ImpactRadiusResult- Result with blast_zone, distances, boundary, sizeTraceEvent- Runtime trace event representationOperation- Memory operation type (Read/Write)
Note: WeightCallback and default_weight_fn are re-exported from the
dependency analysis section for use with impact radius computation.
§ML / Inference
find_subgraph_patterns- Bounded subgraph isomorphism for pattern matchingfind_subgraph_patterns_with_progress- Subgraph isomorphism with progress trackingSubgraphPatternBounds- Configuration for bounds (max_matches, timeout_ms, max_pattern_nodes)SubgraphMatchResult- Result with matches, patterns_found, computation_time_ms, bounded_hitstructural_similarity- Structural similarity using isomorphism checking and MCS approximationstructural_similarity_with_progress- Structural similarity with progress trackingSimilarityBounds- Configuration for bounds (max_matches, timeout_ms, similarity_threshold)SimilarityResult- Result with isomorphic, mcs_similarity, ged_distance, mcs_sizerewrite_graph_patterns- DPO-style graph rewriting for pattern transformationrewrite_graph_patterns_with_progress- Graph rewriting with progress trackingRewriteRule- Rule specifying pattern, replacement, and interfaceRewriteBounds- Configuration for bounds (max_matches, validation)RewriteResult- Result with rewritten_graph, operations_applied, validation_errors
§Graph Diff
graph_diff()- Structural graph delta between two snapshotsgraph_diff_with_progress()- Graph diff with progress trackingvalidate_refactor()- Refactor validation with safety heuristicsGraphDiffResult- Result with nodes/edges added/removed and similarity metricsNodeDelta- Node delta (nodes_added, nodes_removed)EdgeDelta- Edge delta (edges_added, edges_removed)RefactorValidation- Validation result with is_safe, breaking_changes, warnings
§Security & Compliance
propagate_taint_forward()- Forward taint propagation from sources to sinkspropagate_taint_forward_with_progress()- Forward propagation with progress trackingpropagate_taint_backward()- Backward taint propagation from sink to sourcespropagate_taint_backward_with_progress()- Backward propagation with progress trackingsink_reachability_analysis()- Full vulnerability detection (all sinks)sink_reachability_analysis_with_progress()- Sink analysis with progress trackingdiscover_sources_and_sinks()- Discover sources/sinks using custom callbacksdiscover_sources_and_sinks_default()- Discover using metadata-based detectorsTaintResult- Result with sources, sinks_reached, tainted_nodes, source_sink_pathsSourceCallback- Trait for custom source detectionSinkCallback- Trait for custom sink detectionMetadataSourceDetector- Default source detector using entity metadataMetadataSinkDetector- Default sink detector using entity metadata
§Algorithm Characteristics
| Algorithm | Time Complexity | Best For | Limitations |
|---|---|---|---|
| PageRank | O(k × (V + E)) | Influence ranking | Requires connected graph for best results |
| Betweenness | O(V × (V + E)) | Bridge nodes | Expensive on large graphs |
| Label Propagation | O(k × E) | Fast clustering | Non-deterministic tiebreaking |
| Louvain | O(k × V × E) | Quality communities | Slower than label propagation |
| Connected Components | O(V + E) | Graph connectivity | None |
| Weakly Connected Components | O(V + E) | Undirected connectivity | None |
| Strongly Connected Components | O(V + E) | Cycle detection, loop analysis | None |
| Cycle Basis | O(V + E + C×V) | Cycle explanation, deadlock detection | C = number of cycles |
| Transitive Closure | O(V × (V + E)) | All-pairs reachability | O(V²) space, use bounds for large graphs |
| Transitive Reduction | O(V × (V + E)) | Graph simplification | Most meaningful for DAGs |
| Topological Sort | O(V + E) | Linear ordering of DAGs | Requires DAG (returns cycle error) |
| Critical Path | O(V + E) | Build systems, bottleneck identification | Requires DAG |
| Reachability | O(V + E) | Impact analysis, slicing | None |
| Dominators | O(V²) worst, faster in practice | CFG analysis, SSA construction | Requires single entry |
| Post-Dominators | O(V²) worst, faster in practice | CFG analysis, control dependence | Requires single exit or virtual exit |
| Dominance Frontiers | O(V²) worst | SSA φ-node placement, convergence points | Requires dominators |
| Iterated DF | O(V × iterations) | SSA construction, φ-node placement | Fixed-point iteration |
| Natural Loops | O(E × N) worst | Loop optimization, program analysis | Requires dominators |
| Control Dependence | O(E) after post-dom | Program slicing, impact analysis | Requires post-dominators |
| Program Slicing | O(V + E) | Bug isolation, impact analysis | Requires CDG + reachability |
| SCC Collapse | O(V + E) | Call graph analysis, mutual recursion detection | None |
| Path Enumeration | O(P × L) | Test coverage, symbolic execution | Bounds required for cyclic CFGs |
| Path Enumeration (Dominance) | O(P² × L) amortized | Path pruning for complex CFGs | Requires dominators, control dependence, natural loops |
| Min s-t Edge Cut | O(V × E²) | Fault tolerance, security boundaries | Requires connected source/sink |
| Min Vertex Cut | O(V × E²) | Critical node identification | Requires connected source/sink |
| BFS-Level Partitioning | O(V + E) | Graph sharding, locality | Local optimum based on seeds |
| Greedy Partitioning | O(I × E) | Cut minimization | Local optimum (I = iterations) |
| K-way Partitioning | O(V + E) | Multi-way sharding, load balancing | Requires k >= 2 |
Where:
- V = number of vertices
- E = number of edges
- P = number of paths
- L = average path length
- k = number of iterations (algorithm-dependent)
§Input Requirements
§Graph Connectivity
- Connected components: Works on disconnected graphs (finds all components)
- Weakly connected components: Works on disconnected graphs (finds all components treating edges as undirected)
- Strongly connected components: Works on disconnected graphs (finds all SCCs including trivial)
- Topological sort: Only works on DAGs (returns cycle error with cycle path for cyclic graphs)
- PageRank: Handles disconnected components (splits rank)
- Betweenness: Handles disconnected components (each component separately)
- Label propagation: Works on disconnected graphs (each component independently)
- Louvain: Works on disconnected graphs (each component independently)
- Transitive closure: Works on disconnected graphs (each component independently)
- Transitive reduction: Works on disconnected graphs (each component independently)
§Edge Directionality
All algorithms support directed graphs:
- Pagerank: Follows outgoing edges (link-based ranking)
- Betweenness: Considers both directions for shortest paths
- Label propagation: Uses bidirectional edges (undirected view)
- Louvain: Uses bidirectional edges (undirected view)
§Output Format
§Centrality Algorithms
Return Vec<(NodeId, Score)> sorted by score descending:
let results = pagerank(&graph)?;
// Top 5 most influential nodes
for (node_id, score) in results.iter().take(5) {
println!("Node {}: PageRank = {:.4}", node_id, score);
}§Community Detection
Return Vec<Vec<NodeId>> where each inner vector is a community:
let communities = louvain_communities(&graph)?;
println!("Found {} communities", communities.len());
for (i, community) in communities.iter().enumerate() {
println!("Community {}: {} nodes", i, community.len());
}§Usage Patterns
§When to Use PageRank
- Identify influential nodes in citation networks
- Rank web pages or documents by link structure
- Find key entities in knowledge graphs
- Recommendation systems based on graph structure
§When to Use Betweenness Centrality
- Find bridge nodes connecting communities
- Identify bottlenecks in communication networks
- Detect control points in flow networks
- Analyze information flow in social networks
§When to Use Label Propagation
- Fast community detection on large graphs
- Exploratory analysis where speed matters more than quality
- Baseline comparison for other clustering methods
- Incremental clustering where results update frequently
§When to Use Louvain
- High-quality community detection where modularity matters
- Hierarchical clustering to reveal multi-scale structure
- Research applications requiring reproducible results
- Final clustering when offline computation is acceptable
§Progress Tracking
Long-running algorithms provide _with_progress variants:
use sqlitegraph::{
algo::pagerank_with_progress,
progress::ConsoleProgress
};
let progress = ConsoleProgress::new();
let results = pagerank_with_progress(&graph, progress)?;
// Output: PageRank iteration 10/100...Progress tracking is available for:
pagerank_with_progressbetweenness_centrality_with_progresslouvain_communities_with_progressweakly_connected_components_with_progresstransitive_closure_with_progresstransitive_reduction_with_progressreachable_from_with_progressreverse_reachable_from_with_progressdominators_with_progresspost_dominators_with_progressdominance_frontiers_with_progressnatural_loops_with_progressenumerate_paths_with_progressenumerate_paths_with_dominance_progresscritical_path_with_progresscycle_basis_with_progressbackward_slice_with_progressforward_slice_with_progressfind_subgraph_patterns_with_progress()structural_similarity_with_progress()graph_diff_with_progress()
Use [NoProgress] for zero-overhead progress tracking (default).
Re-exports§
pub use backend::pagerank as pagerank_backend;pub use backend::betweenness_centrality as betweenness_centrality_backend;pub use backend::strongly_connected_components as scc_backend;pub use backend::shortest_path as shortest_path_backend;pub use backend::topological_sort as topological_sort_backend;pub use backend::bfs_traversal;pub use backend::dfs_traversal;pub use backend::k_hop_neighbors;
Modules§
- backend
- Generic graph algorithms using GraphBackend trait
Structs§
- Control
Dependence Result - Control Dependence Graph result.
- Critical
Path Result - Result of critical path analysis on a DAG.
- Cycle
Basis Bounds - Bounds for limiting cycle basis computation.
- Cycle
Basis Result - Result of minimal cycle basis computation.
- Dominance
Frontier Result - Dominance frontier result for a CFG.
- Dominator
Result - Result of dominator computation.
- Edge
Delta - Result of computing edge delta between two graphs.
- Enumerated
Path - A single execution path through the CFG.
- Graph
Diff Result - Complete graph diff result with delta and similarity metrics.
- Happens
Before Result - Result of happens-before analysis on trace events.
- Impact
Radius Config - Configuration for impact radius computation.
- Impact
Radius Result - Result of impact radius computation.
- Iterated
Dominance Frontier Result - Iterated dominance frontier result (for SSA φ-placement).
- Metadata
Sink Detector - Default sink detector using entity metadata.
- Metadata
Source Detector - Default source detector using entity metadata.
- MinCut
Result - Result of minimum s-t edge cut computation.
- MinVertex
CutResult - Result of minimum vertex cut computation.
- Natural
Loop - A natural loop identified by back-edges.
- Natural
Loops Result - Natural loop detection result.
- Node
Delta - Result of computing node delta between two graphs.
- Partition
Config - Configuration for k-way partitioning.
- Partition
Result - Result of graph partitioning computation.
- Path
Enumeration Config - Configuration for path enumeration.
- Path
Enumeration Dominance Config - Extended configuration for dominance-constrained enumeration.
- Path
Enumeration Pruning Stats - Statistics for dominance-based pruning.
- Path
Enumeration Result - Result of path enumeration.
- Post
Dominator Result - Result of post-dominator computation.
- Refactor
Validation - Validation result for refactor checking.
- Rewrite
Bounds - Bounds for limiting graph rewriting operations.
- Rewrite
Result - Result of a graph rewriting operation.
- Rewrite
Rule - A rewrite rule for DPO-style graph transformation.
- SccCollapse
Result - Result of SCC collapse operation (condensation graph).
- SccResult
- Result of strongly connected components decomposition.
- Similarity
Bounds - Bounds for limiting structural similarity computation.
- Similarity
Result - Result of structural similarity computation.
- Slice
Result - Result of a program slicing operation.
- Subgraph
Match Result - Result of subgraph isomorphism search.
- Subgraph
Pattern Bounds - Bounds for limiting subgraph isomorphism search.
- Taint
Result - Result of taint analysis operation.
- Trace
Event - Trace event from concurrent program execution.
- Transitive
Closure Bounds - Bounds for transitive closure computation.
- Vector
Clock - Vector clock for happens-before analysis.
Enums§
- Critical
Path Error - Error type for critical path analysis.
- Operation
- Memory operation type in trace events.
- Path
Classification - Path classification based on termination properties.
- Rewrite
Operation - A single rewrite operation for tracking changes.
- Topo
Error - Error type for topological sort.
Traits§
- Sink
Callback - Callback trait for detecting security-sensitive sinks.
- Source
Callback - Callback trait for detecting taint sources.
Functions§
- backward_
slice - Computes backward program slice: “what can affect this node?”
- backward_
slice_ with_ progress - Computes backward slice with progress tracking.
- betweenness_
centrality - Computes betweenness centrality for all nodes in the graph.
- betweenness_
centrality_ with_ progress - Computes betweenness centrality with progress callback reporting.
- can_
reach - Checks if one node can reach another (point-to-point reachability).
- collapse_
sccs - Collapses SCCs to form a condensation DAG.
- collapse_
sccs_ with_ progress - Collapses SCCs with progress tracking.
- connected_
components - Finds all connected components in the graph using BFS.
- control_
dependence_ from_ exit - Computes control dependence with automatic exit detection.
- control_
dependence_ graph - Computes the Control Dependence Graph (CDG) using Cytron et al. edge-based definition.
- critical_
path - Computes the critical path (longest weighted path) in a DAG.
- critical_
path_ with_ progress - Computes the critical path with progress tracking.
- cycle_
basis - Computes minimal cycle basis using Paton’s algorithm.
- cycle_
basis_ bounded - Computes cycle basis with bounded enumeration.
- cycle_
basis_ with_ progress - Computes cycle basis with progress tracking.
- default_
weight_ fn - Default weight function that returns 1.0 for all edges.
- discover_
sources_ and_ sinks - Discovers all sources and sinks in the graph using custom callbacks.
- discover_
sources_ and_ sinks_ default - Discovers sources and sinks using default metadata-based detectors.
- dominance_
frontiers - Computes dominance frontiers for a CFG using Cytron et al. efficient algorithm.
- dominance_
frontiers_ with_ progress - Computes dominance frontiers with progress tracking.
- dominators
- Computes dominators and immediate dominators for a CFG entry node.
- dominators_
with_ progress - Computes dominators with progress tracking.
- enumerate_
paths - Enumerates all execution paths from entry node using DFS with bounds.
- enumerate_
paths_ with_ dominance - Enumerates all execution paths with dominance-based pruning.
- enumerate_
paths_ with_ dominance_ progress - Enumerates all execution paths with dominance-based pruning and progress tracking.
- enumerate_
paths_ with_ progress - Enumerates all execution paths with progress tracking.
- find_
cycles_ limited - Finds cycles in the graph up to a specified limit.
- find_
subgraph_ patterns - Finds all subgraph isomorphisms of pattern within graph.
- find_
subgraph_ patterns_ with_ progress - Finds all subgraph isomorphisms with progress tracking.
- forward_
slice - Computes forward program slice: “what does this node affect?”
- forward_
slice_ with_ progress - Computes forward slice with progress tracking.
- graph_
diff - Computes structural graph diff between two snapshots.
- graph_
diff_ with_ progress - Computes structural graph diff with progress tracking.
- happens_
before_ analysis - Analyze trace events for concurrent memory access pairs (race detection).
- impact_
radius - Computes the impact radius from a source node using bounded weighted BFS.
- impact_
radius_ with_ progress - Computes the impact radius with progress tracking.
- iterated_
dominance_ frontiers - Computes iterated dominance frontier for SSA φ-placement.
- label_
propagation - Label Propagation algorithm for fast community detection.
- louvain_
communities - Louvain method for community detection via modularity optimization.
- louvain_
communities_ with_ progress - Louvain method for community detection with progress callback reporting.
- min_
st_ cut - Compute minimum s-t edge cut using max-flow min-cut theorem.
- min_
st_ cut_ with_ progress - Compute minimum s-t edge cut with progress tracking.
- min_
vertex_ cut - Compute minimum vertex cut using vertex splitting transformation.
- min_
vertex_ cut_ with_ progress - Compute minimum vertex cut with progress tracking.
- natural_
loops - Computes natural loops by finding back-edges where header dominates tail.
- natural_
loops_ from_ exit - Computes natural loops with automatic entry node detection.
- natural_
loops_ with_ progress - Computes natural loops with progress tracking.
- nodes_
by_ degree - Computes node degrees (total number of incoming + outgoing edges).
- pagerank
- Computes PageRank scores for all nodes in the graph.
- pagerank_
with_ progress - Computes PageRank scores with progress callback reporting.
- partition_
bfs_ level - Partition graph using BFS-level assignment.
- partition_
greedy - Partition graph using greedy iterative boundary improvement.
- partition_
kway - Partition graph into k balanced partitions using BFS growth from seeds.
- partition_
kway_ with_ progress - Partition graph into k balanced partitions with progress tracking.
- post_
dominators - Computes post-dominators and immediate post-dominators for a CFG exit node.
- post_
dominators_ auto_ exit - Computes post-dominators with automatic exit detection.
- post_
dominators_ with_ progress - Computes post-dominators with progress tracking.
- propagate_
taint_ backward - Propagates taint backward from a sink to find affecting sources.
- propagate_
taint_ backward_ with_ progress - Propagates taint backward with progress tracking.
- propagate_
taint_ forward - Propagates taint forward from sources to find reachable sinks.
- propagate_
taint_ forward_ with_ progress - Propagates taint forward with progress tracking.
- reachable_
from - Computes forward reachability from a start node.
- reachable_
from_ with_ progress - Computes forward reachability with progress tracking.
- reverse_
reachable_ from - Computes backward reachability to a target node.
- reverse_
reachable_ from_ with_ progress - Computes backward reachability with progress tracking.
- rewrite_
graph_ patterns - Applies rewrite rules to transform a graph using DPO-style rewriting.
- rewrite_
graph_ patterns_ with_ progress - Applies rewrite rules with progress tracking.
- sink_
reachability_ analysis - Performs sink reachability analysis for all sinks.
- sink_
reachability_ analysis_ with_ progress - Performs sink reachability analysis with progress tracking.
- strongly_
connected_ components - Computes strongly connected components using Tarjan’s algorithm.
- structural_
similarity - Computes structural similarity between two graphs.
- structural_
similarity_ with_ progress - Computes structural similarity with progress tracking.
- topological_
sort - Computes topological ordering of nodes in a directed graph.
- transitive_
closure - Computes transitive closure for all-pairs reachability.
- transitive_
closure_ with_ progress - Computes transitive closure with progress callback reporting.
- transitive_
reduction - Computes transitive reduction to remove redundant edges from a DAG.
- transitive_
reduction_ with_ progress - Computes transitive reduction with progress callback reporting.
- unreachable_
from - Finds nodes unreachable from an entry point.
- validate_
refactor - Validates whether a graph diff represents a safe refactor.
- weakly_
connected_ components - Finds all weakly connected components in the graph using bidirectional BFS.
- weakly_
connected_ components_ with_ progress - Finds all weakly connected components with progress tracking.
Type Aliases§
- Weight
Callback - Weight callback type for edge weighting.