leiden-rs 0.6.0

High-performance Leiden community detection algorithm for graphs in Rust
Documentation

English | 中文

leiden-rs

A Rust implementation of the Leiden algorithm for community detection in graphs, built on the gryf graph library.

Overview

The Leiden algorithm is a widely-used method for detecting communities in networks. It improves upon the Louvain algorithm by guaranteeing well-connected communities: every node in a community is guaranteed to be reachable from every other node through paths that stay entirely within the community. This is achieved through a three-phase process of local moving, refinement, and aggregation, iterated until convergence.

Features

  • Full three-phase Leiden algorithm: local moving, refinement, and aggregation
  • Modularity (Newman-Girvan) and CPM (Constant Potts Model) quality functions
  • RBConfiguration (Reichardt-Bornholdt with configuration null model) quality function
  • RBER (Reichardt-Bornholdt with Erdős-Rényi null model) quality function
  • QualityFunction trait for pluggable custom quality functions
  • Seeded RNG for deterministic, reproducible results
  • Rayon-based parallelism with graph coloring for parallel local moving on large graphs
  • Supports undirected weighted graphs, self-loops, disconnected components, and isolated nodes
  • Sparse and dense graph representations via gryf
  • Optional CLI tool for edge list input/output
  • Criterion benchmarks and comparison benchmarks against fa-leiden-cd
  • Resolution profile: linear scan and bisection-based sweep across gamma values
  • LFR benchmark graph generator with known ground-truth community structure
  • Hierarchical partition output: inspect communities at each aggregation level
  • Evaluation metrics: NMI, ARI, conductance, coverage, internal density
  • Multiplex/multilayer network optimization: detect communities across multiple graph layers simultaneously
  • WebAssembly support: compile and run in browsers via the wasm feature flag

Quick Start

Add leiden-rs to your Cargo.toml:

[dependencies]
leiden-rs = "0.6"
gryf = "0.2"

Detect communities in a graph:

use gryf::{Graph, core::marker::Undirected};
use leiden_rs::{Leiden, LeidenConfig};

let mut graph: Graph<i32, f64, Undirected> = Graph::new_undirected();
let a = graph.add_vertex(1);
let b = graph.add_vertex(2);
let c = graph.add_vertex(3);
graph.add_edge(&a, &b, 1.0);
graph.add_edge(&b, &c, 1.0);

let leiden = Leiden::new(LeidenConfig::default());
let partition = leiden.run(&graph).expect("leiden failed");
println!("Found {} communities", partition.num_communities());

Configuration

LeidenConfig controls algorithm behavior:

Field Type Default Description
max_iterations usize 100 Maximum number of Leiden iterations (local move + refine + aggregate cycles)
resolution f64 1.0 Resolution parameter gamma; higher values favor smaller communities
seed Option<u64> None RNG seed for reproducible results; None uses a random seed
quality QualityType Modularity Quality function to optimize (Modularity, CPM, RBConfiguration, or RBER)
epsilon f64 1e-10 Convergence threshold; stop when quality improvement is below this value
max_comm_size usize 0 Maximum number of nodes per community (0 = unlimited)
use leiden_rs::{Leiden, LeidenConfig, QualityType};

let config = LeidenConfig {
    max_iterations: 200,
    resolution: 0.8,
    seed: Some(42),
    quality: QualityType::CPM,
    epsilon: 1e-8,
    max_comm_size: 0,
    ..Default::default()
};

let leiden = Leiden::new(config);
let partition = leiden.run(&graph).expect("leiden failed");

You can also use the builder pattern:

use leiden_rs::LeidenConfig;

let config = LeidenConfig::builder()
    .resolution(0.8)
    .seed(42)
    .build();

Quality Functions

The algorithm optimizes a quality function that measures the goodness of a partition. Four built-in functions are provided:

Modularity (Newman-Girvan):

Q = sum_c [ e_c / m  -  gamma * (k_c / (2m))^2 ]

Where e_c is the total edge weight inside community c, m is the total edge weight in the graph, k_c is the total degree of nodes in community c, and gamma is the resolution parameter.

CPM (Constant Potts Model):

H = sum_c [ e_c  -  gamma * n_c * (n_c - 1) / 2 ]

Where n_c is the number of nodes in community c. CPM avoids the resolution limit inherent in modularity and allows the resolution parameter to directly control expected community sizes.

RBConfiguration (Reichardt-Bornholdt with configuration model null):

Q = sum_c [ e_c - gamma * K_c^2 / (4m) ]

Where K_c is the total degree of nodes in community c. This is mathematically equivalent to Modularity with a configurable resolution parameter. It uses the configuration model (expected edges proportional to k_i * k_j / 2m) as the null model.

RBER (Reichardt-Bornholdt with Erdős-Rényi null model):

Q = sum_c [ e_c - gamma * p * n_c * (n_c - 1) / 2 ]

Where p = 2m / (N*(N-1)) is the graph density. Similar to CPM but uses the Erdős-Rényi random graph as the null model, making the effective resolution scale with graph density.

Error Handling

Functions that can fail return Result<T, LeidenError>:

  • GraphData::from_gryf_graph() returns Result<GraphData, LeidenError> — validates edge weights are finite and non-negative.
  • Leiden::run() returns Result<Partition, LeidenError> — propagates input validation errors.

Error variants:

  • LeidenError::InvalidEdgeWeight { weight } — an edge weight is NaN, infinite, or negative.
  • LeidenError::InconsistentStructure { message } — CSR components have inconsistent lengths or bounds.
  • LeidenError::InvalidParameter { message } — an algorithm parameter is invalid (e.g., mismatched layer node counts).

Advanced Features

Resolution Profile

The library provides two methods for scanning community structure across resolution parameter values:

Linear scan — runs Leiden at evenly spaced gamma values:

use leiden_rs::{resolution_scan, QualityType};

let entries = resolution_scan(&graph, QualityType::CPM, (0.0, 1.0), 20, Some(42))?;
for entry in &entries {
    println!("gamma={:.3}: {} communities (quality={:.4})",
             entry.resolution, entry.num_communities, entry.quality);
}

Bisection profile — efficiently finds only the gamma values where the partition changes (like leidenalg):

use leiden_rs::{resolution_profile, QualityType};

let profile = resolution_profile(&graph, QualityType::CPM, (0.0, 1.0), Some(42), 1e-3, 1.0)?;
for entry in &profile {
    println!("gamma={:.3}: {} communities", entry.resolution, entry.num_communities);
}

Hierarchical Output

Inspect community structure at each aggregation level:

use leiden_rs::{Leiden, LeidenConfig};

let leiden = Leiden::new(LeidenConfig { seed: Some(42), ..Default::default() });
let h = leiden.run_hierarchical(&graph)?;

println!("{} aggregation levels", h.num_levels());
for (i, level) in h.levels.iter().enumerate() {
    println!("Level {}: {} nodes → {} communities (quality={:.4})",
             i, level.node_count, level.num_communities, level.quality);
}

// Query community of a specific node at any level
let comm = h.community_of_at_level(0, 0); // node 0, finest level

Multiplex Networks

Detect communities across multiple graph layers simultaneously. All layers must share the same node set. The algorithm optimizes a weighted sum of quality functions: Q = Σ_l w_l * Q_l.

use leiden_rs::{run_multiplex, MultiplexConfig, GraphData};

// Two layers representing different relationship types on the same node set
let layer1 = GraphData::from_edgelist(&[
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
], 6).unwrap();

let layer2 = GraphData::from_edgelist(&[
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (1, 4, 1.0),
], 6).unwrap();

let config = MultiplexConfig {
    seed: Some(42),
    layer_weights: vec![1.0, 1.0],  // equal weight for both layers
    ..Default::default()
};

let result = run_multiplex(&[layer1, layer2], &config)?;
println!("Found {} communities (quality: {:.4})",
         result.partition.num_communities(), result.quality);
println!("Per-layer qualities: {:?}", result.layer_qualities);

Layer weights: Use different weights to prioritize certain layers. Higher weights give more influence. Negative weights invert the quality contribution (push nodes apart), useful for repulsive inter-layer coupling.

Use cases: social networks with multiple relationship types, temporal network snapshots, multimodal transportation graphs, brain connectivity across frequency bands.

Evaluation Metrics

Compare detected communities against ground truth or measure community quality:

use leiden_rs::{nmi, ari, conductance, coverage};

// NMI and ARI compare two partitions (e.g., detected vs ground truth)
let similarity = nmi(&ground_truth, detected_partition.as_slice()); // 0..1
let agreement = ari(&ground_truth, detected_partition.as_slice());  // -0.5..1

// Community-level metrics
let cond = conductance(&graph_data, &partition); // per-community conductance
let cov = coverage(&graph_data, &partition);     // fraction of intra-community edges

LFR Benchmark

Generate synthetic graphs with known community structure for evaluating detection algorithms:

use leiden_rs::{generate_lfr_graph, LfrConfig};

let lfr = generate_lfr_graph(LfrConfig {
    n: 250,
    tau1: 3.0,         // degree distribution exponent
    tau2: 1.5,         // community size exponent
    mu: 0.1,           // mixing parameter (0 = perfect communities)
    average_degree: Some(5.0),
    min_community: Some(20),
    seed: Some(42),
    ..Default::default()
})?;

println!("Generated {} nodes, {} edges, {} communities",
         lfr.node_count, lfr.edges.len(),
         lfr.community_sizes.len());
// lfr.ground_truth contains the true community assignment per node

Platform Support

CLI Usage

The leiden-cli binary reads an edge list from stdin and writes community assignments to stdout. It is feature-gated behind the cli feature (enabled by default).

cargo run --bin leiden-cli -- [options]

Options:

[INPUT]                        Input edge list file (stdin if not specified)
--quality <modularity|cpm|rbconfiguration|rber>   Quality function (default: modularity)
--resolution <gamma>          Resolution parameter (default: 1.0)
--seed <u64>                  RNG seed for reproducibility
--iterations <n>              Maximum iterations (default: 100)
--epsilon <float>             Convergence threshold (default: 1e-10)
--max-comm-size <n>           Maximum nodes per community (default: 0, unlimited)
-o, --output <FILE>           Output file (stdout if not specified)

Example:

echo -e "1 2 1.0\n2 3 1.0\n3 4 0.5\n4 1 0.3" | cargo run --bin leiden-cli -- --seed 42

Input format: one edge per line as source target weight. Node IDs are remapped to dense indices internally.

Feature Flags

Feature Default Description
cli Yes Enables the leiden-cli binary and pulls in clap
rayon Yes Enables Rayon-based parallel local moving on large graphs
serde No Enables Serialize/Deserialize for Partition, LeidenConfig, and QualityType
wasm No Enables WebAssembly bindings via wasm-bindgen (disables parallelism)

To use as a library without the CLI dependency:

[dependencies]
leiden-rs = { version = "0.6", default-features = false, features = ["rayon"] }

WebAssembly

Compile to WebAssembly for browser use:

rustup target add wasm32-unknown-unknown
cargo build --no-default-features --features wasm --target wasm32-unknown-unknown

The wasm feature provides a JavaScript-friendly API via wasm-bindgen:

use leiden_rs::wasm::leiden_from_edgelist;

// edges: flat array [src0, dst0, weight0, src1, dst1, weight1, ...]
let communities = leiden_from_edgelist(&edges, num_nodes, Some(42));

Implementation Notes

  • Graph representation: Uses gryf's CSR-based sparse graph representation for efficient neighbor iteration.
  • Parallelism: Graph coloring-based parallel local moving with rayon. Nodes are colored so that same-color nodes form independent sets and can be moved simultaneously. Automatic fallback to sequential for small graphs (< 500 nodes).
  • Determinism: When a seed is provided, all random decisions are reproducible across runs. Without a seed, results vary between runs.
  • Convergence: Iteration stops when the total quality of the partition fails to improve beyond an epsilon threshold of 1e-10, preventing infinite loops from floating-point noise.
  • Move components: The refinement phase uses the move-components subgraph to guarantee community connectivity.

Benchmarks

Benchmarks use Criterion and compare against the fa-leiden-cd crate at three graph sizes:

Benchmark Graph size
Small 2 clusters, 5 nodes each (10 nodes, 21 edges)
Medium 4 clusters, 25 nodes each (100 nodes, 1206 edges)
Large 4 clusters, 50 nodes each (200 nodes, 4906 edges)

Run benchmarks:

cargo bench

Testing

The project includes 88 unit tests covering quality functions, resolution profiles, LFR generation, hierarchical output, evaluation metrics (NMI, ARI), multiplex networks, edge cases (empty graphs, single-node, disconnected, isolated nodes, self-loops), and convergence behavior.

cargo test

Acknowledgments

Special thanks to the gryf project for providing a clean, idiomatic Rust graph library that made this implementation possible.

References

Traag, V.A., Waltman, L. & van Eck, N.J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9, 5233. https://doi.org/10.1038/s41598-019-41695-z