drft-cli 0.7.0

A structural integrity checker for linked file systems
---
sources:
  - ../../src/analyses/graph_stats.rs
---

# Graph stats

## The concept

**Graph stats** provides aggregate structural metrics for the entire graph: how many nodes and edges exist, how densely connected they are, and how far apart the most distant nodes are.

- **Density** — the ratio of actual edges to the maximum possible edges. A density of 1.0 means every node links to every other node; near 0 means the graph is sparse.
- **Diameter** — the longest shortest path between any two nodes. A small diameter means information is reachable in few hops.
- **Average path length** — the mean shortest-path distance across all pairs of nodes. Indicates how "close" documents are to each other on average.
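
As a small worked example of these three definitions, consider a hypothetical three-node directed cycle `a -> b -> c -> a`. The sketch below hard-codes its edges and its shortest-path distances (worked out by hand, not produced by drft) and derives each metric from them. It assumes no self-links, so the maximum number of directed edges is `n * (n - 1)`.

```python
# Hypothetical three-node cycle: a -> b -> c -> a (illustrative, not drft output).
edges = [("a", "b"), ("b", "c"), ("c", "a")]
nodes = {n for edge in edges for n in edge}

# Shortest-path distances for every ordered pair, worked out by hand.
distances = {
    ("a", "b"): 1, ("a", "c"): 2,
    ("b", "c"): 1, ("b", "a"): 2,
    ("c", "a"): 1, ("c", "b"): 2,
}

# Density: actual edges over the maximum possible directed edges.
density = len(edges) / (len(nodes) * (len(nodes) - 1))

# Diameter: the longest of the shortest paths.
diameter = max(distances.values())

# Average path length: the mean distance over all ordered pairs.
average_path_length = sum(distances.values()) / len(distances)

print(density, diameter, average_path_length)  # 0.5 2 1.5
```

Half of the possible links exist, every node is at most two hops from any other, and a typical hop count is 1.5.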

## Why it matters for knowledge systems

These metrics give you a high-level health check:

- **High density** often indicates a "link everything to everything" pattern that dilutes the meaning of links.
- **Low density** suggests a sparse, tree-like structure — which may be intentional (layered hierarchy) or a sign of poor cross-referencing.
- **Large diameter** means some documents are many hops apart, which can make navigation difficult.
- **Disconnected graph** (diameter unavailable) means some documents can't reach others at all — see the `connected-components` analysis for details.

## What drft surfaces

### As an analysis (`drft report`)

```bash
drft report graph-stats
```

```
=== graph-stats ===
nodes: 12
edges: 23
density: 0.17
diameter: 4
avg path length: 2.3
```

If the directed graph is not strongly connected (some nodes can't reach others), diameter and average path length are reported as unavailable:

```
diameter: - (disconnected)
avg path length: - (disconnected)
```

JSON output:

```json
{
  "graph-stats": {
    "node_count": 12,
    "edge_count": 23,
    "density": 0.17,
    "diameter": 4,
    "average_path_length": 2.3
  }
}
```

When the graph is disconnected, `diameter` and `average_path_length` are `null` in JSON.
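
A script that consumes this JSON needs to handle the `null` case explicitly. The following is a minimal sketch, assuming the report has exactly the shape shown above. The `summarize` function and the `SAMPLE` string are hypothetical names for illustration; how the JSON is obtained from drft is left out.

```python
import json

# Sample payload copied from the JSON shape shown above.
SAMPLE = """
{
  "graph-stats": {
    "node_count": 12,
    "edge_count": 23,
    "density": 0.17,
    "diameter": 4,
    "average_path_length": 2.3
  }
}
"""


def summarize(report_json):
    """Return a one-line summary of a graph-stats JSON report (hypothetical helper)."""
    stats = json.loads(report_json)["graph-stats"]
    diameter = stats.get("diameter")
    avg = stats.get("average_path_length")
    # JSON null becomes None in Python; treat it as "disconnected".
    if diameter is None or avg is None:
        reach = "disconnected"
    else:
        reach = f"diameter {diameter}, avg path length {avg}"
    return (
        f"{stats['node_count']} nodes, {stats['edge_count']} edges, "
        f"density {stats['density']}, {reach}"
    )


print(summarize(SAMPLE))
```

Checking for `None` before formatting or comparing avoids a `TypeError` when a disconnected graph is reported.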

## Algorithm

Density is computed as `|E| / (|V| * (|V| - 1))` for a directed graph. Diameter and average path length are computed by running BFS from each real node (all-pairs shortest paths). This takes O(V * (V + E)) time, which is fast for graphs the size of typical markdown repos.
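
The approach can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in `graph_stats.rs`: the adjacency-dict input, the function names, the treatment of every node as a real node, and the handling of graphs with fewer than two nodes are all hypothetical choices. It also assumes no self-links or duplicate edges.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def graph_stats(adj):
    """adj maps each node to the set of nodes it links to (directed)."""
    nodes = set(adj) | {t for targets in adj.values() for t in targets}
    n = len(nodes)
    edge_count = sum(len(targets) for targets in adj.values())
    # Assumption: density is 0.0 when there are fewer than two nodes.
    density = edge_count / (n * (n - 1)) if n > 1 else 0.0

    diameter, total, connected = 0, 0, n > 1
    for source in nodes:
        if not connected:
            break
        dist = bfs_distances(adj, source)
        if len(dist) < n:
            # Some node is unreachable from source: not strongly connected.
            connected = False
        else:
            diameter = max(diameter, max(dist.values()))
            total += sum(dist.values())

    return {
        "node_count": n,
        "edge_count": edge_count,
        "density": density,
        "diameter": diameter if connected else None,
        # Mean over all ordered pairs of distinct nodes.
        "average_path_length": total / (n * (n - 1)) if connected else None,
    }


print(graph_stats({"a": {"b"}, "b": {"c"}, "c": {"a"}}))
```

Each BFS visits every node and edge at most once, so V searches give the O(V * (V + E)) bound. Stopping at the first source that cannot reach every node is enough to decide that the path metrics are unavailable.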