# pkgrank
`pkgrank` ranks nodes in a Cargo dependency graph using centrality metrics.
## Install
```bash
cargo install pkgrank
pkgrank --help
```
## TL;DR
```bash
# Rank local crates by importance (PageRank)
cargo run -- -n 10
# Rank by "who consumes this?" (Consumer PageRank)
cargo run -- --metric consumers-pagerank -n 10
```
## Graph model
- Nodes are Cargo packages (from `cargo metadata`).
- Directed edges are $A \to B$ iff **crate A depends on crate B**.
## Interpretation
- PageRank on the depends-on graph tends to surface **shared dependencies / “substrate” crates**.
- To surface **top-level orchestrators / consumers**, use the “consumer PageRank” (PageRank on the reversed graph).
## Usage (local crate graph)
Analyze the current directory (finds `Cargo.toml` if present):
```bash
cargo run -- -n 25
```
Pick the “top-level orchestrators” view:
```bash
cargo run -- --metric consumers-pagerank -n 25
```
Bound JSON output explicitly:
```bash
cargo run -- analyze --format json --json-limit 200
```
Write per-repo artifacts under `evals/pkgrank/` (super-workspace mode):
```bash
cargo run -- sweep-local --root . --out evals/pkgrank --mode workspace-slice -n 10
```
Triage (artifact-backed summary, same payload as MCP `pkgrank_triage`):
```bash
cargo run -- triage --root . --out evals/pkgrank
```
## JSON output shape (stable wrapper)
For commands that support `--format json`, the JSON is wrapped for forwards-compatible parsing:
```json
{
"schema_version": 1,
"ok": true,
}
```
`pkgrank analyze --format json` also includes explicit bounding metadata:
- `rows_total`: total rows computed
- `rows_returned`: rows included in `rows`
- `truncated`: whether `rows` was truncated
- `json_limit`: the applied limit (if any)
## Usage (module/item graph via cargo-modules)
`pkgrank modules` shells out to [`cargo-modules`](https://github.com/regexident/cargo-modules) and parses its DOT output.
Install once:
```bash
cargo install cargo-modules
```
Defaults are tuned for a “fast, actionable hotspot scan”:
- aggregate by **file**
- include **types + traits**
- hide functions / externs / sysroot
- show a few strongest edges
- cache `cargo-modules` DOT output
Note on **CLI vs MCP defaults**:
- The **CLI** `pkgrank modules` defaults include **types + traits** (and hide functions).
- The **MCP** `pkgrank_modules` tool is more conservative by default (hides fns/types/traits unless you opt in via `preset` or `include_*`), because MCP payloads are easy to blow up accidentally.
- If you want the CLI-like view from MCP, pass a `preset` like `file-api` or `file-full`.
File-level hotspots (explicit, but these are now close to the defaults):
```bash
cargo run -- modules --manifest-path ../Cargo.toml -p walk --lib -n 25
```
Workspace sweep (summary-only):
```bash
cargo run -- modules-sweep --manifest-path ../Cargo.toml -p walk -p innr --lib
```
Use presets when you want a different “view” quickly:
```bash
# Item-level view, more verbose
cargo run -- modules --manifest-path ../Cargo.toml -p walk --lib --preset node-full -n 25
```
Failure semantics:
- Default: **continue on error** and report which packages failed.
- `--fail-fast`: stop on first failure.
- `--continue-on-error=false`: equivalent explicit form.
Caching:
- `modules`/`modules-sweep` cache `cargo modules dependencies` DOT output under `evals/pkgrank/modules_cache/`.
- Use `--cache-refresh` to force regeneration.
## MCP stdio server (Cursor)
`pkgrank mcp-stdio` runs an MCP server over stdio. Stdout is reserved for JSON-RPC frames.
Run:
```bash
cargo run -- mcp-stdio
```
Toolset selection (optional):
- Default: **slim** (small tool surface; “just works” for Cursor)
- Opt-in:
- `PKGRANK_MCP_TOOLSET=full` to expose advanced tools (e.g. module/type graph centrality)
- `PKGRANK_MCP_TOOLSET=debug` to also expose internal artifact-inspection tools
Environment (optional):
- `PKGRANK_ROOT`: default root directory for artifact-backed tools
- `PKGRANK_OUT`: default artifacts directory (default `evals/pkgrank`)
Tools (high level):
- Default (Cursor MCP): `pkgrank_view`, `pkgrank_triage`, `pkgrank_analyze`, `pkgrank_repo_detail`, `pkgrank_crate_detail`, `pkgrank_snapshot`, `pkgrank_compare_runs`
- Advanced (opt-in: `PKGRANK_MCP_TOOLSET=full`): `pkgrank_status`, `pkgrank_modules`, `pkgrank_modules_sweep`
- Debug (opt-in: `PKGRANK_MCP_TOOLSET=debug`): internal artifact-inspection tools (e.g. TLC tables, invariants list, PPR summaries)
## Tests (E2E targets)
- Default test suite is **offline/deterministic** and uses **local real targets** (the dev super-workspace itself).
- URL-backed tests (crates.io crawl) are **opt-in**:
- set `PKGRANK_E2E_NETWORK=1` before running tests.
## Invariants (must not drift)
- Edge meaning: $A \to B$ means “A depends on B”.
- Dependency kind gating: `--dev` / `--build` control whether those edges exist.
- Workspace restriction: “workspace-only” means nodes/edges restricted to the current Cargo workspace members.
## User stories (what this is for)
These are the “real” workflows this tool is meant to serve.
- **Onboarding / orientation**: “What are the most central crates in this workspace? Who are the orchestrators?”
- Use: `pkgrank analyze` (metric `pagerank` vs `consumers-pagerank`) and `pkgrank triage`.
- **Dependency slimming / graph sanity**: “Why is this crate so central / so sticky? What depends on it?”
- Use: `pkgrank analyze --metric consumers-pagerank` + drill into origins and degrees; optionally generate artifacts via `pkgrank view`.
- **Refactor hotspots inside a crate**: “Which files/modules/items are the coupling hotspots?”
- Use: `pkgrank modules` with `--aggregate file` (hot files) or `--aggregate node` (hot items).
- **Workspace sweep**: “Run that hotspot scan across a bunch of crates and summarize failures/results.”
- Use: `pkgrank modules-sweep` (summary-only by default).
- **Shareable artifacts**: “Write an HTML snapshot I can point people at.”
- Use: `pkgrank view` / `pkgrank sweep-local`.
Evidence pointers (qualitative):
- Community demand for “internal dependency graph visualization” (rust-analyzer crate graph is commonly suggested): `https://www.reddit.com/r/rust/comments/x04zko/visualizing_internal_dependencies_as_a_graph/`
- Cursor’s framing of MCP: “connect to external tools and data sources” (so MCP tends to be used for “triage + drilldown”): `https://cursor.com/docs/context/mcp`
- Cargo’s own `cargo tree` docs emphasize dependency display, reverse-deps (`--invert`), and “duplicates” as a common pain point: `https://doc.rust-lang.org/cargo/commands/cargo-tree.html`
## Dependencies / integration notes
- `pkgrank` includes its centrality algorithms internally (PageRank / PPR / betweenness / reachability),
to stay buildable as a standalone repo (no cross-repo path deps).