barad-dur 0.18.0

The all-seeing repository analyzer

Barad-dur

The all-seeing repository analyzer. Get health metrics, team insights, and actionable recommendations for any git repository — local or remote.

Named after the Dark Tower of Mordor — because nothing escapes its gaze.

What it does

Barad-dur analyzes git metadata (commits, blame, file tree) and source code complexity, then produces a scored report across six categories (Dependencies is opt-in):

| Category | Metrics | Weight |
|---|---|---|
| Health | Bus factor, churn hotspots, stale code, file complexity | 35% |
| Coupling | Afferent/efferent coupling, circular deps, change coupling smells | 20% |
| Evolution | Growth trend, refactoring ratio, code age, commit cadence | 20% |
| Git Hygiene | Commit message quality, history cleanliness, gitignore coverage | 15% |
| Team | Knowledge distribution (Gini), contributor activity, ownership clarity, silos, merge patterns | 10% |
| Dependencies (optional) | Dependency drift (libyear), vulnerability detection via OSV | 0% by default |

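Each metric is scored from 0 to 100. A category score is the average of its metric scores, and the overall score is the weighted average of the category scores, using the weights listed above. The report also includes Top Actions: concrete suggestions derived from the lowest-scoring metrics.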
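As a rough illustration of this scoring arithmetic, the Rust sketch below averages metric scores into a category score and combines categories using the documented weights. The `Category` struct and both function names are hypothetical, and dividing by the sum of the weights (rather than a fixed 100) is an assumption about how zero-weight or absent categories are treated; the real Scorer may differ.

```rust
/// A category score (0 to 100) paired with its weight in percent.
/// Hypothetical type, used only for this illustration.
struct Category {
    score: f64,
    weight: f64,
}

/// Plain average of the metric scores in one category.
/// Returns None when the category has no metrics.
fn category_score(metric_scores: &[f64]) -> Option<f64> {
    if metric_scores.is_empty() {
        return None;
    }
    Some(metric_scores.iter().sum::<f64>() / metric_scores.len() as f64)
}

/// Weighted average of category scores.
/// Assumption: weights are renormalised by their sum, so a category with
/// weight 0 (such as Dependencies by default) has no effect on the result.
fn overall_score(categories: &[Category]) -> Option<f64> {
    let total_weight: f64 = categories.iter().map(|c| c.weight).sum();
    if total_weight <= 0.0 {
        return None;
    }
    let weighted_sum: f64 = categories.iter().map(|c| c.score * c.weight).sum();
    Some(weighted_sum / total_weight)
}
```

With the default weights, a weak score in Health (35%) moves the overall score three and a half times as much as the same score in Team (10%).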

File-level analysis

Beyond git metadata, Barad-dur performs static complexity analysis on source files with language-aware parsing:

| Language | Extensions | What's measured |
|---|---|---|
| Rust | .rs | pub fn / pub async fn, public struct fields, cyclomatic complexity |
| JavaScript/TypeScript | .js, .ts, .jsx, .tsx, .mjs, .cjs | Exports, public class members, properties |
| Python | .py | Public defs, self.* properties |
| Go | .go | Exported functions (uppercase), exported struct fields |
| JVM (Java/Kotlin) | .java, .kt, .kts | Public methods, field declarations |
| CLR (C#) | .cs | Public methods, field declarations |

This produces per-file metrics: LOC (excluding blanks/comments), cyclomatic complexity (decision points), public methods, and properties. These feed into the hotspot analysis (churn x complexity x size).
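The hotspot idea (churn x complexity x size) can be sketched as follows. This is a minimal illustration only: the `FileMetrics` struct and function names are hypothetical, and multiplying the raw values without any normalisation is an assumption, since the exact formula is not documented here.

```rust
/// Per-file inputs to the hotspot ranking (hypothetical struct).
#[derive(Debug, Clone, PartialEq)]
struct FileMetrics {
    path: String,
    churn: u32,      // number of commits touching the file in the window
    complexity: u32, // cyclomatic complexity (count of decision points)
    loc: u32,        // lines of code, blanks and comments excluded
}

/// Assumed scoring: the plain product of the three factors.
/// Widened to u64 so the multiplication cannot overflow.
fn hotspot_score(file: &FileMetrics) -> u64 {
    u64::from(file.churn) * u64::from(file.complexity) * u64::from(file.loc)
}

/// Sort files so that the highest hotspot score comes first.
fn rank_hotspots(mut files: Vec<FileMetrics>) -> Vec<FileMetrics> {
    files.sort_by(|a, b| hotspot_score(b).cmp(&hotspot_score(a)));
    files
}
```

A product rewards files that are high on all three axes at once: a large but rarely changed file, or a frequently changed but trivial one, ranks below a file that is big, complex, and churning.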

Example output

CLI (default)

━━━ Barad-dur ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Repository: myTool on main
  Scope: 18 commits, 2 authors, 32 files
  Window: last 6 months

  Overall Score: ███████████████░░░░░ 77/100

  ▸ Health        ████████░░░░ 72/100
  ▸ Team          ████████░░░░ 74/100
  ▸ Evolution     ████████░░░░ 72/100
  ▸ Git Hygiene   ███████████░ 93/100

  Top Actions:
  1. [Health] Bus factor (score: 20) — Increase code review coverage
  2. [Team] Collaboration patterns (score: 25) — Break directory silos
  3. [Evolution] Growth trend (score: 40) — Monitor growth rate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

HTML report (--html)

A self-contained single-file HTML report with:

  • Overview — score gauge, radar chart, expandable category cards, top recommendations
  • Hotspots — scatter plot (complexity vs churn, radius = LOC) with axis ticks + sortable, filterable table (score, CC, churn, bug-fix commits, LOC); clicking a row or a bubble highlights its counterpart
  • Coupling — temporal coupling pairs ranked by coupling percentage, with auto-exclusion of expected pairs (lock files, test pairs, module indexes) and a per-file instability table (Ca / Ce / I)
  • Graph — interactive force-directed import dependency graph: click-to-focus neighbourhoods, circular dependencies as dashed red edges, directory grouping, min-degree filter, SVG export
  • Ownership — per-file ownership bars derived from blame, with author legend
  • Age — file staleness with age bands (Fresh / > 3mo / > 6mo / > 1y)
  • Treemap — zoomable file treemap (rectangles or circle packing) sized by LOC, colorable by metric, with per-file detail panel
  • Trends — historical score evolution from trends.json (see backfill)
  • Authors — per-contributor report cards
  • Dependencies — libyear drift and CVE findings (when --deps is enabled)
  • Audit — crisis files, directory concentration, dead files, velocity buckets
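The age bands listed for the Age tab could be derived from a file's staleness roughly as in the sketch below. The enum, the function name, and the day thresholds (90, 180, 365) are assumptions chosen to match the band labels; the tool's actual cut-offs are not stated here.

```rust
/// Staleness bands mirroring the labels Fresh / > 3mo / > 6mo / > 1y.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgeBand {
    Fresh,
    OverThreeMonths,
    OverSixMonths,
    OverOneYear,
}

/// Map "days since last modification" to a band.
/// Thresholds are illustrative approximations of 3, 6 and 12 months.
fn age_band(days_since_last_change: u32) -> AgeBand {
    match days_since_last_change {
        d if d > 365 => AgeBand::OverOneYear,
        d if d > 180 => AgeBand::OverSixMonths,
        d if d > 90 => AgeBand::OverThreeMonths,
        _ => AgeBand::Fresh,
    }
}
```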

Tabs are connected: file cells drill through to the Hotspots and Graph views, and the current tab + selected file are mirrored into the URL hash (#tab=graph&file=src/scorer.rs) so any view can be shared as a deep link.

No external dependencies — all CSS, JS, and data are inlined. Works offline. Dark theme with a light-mode toggle.

Live example — this repo (updated on every push to main)

Installation

From crates.io

cargo install barad-dur

Prerequisites

  • Rust 1.85+ (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
  • System deps: build-essential cmake pkg-config libssl-dev (for libgit2)
  • git in PATH (used for blame collection)

Build from source

git clone git@lab.frogg.it:Edouard_Mangel/barad-dur.git
cd barad-dur
./init.sh          # installs deps + builds
# or manually:
cargo build --release

The binary is at target/release/barad-dur.

Docker

Pull the pre-built image

Pre-built images are published to the GitLab container registry on every release:

docker pull lab.frogg.it:5050/edouard_mangel/barad-dur:latest
# or pin to a specific version:
docker pull lab.frogg.it:5050/edouard_mangel/barad-dur:v0.17.3

Run it by mounting a repository into /repo:

docker run --rm -v /path/to/repo:/repo lab.frogg.it:5050/edouard_mangel/barad-dur        # CLI summary
docker run --rm -v /path/to/repo:/repo lab.frogg.it:5050/edouard_mangel/barad-dur analyze . --json
docker run --rm -v /path/to/repo:/repo -v "$(pwd)":/output \
  lab.frogg.it:5050/edouard_mangel/barad-dur analyze . --html -o /output/report.html

All images are signed with cosign. Verify with the public key in cosign.pub:

cosign verify --key cosign.pub lab.frogg.it:5050/edouard_mangel/barad-dur:latest

Build the image from source

Build a minimal (~31MB) container image from scratch:

# Using the install script (recommended)
./install.sh --docker                        # builds barad-dur:latest
./install.sh --docker -t myorg/barad-dur:v1  # custom image tag

# Or directly with docker
docker build -t barad-dur .

Run it by mounting a repository into /repo:

docker run --rm -v /path/to/repo:/repo barad-dur                          # CLI summary
docker run --rm -v /path/to/repo:/repo barad-dur analyze . -v             # verbose
docker run --rm -v /path/to/repo:/repo barad-dur analyze . --json         # JSON
docker run --rm -v /path/to/repo:/repo -v "$(pwd)":/output \
  barad-dur analyze . --html -o /output/report.html                       # HTML report

Distributing as a tarball

Export the image for sharing without a registry:

docker save barad-dur:latest | gzip > barad-dur.tar.gz

Load it on another machine:

docker load < barad-dur.tar.gz

Usage

analyze

# Analyze current directory (all categories, last 6 months)
barad-dur analyze .

# Verbose output (show individual metrics)
barad-dur analyze . -v
barad-dur analyze . -vv   # also show raw values

# JSON output (for CI/CD integration)
barad-dur analyze . --json
barad-dur analyze . --json --pretty

# JSON with trend history and velocity
barad-dur analyze . --json --trend --pretty

# HTML report (self-contained, open in browser)
barad-dur analyze . --html
barad-dur analyze . --html -o report.html
barad-dur analyze . --open   # generate HTML + open immediately in browser

# Single category
barad-dur analyze . --health
barad-dur analyze . --team
barad-dur analyze . --evolution
barad-dur analyze . --hygiene
barad-dur analyze . --deps   # dependency drift + CVE detection (requires network)

# Custom time window
barad-dur analyze . --since 3months
barad-dur analyze . --since 2024-01-01 --until 2024-12-31
barad-dur analyze . --all   # full history

# Output to file
barad-dur analyze . --json -o report.json

# Cache control
barad-dur analyze . --no-cache     # force re-collection
barad-dur analyze . --cache-only   # fail if no cache

# Performance
barad-dur analyze . --skip-blame   # skip blame phase (faster; blame-dependent metrics get defaults)

# Filtering — exclude files from all analysis phases
barad-dur analyze . --exclude '*.resx' --exclude 'i18n/**'   # glob patterns
barad-dur analyze . --exclude-ext jar --exclude-ext min.js    # by extension (case-insensitive, no dot needed)
barad-dur analyze . --no-default-excludes                     # disable built-in translation/resource exclusions
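The extension rules described for --exclude-ext (case-insensitive, optional leading dot, compound extensions such as min.js) can be approximated with a suffix check. The sketch below is an illustration under those stated rules, not the tool's implementation; the function name is hypothetical, and requiring a non-empty file stem is an assumption.

```rust
/// Return true when `path` ends in the given extension.
/// Matching is case-insensitive, a leading dot in `ext` is optional,
/// and compound extensions such as "min.js" are compared as a whole.
fn has_excluded_ext(path: &str, ext: &str) -> bool {
    // Only the last path component is relevant.
    let file_name = path
        .rsplit(|c: char| c == '/' || c == '\\')
        .next()
        .unwrap_or(path)
        .to_lowercase();
    let ext = ext.trim_start_matches('.').to_lowercase();
    if ext.is_empty() {
        return false;
    }
    let suffix = format!(".{ext}");
    // Assumption: the file must have a non-empty stem before the extension,
    // so a file literally named ".jar" is not treated as a match.
    file_name.len() > suffix.len() && file_name.ends_with(&suffix)
}
```

Comparing the whole suffix is what lets "min.js" exclude dist/app.min.js while leaving dist/app.js in the analysis.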

gate

Quality gate for CI/CD — exits non-zero if scores fall below threshold.

barad-dur gate .                              # overall score >= 60 (default)
barad-dur gate . --min-score 70              # custom threshold
barad-dur gate . --category health           # check health category only
barad-dur gate . --category health --category team  # check multiple categories
barad-dur gate . --max-decline 2.0           # also fail if score drops > 2 pts/run on average
barad-dur gate . --skip-blame                # faster check; blame metrics get defaults

init

Generate a .repository-analysis/barad-dur.toml config file with smart defaults.

barad-dur init .               # auto-detect patterns and write config
barad-dur init . --interactive # guided wizard
barad-dur init . --force       # overwrite existing config

watch

Install or remove a post-commit git hook that re-runs analysis automatically.

barad-dur watch .              # install hook
barad-dur watch . --uninstall  # remove hook
barad-dur watch . --skip-blame # install hook without blame (faster commits)

contributors

Detect suspected duplicate contributors and suggest .mailmap entries.

barad-dur contributors .                # show suggested deduplication
barad-dur contributors . --write        # append suggestions to .mailmap
barad-dur contributors . --since 3months

backfill

Walk commit history to populate trends.json with historical analysis snapshots.

barad-dur backfill .            # backfill full history
barad-dur backfill . --no-blame # skip blame (faster)

coupling

Analyze cross-repository coupling — discovers repos under a root directory and ranks pairs by temporal, team, and dependency coupling signals.

barad-dur coupling /path/to/workspace         # analyze all repos
barad-dur coupling . --min-score 50           # only show pairs scoring >= 50
barad-dur coupling . --coupling-window 12h    # 12-hour temporal window (default: 24h)
barad-dur coupling . --since 3months --json   # limit history + JSON output

Remote repository analysis

Barad-dur can analyze any remote repository by URL — it clones into a temp directory, runs analysis, and cleans up automatically:

# Analyze a remote repo (HTTPS or SSH)
barad-dur analyze https://github.com/BurntSushi/ripgrep
barad-dur analyze git@github.com:BurntSushi/ripgrep.git

# With GitHub API enrichment (stars, description, language, open issues)
barad-dur analyze https://github.com/BurntSushi/ripgrep --token ghp_xxxxxxxxxxxx

When --token is provided and the target is a GitHub URL, the report is enriched with metadata from the GitHub API (stars, primary language, description, open issues count). The token needs at least public_repo scope (or repo for private repositories).

Operational notes

  • Cache: Snapshots are cached at .repository-analysis/snapshot.bin (auto-added to .gitignore). Subsequent runs are instant if HEAD hasn't changed. Use --no-cache to force re-collection, --cache-only to fail if no cache exists.
  • Progress: In interactive mode (non-JSON, non-HTML), a progress spinner shows collection stages (commits, file tree, blame, complexity, indexes).
  • Shallow clones: Detected automatically with a warning. For accurate CI/CD results, ensure a full clone (GIT_DEPTH=0 in GitLab CI).
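The cache behaviour described above (reuse while HEAD is unchanged, --no-cache, --cache-only) can be modelled as a small decision function. Everything in this sketch is hypothetical naming; in particular, reusing an existing snapshot under --cache-only even when HEAD has moved is an assumption, since the text only says that the flag fails when no cache exists.

```rust
/// How the user asked the cache to be treated.
enum CacheMode {
    Default,   // no cache flag given
    NoCache,   // --no-cache
    CacheOnly, // --cache-only
}

/// What the collector should do next.
#[derive(Debug, PartialEq, Eq)]
enum CacheDecision {
    Reuse,
    Collect,
    Fail,
}

/// `cached_head` is the commit id stored with the snapshot, if a snapshot exists.
fn decide(mode: CacheMode, cached_head: Option<&str>, current_head: &str) -> CacheDecision {
    match (mode, cached_head) {
        (CacheMode::NoCache, _) => CacheDecision::Collect,
        (CacheMode::CacheOnly, None) => CacheDecision::Fail,
        // Assumption: any existing snapshot is accepted in cache-only mode.
        (CacheMode::CacheOnly, Some(_)) => CacheDecision::Reuse,
        (CacheMode::Default, Some(head)) if head == current_head => CacheDecision::Reuse,
        (CacheMode::Default, _) => CacheDecision::Collect,
    }
}
```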

Configuration

barad-dur init . generates .repository-analysis/barad-dur.toml with smart defaults detected from the repository (translation files, vendored paths, team patterns). You can also run barad-dur init . --interactive for a guided wizard.

# .repository-analysis/barad-dur.toml

[analysis]
skip_blame = false

[analysis.weights]
# Must sum to 100. 'deps' is 0 by default (opt-in via --deps).
health    = 35
team      = 10
evolution = 20
hygiene   = 15
coupling  = 20
deps      = 0

[exclude]
# Glob patterns — same as repeating --exclude on the CLI
patterns = ["i18n/**", "vendor/**", "*.generated.cs"]

# File extensions — same as repeating --exclude-ext on the CLI
# Bare extension or compound (e.g. "min.js"). Case-insensitive, no dot needed.
extensions = ["jar", "min.js", "resx"]
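Because the weights must sum to 100, a configuration loader would typically validate them before scoring. The sketch below shows one way to express that rule; the `Weights` struct and `validate` function are hypothetical and are not taken from the barad-dur source.

```rust
/// Mirror of the [analysis.weights] table (hypothetical struct).
struct Weights {
    health: u32,
    team: u32,
    evolution: u32,
    hygiene: u32,
    coupling: u32,
    deps: u32,
}

/// Check the documented constraint: all weights together must equal 100.
fn validate(weights: &Weights) -> Result<(), String> {
    let total = weights.health
        + weights.team
        + weights.evolution
        + weights.hygiene
        + weights.coupling
        + weights.deps;
    if total == 100 {
        Ok(())
    } else {
        Err(format!("weights sum to {total}, expected 100"))
    }
}
```

Under this rule, giving deps a non-zero weight means taking the same number of points away from the other categories.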

CI/CD Integration

The JSON output is designed for pipeline consumption:

barad-dur:
  stage: analysis
  variables:
    GIT_DEPTH: 0  # full clone for accurate metrics
  script:
    - barad-dur analyze . --json -o report.json
  artifacts:
    paths:
      - report.json

Parse the JSON to enforce thresholds:

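SCORE=$(barad-dur analyze . --json | jq '.overall_score')
# Compare inside jq: [ -lt ] accepts integers only and errors on a value like 49.5
if ! jq -n -e --argjson score "$SCORE" '$score >= 50' > /dev/null; then
  echo "Repository health score $SCORE is below threshold"
  exit 1
fi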

Use --html -o report.html instead of --json to generate an HTML artifact.

JSON output schema

The JSON output includes these top-level fields:

| Field | Type | Description |
|---|---|---|
| repo_name | string | Repository name |
| branch | string | Current branch |
| time_window_months | number | Analysis window (0 = full history) |
| total_commits | number | Commits in window |
| total_authors | number | Unique authors |
| total_files | number | Files in tree |
| overall_score | number | Weighted score (0-100) |
| categories | array | Per-category scores and metrics |
| top_actions | array | Suggested improvements |
| remote_meta | object or null | Remote repo metadata (populated for URL targets; enriched with GitHub API data when --token is provided) |
| file_hotspots | array | Files ranked by hotspot score (churn x complexity x LOC), incl. bug-fix commit counts |
| coupling_pairs | array | Temporally coupled file pairs with coupling percentage |
| per_file_coupling | array | Per-file afferent/efferent coupling and instability (Ca / Ce / I) |
| import_edges | array | Directed static import graph edges (from imports to) |
| import_cycles | array | Circular import groups (member file lists, depth 1–2) |
| author_ownership | array | Per-file ownership breakdown from blame |
| file_ages | array | File staleness (days since last modification) |
| author_cards | array | Per-contributor activity report cards |
| history | array | Prior analysis snapshots (from trends.json) |
| dep_ecosystem_reports | array | Per-ecosystem dependency drift and CVE findings (with --deps) |
| audit | object or null | Crisis files, directory concentration, dead files, velocity buckets |
| score_thresholds | object | Score band cut-offs (good/warn) used by all renderers |
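The Ca / Ce / I columns follow the usual definition of instability, I = Ce / (Ca + Ce), where Ca counts incoming and Ce outgoing dependencies. As a hedged illustration, the sketch below derives these values from a list of directed (from, to) import edges like those in import_edges. The function names are hypothetical, and it assumes each edge appears only once.

```rust
use std::collections::HashMap;

/// Count (afferent, efferent) couplings per file from directed import edges.
/// An edge (from, to) means `from` imports `to`.
/// Assumption: the edge list contains no duplicates.
fn coupling_counts<'a>(edges: &[(&'a str, &'a str)]) -> HashMap<&'a str, (u32, u32)> {
    let mut counts: HashMap<&'a str, (u32, u32)> = HashMap::new();
    for &(from, to) in edges {
        counts.entry(from).or_default().1 += 1; // efferent: `from` depends on something
        counts.entry(to).or_default().0 += 1; // afferent: something depends on `to`
    }
    counts
}

/// Instability I = Ce / (Ca + Ce): 0.0 is maximally stable, 1.0 maximally unstable.
/// Returns None for a file with no couplings at all.
fn instability(ca: u32, ce: u32) -> Option<f64> {
    let total = ca + ce;
    if total == 0 {
        None
    } else {
        Some(f64::from(ce) / f64::from(total))
    }
}
```

A file that many others import but that imports nothing itself scores 0.0, which makes it a stable foundation and an expensive one to change.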

Architecture

CLI (clap) → Collector (git2 + git CLI) → RepoSnapshot → Metrics → Scorer → Renderer
                                              ↕                         ↓
                                        Cache (bincode)          CLI / JSON / HTML

  • Collector: git2 for commits/files, git blame --porcelain (parallel via rayon) for blame, static file analysis for complexity
  • RepoSnapshot: shared data model with derived indexes (commits by author/file, change pairs, file metrics)
  • Metrics: pure functions (snapshot) → MetricValue, independently testable
  • Scorer: weighted category scores + top action suggestions + file-level analysis (hotspots, coupling, ownership, ages)
  • Renderer: colored CLI, JSON, or self-contained HTML output
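The "pure functions (snapshot) → MetricValue" design can be pictured with the following sketch. Only the names RepoSnapshot and MetricValue come from the description above. Their fields, the `Metric` alias, the example metric, and its 25-points-per-contributor scoring are invented for illustration and do not reflect the real data model.

```rust
use std::collections::HashSet;

/// Stand-in for the shared data model (fields are hypothetical).
struct RepoSnapshot {
    commit_authors: Vec<String>, // one entry per commit
}

/// Stand-in for a computed metric (fields are hypothetical).
#[derive(Debug, PartialEq)]
struct MetricValue {
    name: &'static str,
    raw: f64,
    score: u8, // 0 to 100
}

/// A metric is a pure function of the snapshot: no I/O, no shared state.
type Metric = fn(&RepoSnapshot) -> MetricValue;

/// Example metric: number of distinct commit authors.
/// The scoring rule (25 points per author, capped at 100) is made up.
fn contributor_count(snapshot: &RepoSnapshot) -> MetricValue {
    let unique: HashSet<&str> = snapshot.commit_authors.iter().map(String::as_str).collect();
    MetricValue {
        name: "contributor_count",
        raw: unique.len() as f64,
        score: (unique.len() * 25).min(100) as u8,
    }
}

/// Run every metric against the same immutable snapshot.
fn run_metrics(snapshot: &RepoSnapshot, metrics: &[Metric]) -> Vec<MetricValue> {
    metrics.iter().map(|metric| metric(snapshot)).collect()
}
```

Because each metric only reads the snapshot, it can be unit-tested with a hand-built RepoSnapshot and no git repository on disk, which is what "independently testable" suggests.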

See Architecture Decision Record for detailed design rationale.

Development

# Run all tests
cargo test

# Lint
cargo fmt -- --check
cargo clippy --all-targets -- -D warnings

# Run specific test suites
cargo test --lib                    # unit tests
cargo test --test collector_tests   # collector integration tests
cargo test --test integration_tests # end-to-end tests

# Dogfood
cargo run -- analyze . -v

Test quality is enforced with mutation testing (cargo-mutants): every push to main runs mutants scoped to the change's diff behind a ≥ 80% kill-rate gate, and a nightly job covers the full codebase.

Shipped

  • v0.5.0 — AST analysis via tree-sitter (Rust, JS, TS, Python, Go, Java, C#), historical trend tracking with backfill, per-blob blame cache
  • v0.6.0 — Author report cards, cross-tab drill-through links, CI quality gate (barad-dur gate), parallel complexity analysis (9× speedup on large repos)
  • v0.7.0 — File age tab (staleness bands), ownership tab (per-file blame breakdown and author legend)
  • v0.8.0 — Temporal coupling matrix heatmap, interactive force-directed graph, multi-repo coupling, O(n) coupling algorithm
  • v0.9.0 — GitLab Pages landing page, reusable CI template (templates/analyze.yml), binary download from package registry
  • v0.10.0 — RLE-aware blame (lower memory on large files), stable median scoring, BARAD_DUR_TEST_REPO env var for test isolation
  • v0.11.0 — CI: parallel fmt/clippy/test, cargo audit, binary artifact reuse between jobs; git2 upgrade (RUSTSEC-2026-0008)
  • v0.12.0 — New Dependencies category: libyear drift + CVE detection via OSV API; supports Cargo, npm, pip, NuGet; results cached 7 days; Dependencies tab in HTML report
  • v0.13.0 — barad-dur init (config wizard with auto-detection), barad-dur watch (post-commit hook), barad-dur contributors (duplicate detection + .mailmap suggestions)
  • v0.14.0 — barad-dur backfill (adaptive-sampling history walk), --trend flag (velocity + delta in JSON), --open flag (HTML + browser), --skip-blame for analyze and gate
  • v0.15.0 — File exclusion: --exclude (glob), --exclude-ext (extension), --no-default-excludes; built-in exclusions for translation/resource files (*.resx, *.po, *.xlf, *.strings…)
  • v0.16.0 — gate --max-decline (fail if score drops faster than N pts/run averaged over 8 runs), coupling --coupling-window (configurable temporal window)
  • v0.17.0 — [exclude] extensions config key in barad-dur.toml, compound extension support (min.js), case-insensitive extension matching
  • Unreleased — import dependency Graph tab (cycles, click-to-focus, directory grouping, min-degree filter, SVG export), raw import_edges/import_cycles in JSON, ../-import resolution fix, Hotspots polish (Bugs column, axis ticks, filter, bidirectional row↔dot highlight), cross-tab file navigation with URL-hash deep links, per-feature mutation testing gate in CI

Roadmap

  • PR/merge request analysis (review turnaround, approval patterns)
  • GitHub/GitLab API integration for PR data
  • Multi-repo dashboard (aggregate scores across repositories)
  • Interactive config editor — TUI wizard (see backlog)

License

GPL-3.0-only — see LICENSE.