Barad-dur

The all-seeing repository analyzer. Get health metrics, team insights, and actionable recommendations for any git repository — local or remote.

Named after the Dark Tower of Mordor — because nothing escapes its gaze.

What it does

Barad-dur analyzes git metadata (commits, blame, file tree) and source code complexity, then produces a scored report across 5 weighted categories, plus an optional Dependencies category:

| Category | Metrics | Weight |
|---|---|---|
| Health | Bus factor, churn hotspots, stale code, file complexity | 35% |
| Coupling | Afferent/efferent coupling, circular deps, change coupling smells | 20% |
| Evolution | Growth trend, refactoring ratio, code age, commit cadence | 20% |
| Git Hygiene | Commit message quality, history cleanliness, gitignore coverage | 15% |
| Team | Knowledge distribution (Gini), contributor activity, ownership clarity, silos, merge patterns | 10% |
| Dependencies (optional) | Dependency drift (libyear), vulnerability detection via OSV | 0% by default |

Each metric scores 0-100. A category score is the average of its metric scores, and the overall score is the average of the category scores, weighted as shown in the table. The report also lists Top Actions: concrete suggestions derived from the lowest-scoring metrics.
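
The scoring rule can be sketched in a few lines of Python. This is an illustration only, not Barad-dur's code: the function names are hypothetical, the metric and category values are made up, and renormalizing the weights over the categories that were actually scored is an assumption.

```python
# Default weights from the category table; Dependencies is 0 by default.
DEFAULT_WEIGHTS = {
    "Health": 35,
    "Coupling": 20,
    "Evolution": 20,
    "Git Hygiene": 15,
    "Team": 10,
    "Dependencies": 0,
}


def category_score(metric_scores):
    """Average of the 0-100 metric scores in one category."""
    return sum(metric_scores) / len(metric_scores)


def overall_score(category_scores, weights=DEFAULT_WEIGHTS):
    """Weighted average over the categories that were scored.

    Assumption: weights are renormalized over the categories present,
    so a run that skips a category still yields a 0-100 result.
    """
    total_weight = sum(weights[name] for name in category_scores)
    if total_weight == 0:
        raise ValueError("no category with a non-zero weight")
    weighted = sum(score * weights[name] for name, score in category_scores.items())
    return weighted / total_weight


# Made-up scores for illustration.
scores = {
    "Health": category_score([20, 90, 80, 90]),  # average is 70.0
    "Coupling": 80.0,
    "Evolution": 60.0,
    "Git Hygiene": 90.0,
    "Team": 50.0,
}
print(round(overall_score(scores), 1))  # prints 71.0
```

A single weak metric (the 20 above) lowers its category average by a fixed share, which is why the report surfaces low-scoring metrics separately as Top Actions.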

File-level analysis

Beyond git metadata, Barad-dur performs static complexity analysis on source files with language-aware parsing:

| Language | Extensions | What's measured |
|---|---|---|
| Rust | `.rs` | `pub fn` / `pub async fn`, public struct fields, cyclomatic complexity |
| JavaScript/TypeScript | `.js`, `.ts`, `.jsx`, `.tsx`, `.mjs`, `.cjs` | Exports, public class members, properties |
| Python | `.py` | Public defs, `self.*` properties |
| Go | `.go` | Exported functions (uppercase), exported struct fields |
| JVM (Java/Kotlin) | `.java`, `.kt`, `.kts` | Public methods, field declarations |
| CLR (C#) | `.cs` | Public methods, field declarations |

This produces four per-file metrics: LOC (excluding blanks and comments), cyclomatic complexity (counted as decision points), public methods, and properties. These feed the hotspot analysis, which ranks files by churn x complexity x size.
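
As a rough illustration of the hotspot idea, the sketch below multiplies the three factors and sorts files by the product. It assumes the raw, unnormalized product is used; the actual tool may scale or normalize each factor. The `FileMetrics` type and the sample file paths and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileMetrics:
    path: str
    churn: int       # commits touching the file in the analysis window
    complexity: int  # cyclomatic complexity (decision points)
    loc: int         # lines of code, blanks and comments excluded


def hotspot_score(m: FileMetrics) -> int:
    """Raw product of the three factors; any zero factor gives zero."""
    return m.churn * m.complexity * m.loc


def rank_hotspots(files):
    """Files ordered from hottest to coolest."""
    return sorted(files, key=hotspot_score, reverse=True)


# Hypothetical sample data.
files = [
    FileMetrics("src/main.rs", churn=20, complexity=5, loc=80),
    FileMetrics("README.md", churn=25, complexity=0, loc=150),
    FileMetrics("src/scorer.rs", churn=12, complexity=30, loc=400),
]
for f in rank_hotspots(files):
    print(f.path, hotspot_score(f))
```

Because the factors are multiplied, a file that changes often but has no decision points (the README here) never ranks as a hotspot, while a large, complex file with moderate churn rises to the top.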

Example output

CLI (default)

━━━ Barad-dur ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Repository: myTool on main
  Scope: 18 commits, 2 authors, 32 files
  Window: last 6 months

  Overall Score: ███████████████░░░░░ 77/100

  ▸ Health        ████████░░░░ 72/100
  ▸ Team          ████████░░░░ 74/100
  ▸ Evolution     ████████░░░░ 72/100
  ▸ Git Hygiene   ███████████░ 93/100

  Top Actions:
  1. [Health] Bus factor (score: 20) — Increase code review coverage
  2. [Team] Collaboration patterns (score: 25) — Break directory silos
  3. [Evolution] Growth trend (score: 40) — Monitor growth rate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

HTML report (`--html`)

A self-contained single-file HTML report with:

Overview tab — score gauge, radar chart, expandable category cards, top recommendations
Hotspots tab — scatter plot (complexity vs churn, radius = LOC) + sortable table
Coupling tab — temporal coupling pairs ranked by coupling percentage
Ownership tab — per-file ownership bars derived from blame, with author legend
Age tab — file staleness with age bands (Fresh / > 3mo / > 6mo / > 1y)

No external dependencies — all CSS, JS, and data are inlined. Works offline. Dark theme.

Live example — this repo (updated on every push to main)

Installation

From crates.io

cargo install barad-dur

Prerequisites

Rust 1.85+ (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
System deps: build-essential cmake pkg-config libssl-dev (for libgit2)
git in PATH (used for blame collection)

Build from source

git clone git@lab.frogg.it:Edouard_Mangel/barad-dur.git
cd barad-dur
./init.sh          # installs deps + builds
# or manually:
cargo build --release

The binary is at target/release/barad-dur.

Docker

Build a minimal (~31MB) container image from scratch:

# Using the install script (recommended)
./install.sh --docker                        # builds barad-dur:latest
./install.sh --docker -t myorg/barad-dur:v1  # custom image tag

# Or directly with docker
docker build -t barad-dur .

Run it by mounting a repository into /repo:

docker run --rm -v /path/to/repo:/repo barad-dur                          # CLI summary
docker run --rm -v /path/to/repo:/repo barad-dur analyze . -v             # verbose
docker run --rm -v /path/to/repo:/repo barad-dur analyze . --json         # JSON
docker run --rm -v /path/to/repo:/repo -v "$(pwd)":/output \
  barad-dur analyze . --html -o /output/report.html                       # HTML report

Distributing as a tarball

Export the image for sharing without a registry:

docker save barad-dur:latest | gzip > barad-dur.tar.gz

Load it on another machine:

docker load < barad-dur.tar.gz

Usage

# Analyze current directory (all categories, last 6 months)
barad-dur analyze .

# Verbose output (show individual metrics)
barad-dur analyze . -v
barad-dur analyze . -vv   # also show raw values

# JSON output (for CI/CD integration)
barad-dur analyze . --json
barad-dur analyze . --json --pretty

# HTML report (self-contained, open in browser)
barad-dur analyze . --html
barad-dur analyze . --html -o report.html

# Single category
barad-dur analyze . --health
barad-dur analyze . --team
barad-dur analyze . --evolution
barad-dur analyze . --hygiene

# Custom time window
barad-dur analyze . --since 3months
barad-dur analyze . --since 2024-01-01 --until 2024-12-31
barad-dur analyze . --all   # full history

# Output to file
barad-dur analyze . --json -o report.json

# Cache control
barad-dur analyze . --no-cache     # force re-collection
barad-dur analyze . --cache-only   # fail if no cache

Remote repository analysis

Barad-dur can analyze any remote repository by URL — it clones into a temp directory, runs analysis, and cleans up automatically:

# Analyze a remote repo (HTTPS or SSH)
barad-dur analyze https://github.com/BurntSushi/ripgrep
barad-dur analyze git@github.com:BurntSushi/ripgrep.git

# With GitHub API enrichment (stars, description, language, open issues)
barad-dur analyze https://github.com/BurntSushi/ripgrep --token ghp_xxxxxxxxxxxx

When --token is provided and the target is a GitHub URL, the report is enriched with metadata from the GitHub API (stars, primary language, description, open issue count). The token needs at least the public_repo scope (or repo for private repositories).
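
The two decisions described above (is the target remote, and does it qualify for GitHub enrichment) can be sketched as pure functions. This is a hypothetical illustration: `remote_host` and `should_enrich` are made-up names, and matching on the `github.com` host is an assumption about how a "GitHub URL" is recognized.

```python
import re
from urllib.parse import urlparse

# scp-like SSH syntax, e.g. git@github.com:owner/repo.git
SCP_LIKE = re.compile(r"^[\w.-]+@(?P<host>[\w.-]+):(?P<path>.+)$")


def remote_host(target):
    """Return the host for a remote URL, or None for a local path."""
    match = SCP_LIKE.match(target)
    if match:
        return match.group("host")
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https", "ssh", "git") and parsed.hostname:
        return parsed.hostname
    return None


def should_enrich(target, token):
    """Enrichment needs both a token and a github.com target (assumed rule)."""
    return bool(token) and remote_host(target) == "github.com"


print(remote_host("https://github.com/BurntSushi/ripgrep"))
print(remote_host("git@github.com:BurntSushi/ripgrep.git"))
print(remote_host("."))
```

Under this sketch, passing a token with a non-GitHub remote or a local path simply leaves the report without the extra metadata.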

Operational notes

Cache: Snapshots are cached at .repository-analysis/snapshot.bin (auto-added to .gitignore). If HEAD hasn't changed, subsequent runs reuse the snapshot and skip collection. Use --no-cache to force re-collection, or --cache-only to fail if no cache exists.
Progress: In interactive mode (non-JSON, non-HTML), a progress spinner shows collection stages (commits, file tree, blame, complexity, indexes).
Shallow clones: Detected automatically with a warning. For accurate CI/CD results, ensure a full clone (GIT_DEPTH=0 in GitLab CI).
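
The cache behavior described above can be sketched as follows. This is a simplified model under stated assumptions: the real cache is a binary snapshot, whereas this sketch stores JSON to stay self-contained; keying the cache on the HEAD commit id is inferred from "if HEAD hasn't changed"; and treating a stale cache like a missing one under `cache_only` is a guess. The function and parameter names are hypothetical.

```python
import json
from pathlib import Path


def load_or_collect(cache_file, head_sha, collect, no_cache=False, cache_only=False):
    """Return a snapshot, reusing the cached one when HEAD is unchanged."""
    cache_file = Path(cache_file)
    if not no_cache and cache_file.exists():
        cached = json.loads(cache_file.read_text())
        if cached["head"] == head_sha:
            return cached["snapshot"]  # cache hit: skip collection
    if cache_only:
        # Assumption: a stale cache counts as "no cache" here.
        raise FileNotFoundError(f"no usable cache at {cache_file}")
    snapshot = collect()  # the expensive part: commits, blame, complexity
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps({"head": head_sha, "snapshot": snapshot}))
    return snapshot
```

Passing the collector in as a callable keeps the caching decision separate from the collection work, so the hit, miss, and fail paths can each be exercised without a real repository.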

CI/CD Integration

The JSON output is designed for pipeline consumption:

barad-dur:
  stage: analysis
  variables:
    GIT_DEPTH: 0  # full clone for accurate metrics
  script:
    - barad-dur analyze . --json -o report.json
  artifacts:
    paths:
      - report.json

Parse the JSON to enforce thresholds:

# floor keeps the integer comparison valid if the score is fractional
SCORE=$(barad-dur analyze . --json | jq '.overall_score | floor')
if [ "$SCORE" -lt 50 ]; then
  echo "Repository health score $SCORE is below threshold"
  exit 1
fi
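
When one overall threshold is not enough, the same check can be written as a small script. The sketch below is an assumption-laden example: only `overall_score` and `categories` come from the documented schema, while the `name` and `score` keys inside each category entry, the `check_report` helper, and the sample report are hypothetical.

```python
import json


def check_report(report, min_overall=50, min_category=None):
    """Return a list of human-readable threshold violations."""
    failures = []
    if report["overall_score"] < min_overall:
        failures.append(f"overall score {report['overall_score']} < {min_overall}")
    # Assumption: each entry of `categories` has `name` and `score` keys.
    for category in report.get("categories", []):
        if min_category is not None and category["score"] < min_category:
            failures.append(f"{category['name']} score {category['score']} < {min_category}")
    return failures


# Hypothetical report fragment for illustration.
report = json.loads("""
{"overall_score": 77,
 "categories": [{"name": "Health", "score": 72},
                {"name": "Git Hygiene", "score": 93}]}
""")
for failure in check_report(report, min_overall=50, min_category=75):
    print(failure)  # prints: Health score 72 < 75
```

In a pipeline, the report would be loaded from the report.json artifact, and a non-empty list of failures would translate into a non-zero exit code.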

To publish an HTML artifact instead, replace --json -o report.json with --html -o report.html and list report.html under artifacts.

JSON output schema

The JSON output includes these top-level fields:

| Field | Type | Description |
|---|---|---|
| `repo_name` | string | Repository name |
| `branch` | string | Current branch |
| `time_window_months` | number | Analysis window in months (0 = full history) |
| `total_commits` | number | Commits in window |
| `total_authors` | number | Unique authors |
| `total_files` | number | Files in tree |
| `overall_score` | number | Weighted score (0-100) |
| `categories` | array | Per-category scores and metrics |
| `top_actions` | array | Suggested improvements |
| `remote_meta` | object \| null | Remote repo metadata (populated for URL targets; enriched with GitHub API data when `--token` is provided) |
| `file_hotspots` | array | Files ranked by hotspot score (churn x complexity x LOC) |
| `coupling_pairs` | array | Temporally coupled file pairs with coupling percentage |
| `author_ownership` | array | Per-file ownership breakdown from blame |
| `file_ages` | array | File staleness (days since last modification) |
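
To make the `coupling_pairs` field more concrete, here is a naive sketch of temporal coupling: count how often two files change in the same commit and express that as a percentage. The percentage definition used here (shared commits over the commits of the less frequently changed file) is an assumption, and this pairwise approach is not the tool's own algorithm. The file paths and history are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coupling_pairs(commits, min_shared=2):
    """commits: iterable of sets of file paths changed together."""
    file_commits = Counter()
    pair_commits = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_commits.update(unique)
        pair_commits.update(combinations(unique, 2))
    pairs = []
    for (a, b), shared in pair_commits.items():
        if shared < min_shared:
            continue  # ignore pairs that co-changed too rarely
        # Assumed definition: shared commits over the commits of the
        # less frequently changed file of the pair.
        percent = 100 * shared / min(file_commits[a], file_commits[b])
        pairs.append((a, b, shared, round(percent, 1)))
    # Strongest coupling first, ties broken by path.
    return sorted(pairs, key=lambda p: (-p[3], p[0], p[1]))


# Hypothetical history: one set of changed paths per commit.
history = [
    {"src/scorer.rs", "src/render.rs"},
    {"src/scorer.rs", "src/render.rs", "README.md"},
    {"src/scorer.rs"},
    {"src/scorer.rs", "tests/scorer.rs"},
    {"src/render.rs", "README.md"},
]
for pair in coupling_pairs(history):
    print(pair)
```

The `min_shared` cutoff filters out pairs that happened to change together once, which would otherwise show up as 100% coupled.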

Architecture

CLI (clap) → Collector (git2 + git CLI) → RepoSnapshot → Metrics → Scorer → Renderer
                                              ↕                         ↓
                                        Cache (bincode)          CLI / JSON / HTML

Collector: git2 for commits/files, git blame --porcelain (parallel via rayon) for blame, static file analysis for complexity
RepoSnapshot: shared data model with derived indexes (commits by author/file, change pairs, file metrics)
Metrics: pure functions (snapshot) → MetricValue, independently testable
Scorer: weighted category scores + top action suggestions + file-level analysis (hotspots, coupling, ownership, ages)
Renderer: colored CLI, JSON, or self-contained HTML output
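
The "pure function from snapshot to MetricValue" shape can be illustrated with the knowledge-distribution metric, which the category table says is based on the Gini coefficient. This is a Python sketch of the idea rather than the project's Rust code: the `MetricValue` fields, the input shape (commits per author), and the linear mapping from Gini to score are all assumptions, and the author names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricValue:
    name: str
    raw: float  # the measured value
    score: int  # 0-100, higher is better


def gini(values):
    """Gini coefficient: 0 means perfectly even, values near 1 mean concentrated."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula over the sorted values with 1-based ranks.
    ranked = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * ranked) / (n * total) - (n + 1) / n


def knowledge_distribution(commits_by_author):
    """Pure metric: the result depends only on the data passed in."""
    g = gini(commits_by_author.values())
    # Assumed linear mapping from inequality to score.
    return MetricValue("knowledge_distribution", raw=g, score=round(100 * (1 - g)))


print(knowledge_distribution({"alice": 10, "bob": 10}))
print(knowledge_distribution({"alice": 40, "bob": 0, "carol": 0, "dave": 0}))
```

Since the function takes plain data and returns a value, it can be tested with hand-written inputs and no repository on disk, which is the property the architecture notes call "independently testable".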

See Architecture Decision Record for detailed design rationale.

Development

# Run all tests
cargo test

# Lint
cargo fmt -- --check
cargo clippy --all-targets -- -D warnings

# Run specific test suites
cargo test --lib                    # unit tests
cargo test --test collector_tests   # collector integration tests
cargo test --test integration_tests # end-to-end tests

# Dogfood
cargo run -- analyze . -v

Shipped

v0.5.0 — AST analysis via tree-sitter (Rust, JS, TS, Python, Go, Java, C#), historical trend tracking with backfill, per-blob blame cache
v0.6.0 — Author report cards, cross-tab drill-through links, CI quality gate (barad-dur gate), parallel complexity analysis (9x speedup on large repos)
v0.7.0 — Coupling subcommand with JSON output and HTML force-directed graph, O(n) coupling algorithm, matrix heatmap with dimension filters
v0.8.0 — Coupling category added to overall score, cross-platform release pipeline (Linux + Windows binaries)
v0.9.0 — GitLab Pages landing page, reusable CI template (templates/analyze.yml), dependency age analysis (libyear) and vulnerability detection via OSV
v0.11.0 — Config file support (barad-dur.toml), configurable weights and thresholds, skip-blame flag, exclude patterns
v0.12.0 — GitLab CI Catalog component (templates/analyze.yml), plain-include template with quality gate (templates/barad-dur.yml)

Roadmap

PR/merge request analysis (review turnaround, approval patterns)
GitHub/GitLab API integration for PR data
Multi-repo dashboard (aggregate scores across repositories)
Interactive config editor (see backlog)

License

GPL-3.0-only — see LICENSE.

barad-dur 0.13.0