# How It Works
MorphArch turns source code and Git history into a repository-level dependency
model that you can inspect from the terminal.
The basic idea is simple:
- raw dependency graphs are useful for debugging
- grouped views are better for understanding large systems
MorphArch builds both, but exposes them at different levels of detail.
---
## The Pipeline
### 1. Configuration
MorphArch loads `morpharch.toml` if one is present.
The config file can adjust:
- ignore paths and presets
- scoring weights and thresholds
- boundary rules
- clustering strategy
- semantic families, rules, and clustering constraints
- presentation aliases, kind mode, and color mode
If no config file exists, defaults are used.
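A minimal `morpharch.toml` might look like the sketch below. Every section and key name here is illustrative only, not the tool's actual schema:

```toml
# Illustrative sketch: section and key names are assumptions,
# not MorphArch's real configuration schema.
[ignore]
paths = ["target", "node_modules"]

[scoring]
# hypothetical weight/threshold keys
cycle_weight = 1.0

[clustering]
strategy = "hybrid"

[presentation]
color_mode = "auto"
```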
### 2. Repository Discovery
MorphArch walks Git history using `gix`.
During discovery it:
- follows the repository's first-parent history
- enumerates commits and file changes
- detects workspace structure
- skips ignored subtrees early
This keeps repeated scans and history replay deterministic and practical.
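First-parent traversal can be sketched with a toy commit map. This is std-only Rust for illustration, not the `gix` API:

```rust
use std::collections::HashMap;

// Toy commit record: just a list of parent ids, first parent first.
struct Commit {
    parents: Vec<&'static str>,
}

// Follow only the first parent from `head`, yielding the linear mainline
// history that discovery replays; merge side-branches are skipped.
fn first_parent_history(commits: &HashMap<&str, Commit>, head: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = Some(head.to_string());
    while let Some(id) = cur {
        out.push(id.clone());
        cur = commits
            .get(id.as_str())
            .and_then(|c| c.parents.first())
            .map(|p| p.to_string());
    }
    out
}

fn main() {
    // d is a merge commit: first parent c (mainline), second parent b (branch).
    let commits = HashMap::from([
        ("a", Commit { parents: vec![] }),
        ("b", Commit { parents: vec!["a"] }),
        ("c", Commit { parents: vec!["a"] }),
        ("d", Commit { parents: vec!["c", "b"] }),
    ]);
    println!("{:?}", first_parent_history(&commits, "d")); // ["d", "c", "a"]
}
```

Sticking to one parent per commit is what makes history replay a simple linear walk.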
### 3. Parsing
MorphArch uses language-aware import extraction.
In practice that means:
- safe fast paths that ignore comments and strings
- AST fallback when the fast path is not reliable
- accurate dependency edges for supported languages, rather than plain regex matching
Supported languages include Rust, TypeScript, JavaScript, Python, and Go.
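A fast path can be as simple as a line scanner that collects import statements while skipping comments. The sketch below handles Rust-style `use` lines; it is illustrative, not MorphArch's actual extractor, and a real one falls back to a full parse for anything ambiguous:

```rust
// Toy fast path: collect `use` targets from Rust-like source while skipping
// line comments and /* ... */ block comments. Deliberately simplistic: a
// real extractor hands ambiguous input to an AST parser instead.
fn extract_uses(src: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut in_block_comment = false;
    for line in src.lines() {
        let t = line.trim();
        if in_block_comment {
            if t.contains("*/") {
                in_block_comment = false;
            }
            continue;
        }
        if t.starts_with("//") {
            continue;
        }
        if t.starts_with("/*") {
            in_block_comment = !t.contains("*/");
            continue;
        }
        if let Some(rest) = t.strip_prefix("use ") {
            out.push(rest.trim_end_matches(';').to_string());
        }
    }
    out
}

fn main() {
    let src = "
use std::fmt;
// use commented::out;
/*
use also::commented;
*/
use crate::graph;
";
    println!("{:?}", extract_uses(src)); // ["std::fmt", "crate::graph"]
}
```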
### 4. Dependency Graph Synthesis
Parsed imports are mapped into a repository-level dependency graph.
- nodes represent packages or modules
- edges represent dependency relationships
- weights represent how many concrete imports sit behind a higher-level edge
This graph becomes the basis for scoring, grouping, and inspect mode.
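The synthesis step can be sketched as folding module-level import records into package-level weighted edges. Treating the first path segment as the package is an assumption made for illustration:

```rust
use std::collections::BTreeMap;

// Fold fine-grained import records (from-module, to-module) into
// package-level edges. The edge weight counts how many concrete imports
// were collapsed behind it, matching the weight semantics above.
fn synthesize(imports: &[(&str, &str)]) -> BTreeMap<(String, String), u32> {
    // Assumption for this sketch: package = first `::` path segment.
    let pkg = |m: &str| m.split("::").next().unwrap_or(m).to_string();
    let mut edges = BTreeMap::new();
    for (from, to) in imports {
        let (f, t) = (pkg(from), pkg(to));
        if f != t {
            *edges.entry((f, t)).or_insert(0) += 1;
        }
    }
    edges
}

fn main() {
    let imports = [
        ("cli::args", "core::graph"),
        ("cli::run", "core::scan"),
        ("core::scan", "core::graph"), // intra-package, dropped
    ];
    for ((f, t), w) in synthesize(&imports) {
        println!("{f} -> {t} (weight {w})"); // cli -> core (weight 2)
    }
}
```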
### 5. Architecture Evaluation
MorphArch computes an architecture health score across six debt dimensions:
- cycle
- layering
- hub
- coupling
- cognitive
- instability
It also applies:
- explicit boundary rules
- scale-aware expectations
- hotspot and blast radius analysis
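As one example of a debt dimension, instability is commonly defined (following Robert C. Martin) as efferent coupling over total coupling. Whether MorphArch uses this exact formula is an assumption:

```rust
// Classic instability metric: I = Ce / (Ca + Ce), where Ca counts
// incoming (afferent) dependencies and Ce counts outgoing (efferent) ones.
// 0.0 means maximally stable, 1.0 means maximally unstable.
fn instability(afferent: u32, efferent: u32) -> f64 {
    let total = afferent + efferent;
    if total == 0 { 0.0 } else { efferent as f64 / total as f64 }
}

fn main() {
    // A module with 1 incoming and 3 outgoing dependencies mostly depends
    // on others, so its instability is high.
    println!("{}", instability(1, 3)); // 0.75
}
```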
### 6. Semantic Grouping and Clustering
This is what keeps the TUI usable on large repositories.
MorphArch groups the raw graph into clusters using a hybrid approach:
- semantic grouping from names and paths
- structural grouping when naming is weak
- quality passes that split overly generic fallback clusters
- optional collapsing of external dependency families
Users can override semantic families, rules, hard grouping constraints, and
presentation labels through `morpharch.toml`.
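A first semantic pass could be as simple as clustering modules by a shared path prefix. This is only one plausible naming signal, not the actual hybrid algorithm:

```rust
use std::collections::BTreeMap;

// Naive semantic grouping: cluster module paths by their top-level
// directory. Structural grouping and quality passes would then refine
// whatever this kind of name-based signal cannot separate.
fn group_by_prefix(paths: &[&str]) -> BTreeMap<String, Vec<String>> {
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for p in paths {
        let key = p.split('/').next().unwrap_or(p).to_string();
        groups.entry(key).or_default().push(p.to_string());
    }
    groups
}

fn main() {
    let paths = ["ui/map.rs", "ui/inspect.rs", "scan/git.rs"];
    for (cluster, members) in group_by_prefix(&paths) {
        println!("{cluster}: {members:?}");
    }
}
```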
### 7. Persistence and Replay
MorphArch stores scan data in a repo-scoped local cache.
That cache includes:
- snapshot frames for each scanned commit
- checkpoints for efficient reconstruction
- saved scan state for incremental updates
This is what makes repeated scans and timeline replay practical without
starting from scratch every time.
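Checkpoint-plus-delta replay can be sketched with a heavily simplified state, where a single node count stands in for a full graph frame:

```rust
// Reconstruct the state at frame `n` by finding the nearest checkpoint at
// or before `n`, then applying the remaining per-frame deltas. Checkpoints
// are (frame, full_state); deltas[i] is the change from frame i to i + 1.
fn replay(checkpoints: &[(usize, i64)], deltas: &[i64], n: usize) -> i64 {
    let (base_frame, mut state) = checkpoints
        .iter()
        .rev()
        .find(|(f, _)| *f <= n)
        .copied()
        .unwrap_or((0, 0));
    for d in &deltas[base_frame..n] {
        state += d;
    }
    state
}

fn main() {
    // Node-count changes per scanned commit.
    let deltas = [2, 1, -1, 3];
    // Full snapshots at frames 0 and 2.
    let checkpoints = [(0, 0), (2, 3)];
    println!("{}", replay(&checkpoints, &deltas, 4)); // 5
}
```

The point of the checkpoint is that reconstructing frame 4 only replays two deltas instead of the whole history.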
### 8. Presentation Surfaces
The TUI is built from three semantic surfaces.
#### `Map`
Cluster-level repository overview.
- major subsystems
- strongest links
- readable repo shape
#### `Cluster details`
Subsystem detail view.
- diagnosis
- top members or dependencies
- incoming/outgoing link pressure
- selected member or dependency lens
#### `Inspect`
Focused debug view.
- selected member centered
- one-hop inbound/outbound graph context
- raw graph rendering reserved for debugging
This is why MorphArch does not need to keep the full raw graph on screen all
the time.
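The inspect view's one-hop context amounts to filtering the raw edge list around a focus node, roughly:

```rust
// Keep only edges that touch the focus node: its one-hop inbound and
// outbound context. Everything else stays off screen.
fn one_hop<'a>(edges: &[(&'a str, &'a str)], focus: &str) -> Vec<(&'a str, &'a str)> {
    edges
        .iter()
        .copied()
        .filter(|&(f, t)| f == focus || t == focus)
        .collect()
}

fn main() {
    let edges = [
        ("cli", "core"),
        ("core", "store"),
        ("ui", "core"),
        ("ui", "cli"),
    ];
    // Focusing on `core` keeps cli->core, core->store, and ui->core.
    println!("{:?}", one_hop(&edges, "core"));
}
```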
---
## Why Raw Graphs Are Not the Default
Large node-link graphs become noisy quickly, especially in a terminal.
MorphArch avoids this by:
- starting with clusters instead of individual modules
- summarizing link pressure before drawing it
- using text-first cluster views when geometry would be noisy
- keeping raw graph rendering for inspect mode
That tradeoff is deliberate. The goal is to make repository structure easier to
review without removing graph-level debugging when it is needed.
---
## Technical Stack
| Component | Technology |
| --- | --- |
| Runtime | Rust |
| Git engine | `gix` |
| Parsing | fast paths + `tree-sitter` fallback |
| Graph algorithms | `petgraph` |
| TUI | `ratatui` + `crossterm` |
| Persistence | SQLite via `rusqlite` |
---
## Performance Characteristics
MorphArch is optimized for repeated scans and historical navigation.
Important techniques:
- subtree-level skipping for unchanged directories
- blob parse caching
- parallel parsing and graph processing
- repo-scoped SQLite checkpoint + delta storage for fast replay
- saved scan state for incremental updates
That is what makes timeline scrubbing and repeated `watch` sessions responsive.
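Subtree-level skipping can be sketched as a tree-hash comparison: if a directory's tree hash is unchanged since the previous scan, nothing under it changed and the whole subtree is skipped. The hash values and directory keys below are illustrative:

```rust
use std::collections::HashMap;

// Given the previous and current scan's per-directory tree hashes,
// return only the directories whose contents actually changed (or are
// new). Everything else is skipped without touching a single file.
fn dirs_to_rescan(prev: &HashMap<&str, u64>, cur: &HashMap<&str, u64>) -> Vec<String> {
    let mut out: Vec<String> = cur
        .iter()
        .filter(|&(dir, hash)| prev.get(dir) != Some(hash))
        .map(|(dir, _)| dir.to_string())
        .collect();
    out.sort();
    out
}

fn main() {
    let prev = HashMap::from([("src", 1u64), ("ui", 2)]);
    let cur = HashMap::from([("src", 1u64), ("ui", 3), ("docs", 4)]);
    println!("{:?}", dirs_to_rescan(&prev, &cur)); // ["docs", "ui"]
}
```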