Kimün (km)
Kimün means "knowledge" or "wisdom" in Mapudungun, the language of the Mapuche people.
A fast command-line tool for code analysis, written in Rust. Run km score on any project to get an overall health grade (A++ to F--) across six quality dimensions — maintainability, complexity, duplication, indentation depth, Halstead effort, and file size — with a list of the files that need the most attention.
Beyond the aggregate score, Kimün provides 12 specialized commands:
- Static metrics — lines of code by language (cloc-compatible), duplicate detection (Rule of Three), Halstead complexity, cyclomatic complexity, indentation complexity, and two Maintainability Index variants (Visual Studio and verifysoft).
- Git-based analysis — hotspot detection (change frequency × complexity, Thornhill method), code ownership / knowledge maps via `git blame`, and temporal coupling between files that change together.
- AI-powered analysis — optional integration with Claude to run all tools and produce a narrative report.
Installation
From a source checkout, `cargo install --path .` builds and installs the `km` binary (assuming a standard Rust toolchain).
Commands
km loc -- Count lines of code
Run `km loc` to analyze the current directory, or pass a path: `km loc <path>`.
Options:
| Flag | Description |
|---|---|
| `-v, --verbose` | Show summary stats (files read, unique, ignored, elapsed time) |
| `--json` | Output as JSON |
Example output:
────────────────────────────────────────────────────────────────────
Language Files Blank Comment Code
────────────────────────────────────────────────────────────────────
Rust 5 120 45 850
TOML 1 2 0 15
────────────────────────────────────────────────────────────────────
SUM: 6 122 45 865
────────────────────────────────────────────────────────────────────
km dups -- Detect duplicate code
Finds duplicate code blocks across files using a sliding window approach. Applies the Rule of Three: duplicates appearing 3+ times are marked as CRITICAL (refactor recommended), while those appearing twice are TOLERABLE.
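A minimal sketch of the sliding-window idea and the Rule of Three classification, not km's actual implementation (the real detector presumably normalizes code and merges overlapping windows):

```python
from collections import defaultdict

def find_duplicates(files, window=6):
    """Sliding-window duplicate detection sketch: group every
    `window`-line run that appears in more than one place, then apply
    the Rule of Three (3+ occurrences CRITICAL, exactly 2 TOLERABLE)."""
    groups = defaultdict(list)  # normalized block -> [(file, start line)]
    for name, text in files.items():
        lines = [l.strip() for l in text.splitlines()]
        for i in range(len(lines) - window + 1):
            key = "\n".join(lines[i:i + window])
            groups[key].append((name, i + 1))
    return {
        key: ("CRITICAL" if len(locs) >= 3 else "TOLERABLE", locs)
        for key, locs in groups.items()
        if len(locs) >= 2
    }
```

A real tool would also skip blank-only windows and suppress overlapping reports; this only shows the grouping and severity logic.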
Test files and directories are excluded by default, since tests often contain intentional repetition.
Options:
| Flag | Description |
|---|---|
| `-r, --report` | Show detailed report with duplicate locations and code samples |
| `--show-all` | Show all duplicate groups (default: top 20) |
| `--min-lines N` | Minimum lines for a duplicate block (default: 6) |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--json` | Output as JSON |
Example summary output:
────────────────────────────────────────────────────────────────────
Duplication Analysis
Total code lines: 3247
Duplicated lines: 156
Duplication: 4.8%
Duplicate groups: 12
Files with duplicates: 8
Largest duplicate: 18 lines
Rule of Three Analysis:
Critical duplicates (3+): 7 groups, 96 lines
Tolerable duplicates (2x): 5 groups, 60 lines
Assessment: Good
────────────────────────────────────────────────────────────────────
Example detailed output (--report):
────────────────────────────────────────────────────────────────────
[1] CRITICAL: 18 lines, 3 occurrences (36 duplicated lines)
src/parser.rs:45-62
src/formatter.rs:120-137
src/validator.rs:89-106
Sample:
fn process_tokens(input: &str) -> Vec<Token> {
let mut tokens = Vec::new();
for line in input.lines() {
...
────────────────────────────────────────────────────────────────────
[2] TOLERABLE: 12 lines, 2 occurrences (12 duplicated lines)
src/main.rs:100-111
src/cli.rs:200-211
Sample:
match result {
Ok(value) => {
...
────────────────────────────────────────────────────────────────────
Excluded test patterns
By default, km dups skips files matching common test conventions:
- Directories: `tests/`, `test/`, `__tests__/`, `spec/`
- By extension: `*_test.rs`, `*_test.go`, `test_*.py`, `*.test.js`, `*.spec.ts`, `*Test.java`, `*_test.cpp`, and more
Use --include-tests to analyze test files as well.
km indent -- Indentation complexity
Measures indentation-based complexity per file: standard deviation of indentation depths and maximum depth. Higher stddev suggests more complex control flow.
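The measurement can be sketched as follows; the tab width and the stddev flavor (population vs. sample) are assumptions, not documented km behavior:

```python
from statistics import pstdev

def indent_stats(source, tab_width=4):
    """Indentation complexity sketch: depth of each non-blank line
    (leading spaces / tab_width), its population stddev, and the max."""
    depths = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no indentation signal
        expanded = line.expandtabs(tab_width)
        spaces = len(expanded) - len(expanded.lstrip(" "))
        depths.append(spaces // tab_width)
    if not depths:
        return 0.0, 0
    return pstdev(depths), max(depths)
```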
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
km hal -- Halstead complexity metrics
Computes Halstead complexity metrics per file by extracting operators and operands from source code.
Metrics
| Symbol | Metric | Formula | Description |
|---|---|---|---|
| n1 | Distinct operators | -- | Unique operators in the code |
| n2 | Distinct operands | -- | Unique operands in the code |
| N1 | Total operators | -- | Total operator occurrences |
| N2 | Total operands | -- | Total operand occurrences |
| n | Vocabulary | n1 + n2 | Size of the "alphabet" used |
| N | Length | N1 + N2 | Total number of tokens |
| V | Volume | N * log2(n) | Size of the implementation |
| D | Difficulty | (n1/2) * (N2/n2) | Error proneness |
| E | Effort | D * V | Mental effort to develop |
| B | Bugs | V / 3000 | Estimated delivered bugs |
| T | Time | E / 18 | Estimated development time in seconds |
Higher effort, volume, and bugs indicate more complex and error-prone code.
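The derived metrics follow mechanically from the four base counts, per the formulas in the table above:

```python
from math import log2

def halstead(n1, n2, N1, N2):
    """Derived Halstead metrics from distinct/total operator
    and operand counts."""
    n = n1 + n2                 # vocabulary
    N = N1 + N2                 # length
    V = N * log2(n)             # volume
    D = (n1 / 2) * (N2 / n2)    # difficulty
    E = D * V                   # effort
    B = V / 3000                # estimated delivered bugs
    T = E / 18                  # estimated time in seconds
    return {"n": n, "N": N, "V": V, "D": D, "E": E, "B": B, "T": T}
```

For example, `halstead(10, 20, 50, 40)` gives vocabulary 30, length 90, volume ≈ 441.6, and difficulty 10.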
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON (includes all metrics) |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--top N` | Show only the top N files (default: 20) |
| `--sort-by METRIC` | Sort by effort, volume, or bugs (default: effort) |
Example output:
Halstead Complexity Metrics
──────────────────────────────────────────────────────────────────────────────
File n1 n2 N1 N2 Volume Effort Bugs
──────────────────────────────────────────────────────────────────────────────
src/loc/counter.rs 139 116 3130 1169 34367.7 24070888 11.46
src/main.rs 37 43 520 185 4457.0 354743 1.49
──────────────────────────────────────────────────────────────────────────────
Total (2 files) 3650 1354 38824.7 24425631 12.95
Supported languages
Rust, Python, JavaScript, TypeScript, Go, C, C++, C#, Java, Objective-C, PHP, Dart, Ruby, Kotlin, Swift, Shell (Bash/Zsh).
km cycom -- Cyclomatic complexity
Computes cyclomatic complexity per file and per function by counting decision points (if, for, while, match, &&, ||, etc.).
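The counting rule can be sketched with naive token matching (`1 +` the number of decision points). This is illustrative only: km presumably tokenizes properly, since pattern matching alone would count keywords inside strings and comments.

```python
import re

# Decision-point tokens: branch/loop keywords plus short-circuit operators
DECISION = re.compile(r"\b(?:if|for|while|match|case)\b|&&|\|\|")

def cyclomatic(source):
    """Naive cyclomatic complexity sketch: 1 plus the number of
    decision points found in the source text."""
    return 1 + len(DECISION.findall(source))
```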
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--top N` | Show only the top N files (default: 20) |
| `--min-complexity N` | Minimum max-complexity to include a file (default: 1) |
| `--per-function` | Show per-function breakdown |
km mi -- Maintainability Index (Visual Studio variant)
Computes the Maintainability Index per file using the Visual Studio formula. MI is normalized to a 0–100 scale with no comment-weight term.
Formula
MI = MAX(0, (171 - 5.2 * ln(V) - 0.23 * G - 16.2 * ln(LOC)) * 100 / 171)
Where V = Halstead Volume, G = cyclomatic complexity, LOC = code lines.
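The formula translates directly to code:

```python
from math import log

def mi_visual_studio(volume, cyclomatic, loc):
    """Visual Studio Maintainability Index, per the formula above:
    clamped at 0 and rescaled to 0-100."""
    raw = 171 - 5.2 * log(volume) - 0.23 * cyclomatic - 16.2 * log(loc)
    return max(0.0, raw * 100 / 171)
```

Applied to the `src/main.rs` row of the example output below (V = 11189.6, G = 16, LOC = 241), it reproduces MI ≈ 17.5.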
Thresholds
| MI Score | Level | Meaning |
|---|---|---|
| 20–100 | green | Good maintainability |
| 10–19 | yellow | Moderate maintainability |
| 0–9 | red | Low maintainability |
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--top N` | Show only the top N files (default: 20) |
| `--sort-by METRIC` | Sort by mi (ascending), volume, complexity, or loc (default: mi) |
Example output:
Maintainability Index (Visual Studio)
──────────────────────────────────────────────────────────────────────
File Volume Cyclo LOC MI Level
──────────────────────────────────────────────────────────────────────
src/loc/counter.rs 32101.6 115 731 0.0 red
src/main.rs 11189.6 16 241 17.5 yellow
src/loc/report.rs 6257.0 13 185 22.2 green
──────────────────────────────────────────────────────────────────────
Total (3 files) 1157 13.2
km miv -- Maintainability Index (verifysoft variant)
Computes the Maintainability Index per file. MI combines Halstead Volume, Cyclomatic Complexity, lines of code, and comment ratio into a single maintainability score.
This is the verifysoft.com variant, which includes a comment-weight term (MIcw) that rewards well-commented code.
Formula
MIwoc = 171 - 5.2 * ln(V) - 0.23 * G - 16.2 * ln(LOC)
MIcw = 50 * sin(sqrt(2.46 * radians(PerCM)))
MI = MIwoc + MIcw
Where V = Halstead Volume, G = cyclomatic complexity, LOC = code lines, PerCM = comment percentage (converted to radians).
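The three formulas translate directly to code:

```python
from math import log, sin, sqrt, radians

def mi_verifysoft(volume, cyclomatic, loc, comment_pct):
    """Verifysoft MI, per the formulas above: base MIwoc plus a
    comment-weight bonus MIcw derived from the comment percentage."""
    miwoc = 171 - 5.2 * log(volume) - 0.23 * cyclomatic - 16.2 * log(loc)
    micw = 50 * sin(sqrt(2.46 * radians(comment_pct)))
    return miwoc, miwoc + micw
```

Applied to the `src/main.rs` row of the example output below (V = 8686.7, G = 14, LOC = 204, Cmt% = 14.6), it reproduces MIwoc ≈ 34.5.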
Thresholds
| MI Score | Level | Meaning |
|---|---|---|
| 85+ | good | Easy to maintain |
| 65–84 | moderate | Reasonable maintainability |
| <65 | difficult | Hard to maintain |
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--top N` | Show only the top N files (default: 20) |
| `--sort-by METRIC` | Sort by mi (ascending), volume, complexity, or loc (default: mi) |
Example output:
Maintainability Index
────────────────────────────────────────────────────────────────────────────────
File Volume Cyclo LOC Cmt% MIwoc MI Level
────────────────────────────────────────────────────────────────────────────────
src/loc/counter.rs 32101.6 115 731 3.6 -16.2 2.8 difficult
src/main.rs 8686.7 14 204 14.6 34.5 68.2 moderate
src/util.rs 2816.9 18 76 9.5 55.4 84.7 moderate
────────────────────────────────────────────────────────────────────────────────
Total (3 files) 1011 51.9
km hotspots -- Hotspot analysis
Finds hotspots: files that change frequently AND have high complexity. Based on Adam Thornhill's method ("Your Code as a Crime Scene").
Formula
Score = Commits × Complexity
Files with high scores concentrate risk — they are both change-prone and complex, making them the highest-value refactoring targets.
By default, complexity is measured by total indentation (sum of logical indentation levels across all code lines), following Thornhill's original method from "Your Code as a Crime Scene". Use --complexity cycom for cyclomatic complexity instead.
Requires a git repository. Merge commits are excluded from the count.
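The scoring and ranking step is just a multiply-and-sort; a minimal sketch, using `(file, commits, complexity)` tuples as hypothetical input:

```python
def hotspots(files):
    """Hotspot ranking sketch: score = commits x complexity,
    sorted descending, per Thornhill's method."""
    scored = [(name, commits * complexity)
              for name, commits, complexity in files]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Fed the first two rows of the example output below, it reproduces scores 11034 and 10430.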
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--top N` | Show only the top N files (default: 20) |
| `--sort-by METRIC` | Sort by score, commits, or complexity (default: score) |
| `--since DURATION` | Only consider commits since this time (e.g. 30d, 6m, 1y) |
| `--complexity METRIC` | indent (default, Thornhill) or cycom (cyclomatic) |
Duration units: d (days), m (months, approx. 30 days), y (years, approx. 365 days).
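Interpreting a `--since` value under those unit approximations amounts to:

```python
def parse_duration_days(spec):
    """Duration parsing sketch for --since values:
    d = days, m = approx. 30 days, y = approx. 365 days."""
    units = {"d": 1, "m": 30, "y": 365}
    value, unit = int(spec[:-1]), spec[-1]
    return value * units[unit]
```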
Example output (default — indentation complexity):
Hotspots (Commits × Total Indent Complexity)
──────────────────────────────────────────────────────────────────────────────
File Language Commits Total Indent Score
──────────────────────────────────────────────────────────────────────────────
src/main.rs Rust 18 613 11034
src/loc/counter.rs Rust 7 1490 10430
src/dups/detector.rs Rust 7 1288 9016
src/dups/mod.rs Rust 9 603 5427
src/report/mod.rs Rust 4 998 3992
──────────────────────────────────────────────────────────────────────────────
Score = Commits × Total Indentation (Thornhill method).
High-score files are change-prone and complex — prime refactoring targets.
Example output (--complexity cycom):
Hotspots (Commits × Cyclomatic Complexity)
──────────────────────────────────────────────────────────────────────────────
File Language Commits Cyclomatic Score
──────────────────────────────────────────────────────────────────────────────
src/loc/counter.rs Rust 7 115 805
src/dups/mod.rs Rust 9 44 396
src/main.rs Rust 18 21 378
src/cycom/analyzer.rs Rust 4 92 368
src/dups/detector.rs Rust 7 46 322
──────────────────────────────────────────────────────────────────────────────
Score = Commits × Cyclomatic Complexity.
High-score files are change-prone and complex — prime refactoring targets.
km knowledge -- Code ownership analysis
Analyzes code ownership patterns via git blame (knowledge maps). Based on Adam Thornhill's method ("Your Code as a Crime Scene" chapters 8-9).
Identifies bus factor risk and knowledge concentration per file. Generated files (lock files, minified JS, etc.) are automatically excluded.
Risk levels
| Risk | Condition | Meaning |
|---|---|---|
| CRITICAL | 1 person owns >80% | High bus factor risk |
| HIGH | 1 person owns 60-80% | Significant concentration |
| MEDIUM | 2-3 people own >80% combined | Moderate concentration |
| LOW | Well-distributed | Healthy ownership |
Knowledge loss detection
Use --since to define "recent activity". If the primary owner of a file has no commits in that period, the file is flagged with knowledge loss risk. Use --risk-only to show only those files.
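The risk table above can be sketched as a classifier over per-author ownership fractions; the exact boundary handling (e.g. exactly 80%) is an assumption:

```python
def ownership_risk(shares):
    """Risk classification sketch from the table above.
    `shares` maps author -> fraction of lines owned (summing to ~1.0)."""
    top = sorted(shares.values(), reverse=True)
    if top[0] > 0.8:
        return "CRITICAL"              # one person owns >80%
    if top[0] >= 0.6:
        return "HIGH"                  # one person owns 60-80%
    if len(top) >= 2 and sum(top[:3]) > 0.8:
        return "MEDIUM"                # 2-3 people own >80% combined
    return "LOW"                       # well-distributed
```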
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--top N` | Show only the top N files (default: 20) |
| `--sort-by METRIC` | Sort by concentration, diffusion, or risk (default: concentration) |
| `--since DURATION` | Define recent activity window for knowledge loss (e.g. 6m, 1y, 30d) |
| `--risk-only` | Show only files with knowledge loss risk |
Example output:
Knowledge Map — Code Ownership
──────────────────────────────────────────────────────────────────────────────
File Language Lines Owner Own% Contrib Risk
──────────────────────────────────────────────────────────────────────────────
src/loc/counter.rs Rust 731 E. Diaz 94% 2 CRITICAL
src/main.rs Rust 241 E. Diaz 78% 3 HIGH
src/walk.rs Rust 145 E. Diaz 55% 5 MEDIUM
──────────────────────────────────────────────────────────────────────────────
Files with knowledge loss risk (primary owner inactive): 1
src/legacy.rs (Former Dev)
km tc -- Temporal coupling analysis
Analyzes temporal coupling between files via git history. Based on Adam Thornhill's method ("Your Code as a Crime Scene" ch. 7): files that frequently change together in the same commits have implicit coupling, even without direct imports.
Formula
Coupling strength = shared_commits / min(commits_a, commits_b)
Coupling levels
| Strength | Level | Meaning |
|---|---|---|
| >= 0.5 | STRONG | Files change together most of the time |
| 0.3-0.5 | MODERATE | Noticeable co-change pattern |
| < 0.3 | WEAK | Occasional co-changes |
High coupling between unrelated modules suggests hidden dependencies or architectural issues — consider extracting shared abstractions.
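The formula and thresholds above amount to the following (treating the 0.3 boundary as inclusive for MODERATE, which is an assumption):

```python
def coupling(shared, commits_a, commits_b):
    """Temporal coupling sketch: strength = shared commits divided by
    the smaller of the two files' commit counts, plus the level label."""
    strength = shared / min(commits_a, commits_b)
    if strength >= 0.5:
        level = "STRONG"
    elif strength >= 0.3:
        level = "MODERATE"
    else:
        level = "WEAK"
    return strength, level
```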
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--top N` | Show only the top N file pairs (default: 20) |
| `--sort-by METRIC` | Sort by strength or shared (default: strength) |
| `--since DURATION` | Only consider commits since this time (e.g. 6m, 1y, 30d) |
| `--min-degree N` | Minimum commits per file to be included (default: 3) |
| `--min-strength F` | Minimum coupling strength to show (e.g. 0.5 for strong only) |
Example output:
Temporal Coupling — Files That Change Together
──────────────────────────────────────────────────────────────────────────────────
File A File B Shared Strength Level
──────────────────────────────────────────────────────────────────────────────────
src/auth/jwt.rs src/auth/middleware.rs 12 0.86 STRONG
lib/parser.rs lib/validator.rs 8 0.53 STRONG
config/db.yaml config/cache.yaml 6 0.35 MODERATE
──────────────────────────────────────────────────────────────────────────────────
12 coupled pairs found (3 shown). Showing pairs with >= 3 shared commits.
Strong coupling (>= 0.5) suggests hidden dependencies — consider extracting shared abstractions.
Note: File renames are not tracked across git history. Renamed files appear as separate entries.
km score -- Code health score
Computes an overall code health score for the project, grading it from A++ (exceptional) to F-- (severe issues). Analyzes 6 dimensions of code quality using only static metrics (no git required).
Non-code files (Markdown, TOML, JSON, etc.) are automatically excluded. Inline test blocks (#[cfg(test)]) are excluded from duplication analysis.
Dimensions and weights
| Dimension | Weight | What it measures |
|---|---|---|
| Maintainability Index | 30% | Verifysoft MI, normalized to 0-100 |
| Cyclomatic Complexity | 20% | Max complexity per file |
| Duplication | 15% | Project-wide duplicate code % |
| Indentation Complexity | 15% | Stddev of indentation depth |
| Halstead Effort | 15% | Mental effort per LOC |
| File Size | 5% | Optimal range 50-300 LOC |
Each dimension is aggregated as a LOC-weighted mean across all files (except Duplication which is a single project-level value). The project score is the weighted sum of all dimension scores.
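The two-level aggregation can be sketched as follows; the dimension names and the per-file `(score, loc)` inputs are hypothetical, and the single-value Duplication dimension would bypass the weighted mean:

```python
def project_score(dimension_file_scores, weights):
    """Score aggregation sketch: each dimension's project value is a
    LOC-weighted mean over its per-file (score, loc) pairs; the final
    score is the weighted sum across dimensions."""
    total = 0.0
    for dim, pairs in dimension_file_scores.items():
        loc = sum(l for _, l in pairs)
        mean = sum(s * l for s, l in pairs) / loc
        total += weights[dim] * mean
    return total
```

For example, a single dimension with files scoring 90 (100 LOC) and 60 (300 LOC) averages to 67.5, not 75: bigger files pull harder.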
Grade scale
| Grade | Score range | Grade | Score range |
|---|---|---|---|
| A++ | 97-100 | C+ | 73-76 |
| A+ | 93-96 | C | 70-72 |
| A | 90-92 | C- | 67-69 |
| A- | 87-89 | D+ | 63-66 |
| B+ | 83-86 | D | 60-62 |
| B | 80-82 | D- | 57-59 |
| B- | 77-79 | F | 40-56 |
| | | F-- | 0-39 |
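The grade scale reduces to a first-threshold-met lookup over the ranges in the table above:

```python
# (lower bound, grade) pairs from the grade scale table, highest first
GRADES = [(97, "A++"), (93, "A+"), (90, "A"), (87, "A-"),
          (83, "B+"), (80, "B"), (77, "B-"), (73, "C+"),
          (70, "C"), (67, "C-"), (63, "D+"), (60, "D"),
          (57, "D-"), (40, "F"), (0, "F--")]

def grade(score):
    """Return the first grade whose lower bound the score meets."""
    return next(g for cutoff, g in GRADES if score >= cutoff)
```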
Options:
| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--include-tests` | Include test files in analysis (excluded by default) |
| `--bottom N` | Number of worst files to show in "needs attention" (default: 10) |
| `--min-lines N` | Minimum lines for a duplicate block (default: 6) |
Example output:
Code Health Score
──────────────────────────────────────────────────────────────────
Project Score: B+ (84.3)
Files Analyzed: 42
Total LOC: 8,432
──────────────────────────────────────────────────────────────────
Dimension Weight Score Grade
──────────────────────────────────────────────────────────────────
Maintainability Index 30% 88.2 A-
Cyclomatic Complexity 20% 82.4 B+
Duplication 15% 91.3 A
Indentation Complexity 15% 79.8 B-
Halstead Effort 15% 85.1 B+
File Size 5% 89.2 A-
──────────────────────────────────────────────────────────────────
Files Needing Attention (worst scores)
──────────────────────────────────────────────────────────────────
Score Grade File Issues
──────────────────────────────────────────────────────────────────
54.2 F src/legacy/parser.rs Complexity: 87, MI: 12
63.7 D+ src/utils/helpers.rs MI: 42, Indent: 2.4
68.9 C- src/core/engine.rs Size: 1243 LOC
──────────────────────────────────────────────────────────────────
Features
- Respects `.gitignore` rules automatically
- Deduplicates files by content hash (identical files counted once)
- Detects languages by file extension, filename, or shebang line
- Supports nested block comments (Rust, Haskell, OCaml, etc.)
- Handles pragmas (e.g., Haskell `{-# LANGUAGE ... #-}`) as code
- Mixed lines (code + comment) are counted as code, matching `cloc` behavior
Supported Languages
| Language | Extensions / Filenames |
|---|---|
| Bourne Again Shell | .bash |
| Bourne Shell | .sh |
| C | .c, .h |
| C# | .cs |
| C++ | .cpp, .cxx, .cc, .hpp, .hxx |
| Clojure | .clj, .cljs, .cljc, .edn |
| CSS | .css |
| Dart | .dart |
| Dockerfile | Dockerfile |
| DOS Batch | .bat, .cmd |
| Elixir | .ex |
| Elixir Script | .exs |
| Erlang | .erl, .hrl |
| F# | .fs, .fsi, .fsx |
| Go | .go |
| Gradle | .gradle |
| Groovy | .groovy |
| Haskell | .hs |
| HTML | .html, .htm |
| Java | .java |
| JavaScript | .js, .mjs, .cjs |
| JSON | .json |
| Julia | .jl |
| Kotlin | .kt, .kts |
| Lua | .lua |
| Makefile | .mk, Makefile, makefile, GNUmakefile |
| Markdown | .md, .markdown |
| Nim | .nim |
| Objective-C | .m, .mm |
| OCaml | .ml, .mli |
| Perl | .pl, .pm |
| PHP | .php |
| Properties | .properties |
| Python | .py, .pyi |
| R | .r, .R |
| Ruby | .rb, Rakefile, Gemfile |
| Rust | .rs |
| Scala | .scala, .sc, .sbt |
| SQL | .sql |
| Swift | .swift |
| Terraform | .tf |
| Text | .txt |
| TOML | .toml |
| TypeScript | .ts, .mts, .cts |
| XML | .xml, .xsl, .xslt, .svg, .fsproj, .csproj, .vbproj, .vcxproj, .sln, .plist, .xaml |
| YAML | .yaml, .yml |
| Zig | .zig |
| Zsh | .zsh |
Development
References
The metrics and methodologies implemented in Kimün are based on the following sources:
Books
- Adam Thornhill, Your Code as a Crime Scene (Pragmatic Bookshelf, 2015). Basis for hotspot analysis (ch. 4–5), temporal coupling (ch. 7), knowledge maps / code ownership (ch. 8–9), and indentation-based complexity as a proxy for code quality.
- Adam Thornhill, Software Design X-Rays (Pragmatic Bookshelf, 2018). Extends the crime scene metaphor with additional behavioral code analysis techniques.
Papers and standards
- Maurice H. Halstead, Elements of Software Science (Elsevier, 1977). Defines the operator/operand metrics: vocabulary, volume, difficulty, effort, estimated bugs, and development time.
- Thomas J. McCabe, "A Complexity Measure", IEEE Transactions on Software Engineering, SE-2(4), December 1976, pp. 308–320. Introduces cyclomatic complexity as a measure of independent paths through a program's control flow graph.
- Paul Oman & Jack Hagemeister, "Metrics for Assessing a Software System's Maintainability", Proceedings of the International Conference on Software Maintenance (ICSM), 1992. Original Maintainability Index formula combining Halstead Volume, cyclomatic complexity, and lines of code.
- Microsoft, Code Metrics — Maintainability Index range and meaning. Visual Studio variant: normalized to 0–100 scale, no comment-weight term.
- Verifysoft, Maintainability Index. Extended MI formula with a comment-weight component (MIcw) that rewards well-commented code.
License
See Cargo.toml for package details.