Crate arborist

§Arborist

Multi-language code complexity metrics powered by tree-sitter.

Arborist computes cognitive complexity (SonarSource), cyclomatic complexity (McCabe), and source lines of code (SLOC) for functions and methods across 12 programming languages – all from a single, embeddable Rust library.

§Supported Languages

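| Language | Feature flag | Extensions | Default |
|---|---|---|---|
| Rust | rust | .rs | Yes |
| Python | python | .py, .pyi | Yes |
| JavaScript | javascript | .js, .jsx, .mjs, .cjs | Yes |
| TypeScript | typescript | .ts, .tsx, .mts, .cts | Yes |
| Java | java | .java | Yes |
| Go | go | .go | Yes |
| C# | csharp | .cs | Opt-in |
| C++ | cpp | .cpp, .cc, .cxx, .hpp, .hxx, .hh | Opt-in |
| C | c | .c, .h | Opt-in |
| PHP | php | .php | Opt-in |
| Kotlin | kotlin | .kt, .kts | Opt-in |
| Swift | swift | .swift | Opt-in |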
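
The extension list above can be read as a lookup from file extension to language. The following is a minimal sketch of such a lookup for the six default languages only. The `Lang` enum and the `detect_language` function are hypothetical names invented for this illustration; they are not Arborist's API, and the crate's real detection logic may differ (for example in how it treats letter case).

```rust
use std::path::Path;

/// Hypothetical stand-in for a language enum (default-tier languages only).
/// This is NOT the crate's `Language` type.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Lang {
    Rust,
    Python,
    JavaScript,
    TypeScript,
    Java,
    Go,
}

/// Map a file path to a language using the extensions listed above.
/// Returns `None` when the extension is missing or not recognized.
/// Assumption: matching is case-sensitive.
fn detect_language(path: &str) -> Option<Lang> {
    let ext = Path::new(path).extension()?.to_str()?;
    match ext {
        "rs" => Some(Lang::Rust),
        "py" | "pyi" => Some(Lang::Python),
        "js" | "jsx" | "mjs" | "cjs" => Some(Lang::JavaScript),
        "ts" | "tsx" | "mts" | "cts" => Some(Lang::TypeScript),
        "java" => Some(Lang::Java),
        "go" => Some(Lang::Go),
        _ => None,
    }
}
```

Under these assumptions, `detect_language("stubs/api.pyi")` yields `Some(Lang::Python)`, while a path with no extension yields `None`.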

§Feature Flags

Arborist uses Cargo feature flags to control which tree-sitter grammars are compiled. Each language is an independent, optional feature that pulls in its corresponding tree-sitter grammar crate.

To enable all 12 languages, use features = ["all"].

§Tiers

  • Tier 1 (default): Rust, Python, JavaScript, TypeScript, Java, Go – the most mature tree-sitter grammars. Enabled automatically unless you set default-features = false.
  • Tier 2 (opt-in): C#, C++, C, PHP, Kotlin, Swift – require enabling their feature flag explicitly or using the all composite feature.

§Individual features

| Feature flag | Language | Tier | tree-sitter crate |
|---|---|---|---|
| rust | Rust | 1 | tree-sitter-rust |
| python | Python | 1 | tree-sitter-python |
| javascript | JavaScript | 1 | tree-sitter-javascript |
| typescript | TypeScript | 1 | tree-sitter-typescript |
| java | Java | 1 | tree-sitter-java |
| go | Go | 1 | tree-sitter-go |
| csharp | C# | 2 | tree-sitter-c-sharp |
| cpp | C++ | 2 | tree-sitter-cpp |
| c | C | 2 | tree-sitter-c |
| php | PHP | 2 | tree-sitter-php |
| kotlin | Kotlin | 2 | tree-sitter-kotlin-ng |
| swift | Swift | 2 | tree-sitter-swift |

§Composite features

| Feature | Expands to |
|---|---|
| default | rust, python, javascript, typescript, java, go |
| all | All 12 languages (default + csharp, cpp, c, php, kotlin, swift) |

Compile-time note: Each grammar adds compile time and binary size. Use default-features = false with only the languages you need for minimal builds, or all when you need broad language coverage.

§Installation

Add to your Cargo.toml:

# Default features (Tier 1): Rust, Python, JavaScript, TypeScript, Java, Go
[dependencies]
arborist-metrics = "0.1"

Select specific languages to reduce compile time:

[dependencies]
arborist-metrics = { version = "0.1", default-features = false, features = ["rust", "python"] }

Enable all 12 languages:

[dependencies]
arborist-metrics = { version = "0.1", features = ["all"] }

§Quick Start

§Analyze a file

use arborist::{analyze_file, FileReport};

fn main() -> Result<(), arborist::ArboristError> {
    let report: FileReport = analyze_file("src/main.rs")?;

    println!("File: {} ({:?})", report.path, report.language);
    println!("Total cognitive: {}, SLOC: {}", report.file_cognitive, report.file_sloc);

    for func in &report.functions {
        println!("  {} (lines {}-{}): cognitive={}, cyclomatic={}, sloc={}",
            func.name, func.start_line, func.end_line,
            func.cognitive, func.cyclomatic, func.sloc);
    }

    Ok(())
}

§Analyze source code from memory

use arborist::{analyze_source, Language};

let source = r#"
def hello(name):
    if name:
        print(f"Hello, {name}!")
    else:
        print("Hello, world!")
"#;

let report = analyze_source(source, Language::Python)?;
// report.functions[0].cognitive == 2 (if + else)

§Configure thresholds

use arborist::{analyze_file_with_config, AnalysisConfig};

let config = AnalysisConfig {
    cognitive_threshold: Some(8),
    ..Default::default()
};

let report = analyze_file_with_config("src/complex.rs", &config)?;

for func in &report.functions {
    if func.exceeds_threshold == Some(true) {
        eprintln!("WARNING: {} has cognitive complexity {} (threshold: 8)",
            func.name, func.cognitive);
    }
}

§Serialize to JSON

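fn main() -> Result<(), Box<dyn std::error::Error>> {
    let report = arborist::analyze_file("src/main.rs")?;
    let json = serde_json::to_string_pretty(&report)?; // a second error type, hence the boxed error
    println!("{}", json);
    Ok(())
}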

§Metrics

§Cognitive Complexity

Follows the SonarSource specification by G. Ann Campbell. Measures how difficult code is to understand:

  • +1 for each control flow break (if, for, while, match, catch, etc.)
  • Nesting penalty: nested control flow adds the current nesting depth
  • Boolean operator sequences: +1 for each run of the same operator, so a switch from && to || starts a new increment
  • Flat else if: does not increase nesting
  • +1 for recursive calls and lambda/closure nesting
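
To make the rules above concrete, here is a small worked example with the increments annotated by hand. The functions are invented for this illustration, and the totals are counted manually from the rules as stated; they have not been checked against Arborist's output.

```rust
/// Hand-counted cognitive complexity: 6.
fn count_positive_evens(values: &[i32]) -> usize {
    let mut count = 0;
    for v in values {            // +1 (loop, nesting depth 0)
        if *v > 0 {              // +2 (+1 for `if`, +1 for nesting depth 1)
            if *v % 2 == 0 {     // +3 (+1 for `if`, +2 for nesting depth 2)
                count += 1;
            }
        }
    }
    count
}

/// Hand-counted cognitive complexity: 3.
fn is_valid(a: bool, b: bool, c: bool) -> bool {
    // +1 for the `if`, +1 for the `&&` run, +1 for the switch to `||`
    if a && b || c {
        return true;
    }
    false
}
```

The first function shows why nesting dominates the score: the same three constructs written without nesting would total 3 rather than 6.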

§Cyclomatic Complexity

Standard McCabe cyclomatic complexity (base 1 + decision points). Measures the number of linearly independent paths through a function.
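
As an illustration of "base 1 + decision points", the sketch below annotates each decision point in an invented function. The count is done by hand under the standard McCabe rule; how Arborist treats constructs not shown here (such as boolean operators or match arms) is not specified in this section.

```rust
/// Hand-counted cyclomatic complexity: 1 (base) + 4 decision points = 5.
fn sum_capped(values: &[i32], cap: i32) -> i32 {
    let mut total = 0;
    for &v in values {          // decision point 1: loop
        if v < 0 {              // decision point 2
            continue;
        } else if v > cap {     // decision point 3
            total += cap;
        } else {                // plain `else` adds no decision point
            total += v;
        }
    }
    while total > 100 {         // decision point 4
        total -= 100;
    }
    total
}
```

Unlike cognitive complexity, the count here is unaffected by how deeply the branches are nested.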

§SLOC

Physical source lines of code, excluding blank lines and comment-only lines.
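
The definition above can be approximated with a simple line filter. This is a simplified sketch, not Arborist's implementation: `count_sloc` is a hypothetical helper that recognizes only `//` line comments and ignores block comments, whereas the library works from tree-sitter parse trees.

```rust
/// Count lines that are neither blank nor comment-only.
/// Simplification: only `//` line comments are recognized;
/// block comments (`/* ... */`) are treated as code.
fn count_sloc(source: &str) -> usize {
    source
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with("//"))
        .count()
}
```

A line that contains code followed by a trailing comment still counts, because it is not comment-only.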

§Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Follow TDD: write fixtures and failing tests before implementation
  4. Run cargo clippy -- -D warnings and cargo test --all-features
  5. Submit a pull request

§Adding a new language

  1. Create src/languages/<lang>.rs implementing the LanguageProfile trait
  2. Add the grammar crate as an optional dependency in Cargo.toml
  3. Add a feature flag and wire up detection in src/languages/mod.rs
  4. Create 6 test fixtures in tests/fixtures/<lang>/
  5. Write integration tests

§Built with AI · Powered by StrayMark

Arborist is an experiment in disciplined AI-assisted development. The implementation — tree-sitter integration, the 177-test suite, all 12 language profiles — was authored largely by AI agents under human direction.

To make that velocity sustainable, we use StrayMark: a CLI for cognitive discipline in AI-assisted engineering. It turned every architectural choice into an AIDEC record, every implementation block into an AILOG, and the test plan into a TES — all under .straymark/, append-only and audit-ready. The governance artifacts emerged alongside the code, not as homework after.

StrayMark is built by Strange Days Tech — the same team behind Arborist. It is the tool we made to solve our own problem.

§License

Licensed under either of:

  • MIT license
  • Apache License, Version 2.0

at your option.


Arborist is © 2026 Strange Days Tech S.A.S. de C.V. — original author and intellectual-property holder of the source code.

The library is released under the dual MIT / Apache-2.0 license above; this notice records authorship and does not modify those license terms. Each source file carries an SPDX header reflecting the same.

Built by Strange Days Tech — México.

§Re-exports

pub use error::ArboristError;
pub use languages::LanguageProfile;
pub use types::AnalysisConfig;
pub use types::FileReport;
pub use types::FunctionMetrics;
pub use types::Language;

§Modules

error
languages
Language profiles for tree-sitter grammars.
metrics
types
walker

§Functions

analyze_file
Analyze a source file, auto-detecting language from its extension.
analyze_file_with_config
Analyze a source file with custom configuration.
analyze_source
Analyze source code provided as a string, with explicit language.
analyze_source_with_config
Analyze source code with explicit language and custom configuration.