code-analyze-mcp 0.1.7

MCP server for code structure analysis using tree-sitter

Standalone MCP server for code structure analysis using tree-sitter.

Note: Native agent tools (regex search, path matching, file reading) handle targeted lookups well. code-analyze-mcp handles the mechanical, non-AI work: mapping directory structure, extracting symbols, and tracing call graphs. Offloading this work to a dedicated tool reduces token usage, shortens task time, and improves accuracy.

Benchmarks

Benchmarked with Claude Code on a Django auth migration task (4 conditions, 8 scored runs) against the Django (Python) source tree. See the full methodology for details.

| Mode | Sonnet 4.6 | Haiku 4.5 |
|---|---|---|
| MCP | 150k tokens / 1.2m | 403k tokens / 0.8m |
| Native | 284k tokens / 2.3m | 478k tokens / 1.3m |
| Savings | 47% fewer tokens, 2x faster | 16% fewer tokens, 40% faster |
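The Savings row can be recomputed from the two rows above it. The sketch below is only a sanity check on the published figures, assuming the second number in each cell is wall-clock time (the ratios do not depend on the unit). The table's speed figures appear to be rounded, so the computed values land close to them rather than exactly on them.

```python
# Figures copied from the benchmark table: (tokens in thousands, elapsed time).
RESULTS = {
    "Sonnet 4.6": {"mcp": (150, 1.2), "native": (284, 2.3)},
    "Haiku 4.5": {"mcp": (403, 0.8), "native": (478, 1.3)},
}


def savings(mcp, native):
    """Return (percent fewer tokens, percent less time, speedup factor)."""
    mcp_tokens, mcp_time = mcp
    native_tokens, native_time = native
    token_saving = 100 * (native_tokens - mcp_tokens) / native_tokens
    time_saving = 100 * (native_time - mcp_time) / native_time
    speedup = native_time / mcp_time
    return token_saving, time_saving, speedup


for model, runs in RESULTS.items():
    tokens, time_pct, speedup = savings(runs["mcp"], runs["native"])
    print(f"{model}: {tokens:.0f}% fewer tokens, "
          f"{time_pct:.0f}% less time, {speedup:.1f}x speedup")
```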

Overview

code-analyze-mcp is a Model Context Protocol server that gives AI agents precise structural context about a codebase: directory trees, symbol definitions, and call graphs, without reading raw files. It supports Rust, Python, Go, Java, TypeScript, TSX, and Fortran, and integrates with any MCP-compatible orchestrator (Claude Code, Kiro, Fast-Agent, MCP-Agent, and others).

Installation

Homebrew (macOS and Linux)

brew install clouatre-labs/tap/code-analyze-mcp

Update: brew upgrade code-analyze-mcp

cargo-binstall (no Rust required)

cargo binstall code-analyze-mcp

cargo install (requires Rust toolchain)

cargo install code-analyze-mcp

Quick Start

Build from source

cargo build --release

The binary is at target/release/code-analyze-mcp.

Configure MCP Client

After installation via brew or cargo, register with the Claude Code CLI:

claude mcp add --transport stdio code-analyze -- code-analyze-mcp

If you built from source, use the binary path directly:

claude mcp add --transport stdio code-analyze -- /path/to/repo/target/release/code-analyze-mcp

stdio is intentional: this server runs locally and processes files directly on disk. The low-latency, zero-network-overhead transport matches the use case. Streamable HTTP adds a network hop with no benefit for a local tool.

Or add manually to .mcp.json at your project root (shared with your team via version control):

{
  "mcpServers": {
    "code-analyze": {
      "command": "code-analyze-mcp",
      "args": []
    }
  }
}
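If the project already has a .mcp.json with other servers, the entry can be merged in rather than overwriting the file. This is a minimal sketch, not part of code-analyze-mcp; the helper name register_server is hypothetical, and it assumes the file follows the mcpServers layout shown above.

```python
import json
from pathlib import Path


def register_server(config_path, name="code-analyze", command="code-analyze-mcp"):
    """Add or update one entry under "mcpServers", keeping any other entries."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": []}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling register_server(".mcp.json") from the project root produces the same entry as the manual configuration above.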

Tools

All optional parameters may be omitted. Shared optional parameters for analyze_directory, analyze_file, and analyze_symbol (analyze_module does not support these):

| Parameter | Type | Default | Description |
|---|---|---|---|
| summary | boolean | auto | Compact output; auto-triggers above 50K chars |
| cursor | string | -- | Pagination cursor from a previous response's next_cursor |
| page_size | integer | 100 | Items per page |
| force | boolean | false | Bypass output size warning |
| verbose | boolean | false | true = full output with section headers and imports (Markdown-style headers in analyze_directory; adds I: section in analyze_file); false = compact format |

summary=true and cursor are mutually exclusive. Passing both returns an error.
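A client can catch the conflict before sending the request. The helper below is an illustrative sketch: build_arguments is a hypothetical name, and it only mirrors the rule stated above (the server remains the source of truth). Unset optional parameters are omitted so the server's defaults apply.

```python
def build_arguments(path, summary=None, cursor=None, page_size=None):
    """Assemble tool arguments, leaving out optional parameters that are unset."""
    if summary is True and cursor is not None:
        # Mirrors the documented server rule; checked locally to fail fast.
        raise ValueError("summary=true and cursor are mutually exclusive")
    arguments = {"path": path}
    if summary is not None:
        arguments["summary"] = summary
    if cursor is not None:
        arguments["cursor"] = cursor
    if page_size is not None:
        arguments["page_size"] = page_size
    return arguments
```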

analyze_directory

Walks a directory tree and counts lines of code, functions, and classes per file. Respects .gitignore rules. Default output is a flat PAGINATED list. Pass verbose=true for FILES / TEST FILES section headers. Pass summary=true for a compact STRUCTURE tree with aggregate counts.

Required: path (string) -- directory to analyze

Additional optional: max_depth (integer, default unlimited) -- recursion limit; use 2-3 for large monorepos

Example output (default):

PAGINATED: showing 16 of 16 files (max_depth=1)

analyze.rs [737L, 13F, 4C]
cache.rs [105L, 5F, 2C]
completion.rs [129L, 2F]
formatter.rs [1876L, 32F, 2C]
graph.rs [926L, 34F, 3C]
lang.rs [41L, 3F]
lib.rs [1335L, 22F, 1C]
logging.rs [136L, 11F, 3C]
main.rs [50L, 1F]
metrics.rs [254L, 13F, 3C]
pagination.rs [198L, 11F, 4C]
parser.rs [990L, 19F, 4C]
schema_helpers.rs [56L, 4F]
traversal.rs [90L, 1F, 2C]
types.rs [575L, 8F, 27C]

test_detection.rs [100L, 5F]

Example output (verbose=true):

PAGINATED: showing 16 of 16 files (max_depth=1)

FILES [LOC, FUNCTIONS, CLASSES]
analyze.rs [737L, 13F, 4C]
cache.rs [105L, 5F, 2C]
completion.rs [129L, 2F]
formatter.rs [1876L, 32F, 2C]
graph.rs [926L, 34F, 3C]
lang.rs [41L, 3F]
lib.rs [1335L, 22F, 1C]
logging.rs [136L, 11F, 3C]
main.rs [50L, 1F]
metrics.rs [254L, 13F, 3C]
pagination.rs [198L, 11F, 4C]
parser.rs [990L, 19F, 4C]
schema_helpers.rs [56L, 4F]
traversal.rs [90L, 1F, 2C]
types.rs [575L, 8F, 27C]

TEST FILES [LOC, FUNCTIONS, CLASSES]
test_detection.rs [100L, 5F]

Example output (summary=true):

SUMMARY:
16 files (15 prod, 1 test), 7598L, 184F, 55C (max_depth=1)
Languages: rust (100%)

STRUCTURE (depth 1):
  analyze.rs [737L, 13F, 4C]
  cache.rs [105L, 5F, 2C]
  completion.rs [129L, 2F]
  formatter.rs [1876L, 32F, 2C]
  graph.rs [926L, 34F, 3C]
  lang.rs [41L, 3F]
  languages/
  lib.rs [1335L, 22F, 1C]
  logging.rs [136L, 11F, 3C]
  main.rs [50L, 1F]
  metrics.rs [254L, 13F, 3C]
  pagination.rs [198L, 11F, 4C]
  parser.rs [990L, 19F, 4C]
  schema_helpers.rs [56L, 4F]
  test_detection.rs [100L, 5F]
  traversal.rs [90L, 1F, 2C]
  types.rs [575L, 8F, 27C]

SUGGESTION:
Use a narrower path for details (e.g., analyze src/core/)

Example calls:

analyze_directory path: /path/to/project
analyze_directory path: /path/to/project max_depth: 2
analyze_directory path: /path/to/project summary: true
analyze_directory path: /path/to/project verbose: true
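The per-file lines in the example outputs follow a regular shape, name [NL, NF, NC], with the class count omitted when it is zero. If a script needs the counts, they can be pulled out with a small parser. This is a sketch inferred from the examples above, not a documented output contract; parse_entry is a hypothetical helper.

```python
import re

# Matches lines such as "analyze.rs [737L, 13F, 4C]" (optionally indented).
ENTRY = re.compile(r"^\s*(?P<name>\S+) \[(?P<counts>[^\]]+)\]\s*$")
FIELDS = {"L": "lines", "F": "functions", "C": "classes"}


def parse_entry(line):
    """Parse one file entry into a dict; return None for any other line."""
    match = ENTRY.match(line)
    if match is None:
        return None
    entry = {"name": match.group("name"), "lines": 0, "functions": 0, "classes": 0}
    for item in match.group("counts").split(","):
        item = item.strip()
        # Each item must be digits followed by L, F, or C; headers fail here.
        if len(item) < 2 or item[-1] not in FIELDS or not item[:-1].isdigit():
            return None
        entry[FIELDS[item[-1]]] = int(item[:-1])
    return entry


SAMPLE = """\
analyze.rs [737L, 13F, 4C]
cache.rs [105L, 5F, 2C]
completion.rs [129L, 2F]
"""
entries = [e for e in map(parse_entry, SAMPLE.splitlines()) if e]
print(sum(e["lines"] for e in entries))  # 737 + 105 + 129 = 971
```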

analyze_file

Extracts functions, classes, and imports from a single file.

Required: path (string) -- file to analyze

Additional optional: ast_recursion_limit (integer, default 256) -- tree-sitter recursion cap for stack safety

Example output (default, page 1 of 2):

FILE: src/lib.rs (1335L, 1-10/22F, 1C, 66I)
C:
  CodeAnalyzer:143
F:
  summary_cursor_conflict:65, error_meta:69, err_to_tool_result:81,
  no_cache_meta:85, paginate_focus_chains:96, list_tools:154, new:158,
  emit_progress:175, handle_overview_mode:202, handle_file_details_mode:303

NEXT_CURSOR: eyJtb2RlIjoiZGVmYXVsdCIsIm9mZnNldCI6MTB9

Example output (verbose=true, adds I: section before F:):

FILE: src/lib.rs (1335L, 1-10/22F, 1C, 66I)
C:
  CodeAnalyzer:143
I:
  cache(1)
  crate::pagination(2)
  crate::types(3)
  formatter(6)
  pagination(6)
  rmcp(6)
  rmcp::model(19)
  types(5)
F:
  summary_cursor_conflict:65, error_meta:69, err_to_tool_result:81,
  no_cache_meta:85, paginate_focus_chains:96, list_tools:154, new:158,
  emit_progress:175, handle_overview_mode:202, handle_file_details_mode:303

NEXT_CURSOR: eyJtb2RlIjoiZGVmYXVsdCIsIm9mZnNldCI6MTB9

Example calls:

analyze_file path: /path/to/file.rs
analyze_file path: /path/to/file.rs page_size: 50
analyze_file path: /path/to/file.rs cursor: eyJvZmZzZXQiOjUwfQ==

analyze_module

Extracts a minimal function/import index from a single file, with output ~75% smaller than analyze_file. Use it when you need only function names and line numbers or the import list, without signatures, types, or call graphs. When called on a directory path, it returns an actionable error that points to analyze_directory.

Required: path (string) -- file to analyze

Example output:

FILE: lib.rs (1335L, 22F, 66I)
F:
  summary_cursor_conflict:65, error_meta:69, err_to_tool_result:81,
  no_cache_meta:85, paginate_focus_chains:96, list_tools:154, new:158,
  emit_progress:175, handle_overview_mode:202, handle_file_details_mode:303,
  handle_focused_mode:344, analyze_directory:553, analyze_file:691,
  analyze_symbol:846, analyze_module:998, get_info:1079, on_initialized:1106,
  on_cancelled:1154, complete:1167, set_level:1221
I:
  cache:AnalysisCache; formatter:format_structure_paginated;
  pagination:paginate_slice; rmcp::model:CallToolResult;
  types:AnalyzeDirectoryParams

Example call:

analyze_module path: /path/to/file.rs
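The F: section in the example is a comma-separated run of name:line pairs that may wrap across lines. A consumer that wants structured data could split it as follows. This is a sketch based on the example output only; parse_function_index is a hypothetical helper, and the format is not guaranteed to stay stable.

```python
def parse_function_index(section_text):
    """Split 'name:line, name:line' entries into (name, line) tuples."""
    entries = []
    for chunk in section_text.replace("\n", " ").split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # a wrapped line ends with a comma, leaving an empty chunk
        name, _, line = chunk.rpartition(":")
        entries.append((name, int(line)))
    return entries


F_SECTION = """\
  summary_cursor_conflict:65, error_meta:69, err_to_tool_result:81,
  no_cache_meta:85, paginate_focus_chains:96
"""
for name, line in parse_function_index(F_SECTION):
    print(f"{line:>5}  {name}")
```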

analyze_symbol

Builds a call graph for a named symbol across all files in a directory. Uses sentinel values <module> (top-level calls) and <reference> (type references). Functions called >3 times show (•N) notation.

Required:

  • path (string) -- directory to search
  • symbol (string) -- symbol name; case-sensitive exact match by default (see match_mode)

Additional optional:

  • follow_depth (integer, default 1) -- call graph traversal depth
  • max_depth (integer, default unlimited) -- directory recursion limit
  • ast_recursion_limit (integer, default 256) -- tree-sitter recursion cap for stack safety
  • match_mode (string, default exact) -- Symbol lookup strategy:
    • exact: Case-sensitive exact match (default)
    • insensitive: Case-insensitive exact match
    • prefix: Case-insensitive prefix match; returns an error listing candidates when multiple symbols match
    • contains: Case-insensitive substring match; returns an error listing candidates when multiple symbols match

All non-exact modes return an error with candidate names when the match is ambiguous; use the listed candidates to refine to a unique match.
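The four match_mode strategies can be pictured as four predicates over the known symbol names. The sketch below illustrates the semantics described above; it is not the server's implementation. resolve_symbol is a hypothetical name, and raising an error when nothing matches is an assumption.

```python
def resolve_symbol(query, known_symbols, match_mode="exact"):
    """Return the single symbol selected by match_mode, or raise ValueError."""
    folded = query.lower()
    predicates = {
        "exact": lambda name: name == query,
        "insensitive": lambda name: name.lower() == folded,
        "prefix": lambda name: name.lower().startswith(folded),
        "contains": lambda name: folded in name.lower(),
    }
    if match_mode not in predicates:
        raise ValueError(f"unknown match_mode: {match_mode}")
    matches = predicates[match_mode]
    candidates = sorted({name for name in known_symbols if matches(name)})
    if not candidates:
        # Assumption: the text does not say what happens on zero matches.
        raise ValueError(f"no symbol matches {query!r}")
    if len(candidates) > 1:
        # Non-exact modes report the candidates so the caller can refine.
        raise ValueError(
            f"ambiguous match for {query!r}; candidates: {', '.join(candidates)}"
        )
    return candidates[0]
```

With symbols analyze_file, analyze_module, and format_structure_paginated, the prefix query "format" resolves to a single symbol, while the prefix query "analyze" raises an error naming both candidates.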

Example output:

FOCUS: format_structure_paginated (1 defs, 1 callers, 3 callees)
CALLERS (1-1 of 1):
  format_structure_paginated <- analyze_directory
    <- format_structure_paginated
CALLEES: 3 (use cursor for callee pagination)

Example calls:

analyze_symbol path: /path/to/project symbol: my_function
analyze_symbol path: /path/to/project symbol: my_function follow_depth: 3
analyze_symbol path: /path/to/project symbol: my_function max_depth: 3 follow_depth: 2

Output Management

For large codebases, two mechanisms prevent context overflow:

Pagination

analyze_file and analyze_symbol append a NEXT_CURSOR: line when output is truncated. Pass the token back as cursor to fetch the next page. summary=true and cursor are mutually exclusive; passing both returns an error.

# Response ends with:
NEXT_CURSOR: eyJvZmZzZXQiOjUwfQ==

# Fetch next page:
analyze_symbol path: /my/project symbol: my_function cursor: eyJvZmZzZXQiOjUwfQ==
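A client-side pagination loop could look like the sketch below. call_tool is a hypothetical callable that takes an arguments dict and returns the response text; substitute whatever your MCP client provides. The example tokens in this README happen to decode as base64-encoded JSON (the one above decodes to {"offset": 50}), which peek_cursor uses for debugging, but that encoding is an observation rather than a documented contract, so real clients should pass the token back unchanged.

```python
import base64
import json

CURSOR_PREFIX = "NEXT_CURSOR: "


def peek_cursor(token):
    """Decode a cursor for debugging only; treat it as opaque otherwise."""
    return json.loads(base64.b64decode(token))


def split_cursor(response_text):
    """Return (body, next_cursor); next_cursor is None on the last page."""
    body_lines, cursor = [], None
    for line in response_text.splitlines():
        if line.startswith(CURSOR_PREFIX):
            cursor = line[len(CURSOR_PREFIX):].strip()
        else:
            body_lines.append(line)
    return "\n".join(body_lines).rstrip(), cursor


def fetch_all_pages(call_tool, arguments, max_pages=100):
    """Call the tool repeatedly, feeding each NEXT_CURSOR back as cursor."""
    pages, current = [], dict(arguments)
    for _ in range(max_pages):  # hard cap guards against a runaway loop
        body, cursor = split_cursor(call_tool(current))
        pages.append(body)
        if cursor is None:
            break
        current = {**arguments, "cursor": cursor}
    return pages
```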

Summary Mode

When output exceeds 50K chars, the server auto-compacts results using aggregate statistics. Override with summary: true (force compact) or summary: false (disable).

# Force summary for large project
analyze_directory path: /huge/codebase summary: true

# Disable summary (get full details, may be large)
analyze_directory path: /project summary: false
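The summary parameter is effectively tri-state: true, false, or unset (auto). The sketch below shows one way to read that rule. It is an illustration of the behaviour described above rather than the server's code, and treating "50K chars" as exactly 50,000 characters is an assumption.

```python
SUMMARY_THRESHOLD_CHARS = 50_000  # "50K chars" in the text; exact cutoff assumed


def use_summary(output_chars, summary=None):
    """Resolve the summary flag: True and False are explicit, None means auto."""
    if summary is not None:
        return summary  # an explicit caller choice always wins
    return output_chars > SUMMARY_THRESHOLD_CHARS
```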

Non-Interactive Pipelines

In single-pass subagent sessions, prompt caches are written but never reused. Benchmarks showed MCP responses writing ~2x more to cache than native-only workflows, adding cost with no quality gain. Set DISABLE_PROMPT_CACHING=1 (or DISABLE_PROMPT_CACHING_HAIKU=1 for Haiku-specific pipelines) to avoid this overhead.
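When a pipeline launches the agent from a script, the variable can be set on the child process only, leaving the parent environment untouched. This is a minimal sketch; pipeline_env is a hypothetical helper, and the command that starts your agent is deliberately left out because it depends on the pipeline.

```python
import os


def pipeline_env(haiku_only=False, base_env=None):
    """Return a copy of the environment with prompt caching disabled."""
    env = dict(os.environ if base_env is None else base_env)
    key = "DISABLE_PROMPT_CACHING_HAIKU" if haiku_only else "DISABLE_PROMPT_CACHING"
    env[key] = "1"
    return env
```

Pass the result as the env argument of subprocess.run when starting the agent process.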

The server's own instructions expose a 4-step recommended workflow for unknown repositories: survey the repo root with analyze_directory at max_depth=2, drill into the source package, run analyze_file on key files, then use analyze_symbol to trace call graphs. MCP clients that surface server instructions will present this workflow automatically to the agent.
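For orchestrators that do not surface server instructions, the same four steps can be spelled out as an ordered list of tool calls. The sketch below only builds the plan; survey_plan and its parameters are hypothetical, and choosing the source package, key files, and symbols is left to the caller.

```python
def survey_plan(repo_root, source_dir, key_files, symbols):
    """Return the four-step workflow as ordered (tool, arguments) pairs."""
    plan = [
        # 1. Survey the repository root at a shallow depth.
        ("analyze_directory", {"path": repo_root, "max_depth": 2}),
        # 2. Drill into the source package.
        ("analyze_directory", {"path": source_dir}),
    ]
    # 3. Inspect key files.
    plan += [("analyze_file", {"path": path}) for path in key_files]
    # 4. Trace call graphs for the symbols of interest.
    plan += [("analyze_symbol", {"path": source_dir, "symbol": s}) for s in symbols]
    return plan


for tool, arguments in survey_plan("/repo", "/repo/src", ["/repo/src/lib.rs"], ["main"]):
    print(tool, arguments)
```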

Observability

All four tools emit metrics to daily-rotated JSONL files at $XDG_DATA_HOME/code-analyze-mcp/ (fallback: ~/.local/share/code-analyze-mcp/). Each record captures tool name, duration, output size, and result status. Files are retained for 30 days. See docs/OBSERVABILITY.md for the full schema.
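To inspect the metrics from a script, resolve the directory with the same fallback rule and read the files line by line. The sketch below assumes the files use a .jsonl extension and makes no assumption about field names, which are defined in docs/OBSERVABILITY.md; metrics_dir and read_records are hypothetical helpers.

```python
import json
import os
from pathlib import Path


def metrics_dir(env=None):
    """Resolve the metrics directory, honouring XDG_DATA_HOME when it is set."""
    env = os.environ if env is None else env
    base = env.get("XDG_DATA_HOME")
    root = Path(base) if base else Path.home() / ".local" / "share"
    return root / "code-analyze-mcp"


def read_records(directory):
    """Yield one dict per non-empty line of every *.jsonl file (assumed suffix)."""
    for path in sorted(Path(directory).glob("*.jsonl")):
        with path.open() as handle:
            for line in handle:
                if line.strip():
                    yield json.loads(line)
```

For example, sum(1 for _ in read_records(metrics_dir())) counts the records currently retained.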

Supported Languages

| Language | Extensions | Status |
|---|---|---|
| Rust | .rs | Implemented |
| Python | .py | Implemented |
| TypeScript | .ts, .tsx | Implemented |
| Go | .go | Implemented |
| Java | .java | Implemented |
| Fortran | .f, .f77, .f90, .f95, .f03, .f08, .for, .ftn | Implemented |
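A caller can pre-filter files before invoking the tools by checking extensions against the table above. The lookup below is a sketch that copies the table verbatim. It matches extensions exactly as listed, because the text does not say how the server treats other casings such as .F90; language_for is a hypothetical helper.

```python
from pathlib import Path

# Copied from the Supported Languages table.
EXTENSIONS = {
    "Rust": [".rs"],
    "Python": [".py"],
    "TypeScript": [".ts", ".tsx"],
    "Go": [".go"],
    "Java": [".java"],
    "Fortran": [".f", ".f77", ".f90", ".f95", ".f03", ".f08", ".for", ".ftn"],
}
LANGUAGE_BY_EXTENSION = {
    ext: language for language, exts in EXTENSIONS.items() for ext in exts
}


def language_for(path):
    """Return the language name for a path, or None if it is not listed."""
    return LANGUAGE_BY_EXTENSION.get(Path(path).suffix)
```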

Documentation

  • ARCHITECTURE.md - Design goals, module map, data flow, language handler system, caching strategy
  • MCP, Agents, and Orchestration - Best practices for agentic loops, orchestration patterns, MCP tool design, memory management, and safety controls
  • OBSERVABILITY.md - Metrics schema, JSONL format, and retention policy
  • ROADMAP.md - Development history and future direction
  • DESIGN-GUIDE.md - Design decisions, rationale, and replication guide for building high-performance MCP servers
  • CONTRIBUTING.md - Development workflow, commit conventions, PR checklist
  • SECURITY.md - Security policy and vulnerability reporting

License

Apache-2.0. See LICENSE for details.