Codanna

Semantic code search and relationship tracking via MCP and Unix CLI.

How It Works

Parse - Tree-sitter AST parsing for Rust and Python (JavaScript/TypeScript coming)
Extract - Symbols, call graphs, implementations, and type relationships
Embed - 384-dimensional vectors from doc comments via AllMiniLML6V2
Index - Tantivy for full-text search + memory-mapped symbol cache for <10ms lookups
Serve - MCP protocol for AI assistants, ~300ms response time

Installation

# Install latest version
cargo install codanna

# Install with HTTP server (OAuth authentication)
cargo install codanna --features http-server

# Install with HTTPS server (TLS + optional OAuth)
cargo install codanna --features https-server

# Install from local path (development)
cargo install --path . --all-features

Quick Start

Initialize and configure:

# Initialize codanna index space and create .codanna/settings.toml
codanna init

# Enable semantic search in .codanna/settings.toml

Enable semantic search in .codanna/settings.toml:

[semantic_search]
enabled = true

Index your codebase:

# Index with progress display
codanna index src --progress

# See what would be indexed (dry run)
codanna index . --dry-run

# Index a specific file
codanna index src/main.rs

Search your code:

# Semantic search with new simplified syntax
codanna mcp semantic_search_docs query:"parse rust files" limit:3 --json

# Find symbols with JSON output
codanna retrieve symbol Parser --json

# Analyze function relationships
codanna mcp find_callers process_file --json | jq '.data[].name'

# Legacy format still works
codanna mcp semantic_search_with_context --args '{"query": "parse rust files and extract symbols", "limit": 3}'

Claude Integration

MCP Server (Recommended)

Add to your .mcp.json:

{
  "mcpServers": {
    "codanna": {
      "command": "codanna",
      "args": ["serve", "--watch", "--watch-interval", "5"]
    }
  }
}

HTTP/HTTPS Server

For persistent server with real-time file watching:

# HTTP server with OAuth authentication (requires http-server feature)
codanna serve --http --watch

# HTTPS server with TLS encryption (requires https-server feature)
codanna serve --https --watch

Configure in .mcp.json:

{
  "mcpServers": {
    "codanna-sse": {
      "type": "sse",
      "url": "http://127.0.0.1:8080/mcp/sse"
    }
  }
}

For HTTPS configuration, see the HTTPS Server Mode documentation.

Claude Sub Agent

We include a codanna-navigator sub agent at .claude/agents/codanna-navigator.md. This agent is optimized for using the codanna MCP server.

Unix-Style Integration

Codanna CLI is unix-friendly with positional arguments and JSON output for easy command chaining:

# New simplified syntax - positional arguments for simple tools
codanna mcp find_symbol main --json
codanna mcp get_calls process_file
codanna mcp find_callers init

# Key:value pairs for complex tools  
codanna mcp semantic_search_docs query:"error handling" limit:3 --json
codanna mcp search_symbols query:parse kind:function --json

# Powerful Unix piping with JSON output
echo "error handling" | codanna mcp semantic_search_docs --json | jq '.data[].name'
codanna mcp find_symbol Parser --json | jq -r '.data[].callers[].name' | \
  xargs -I {} codanna mcp find_symbol {} --json

# Legacy format still supported for backward compatibility
codanna mcp find_symbol --args '{"name": "main"}'

All MCP tools support --json flag for structured output, making integration with other tools seamless.

Configuration

Configure Codanna in .codanna/settings.toml:

[semantic_search]
enabled = true
model = "AllMiniLML6V2"
threshold = 0.6  # Similarity threshold (0-1)

[indexing]
parallel_threads = 16  # Auto-detected by default
include_tests = true   # Index test files

Codanna respects .gitignore and adds its own .codannaignore:

# Created automatically by codanna init
.codanna/       # Don't index own data
target/         # Skip build artifacts
node_modules/   # Skip dependencies
*_test.rs       # Optionally skip tests

Documentation Comments for Better Search

Semantic search works by understanding your documentation comments:

/// Parse configuration from a TOML file and validate required fields
/// This handles missing files gracefully and provides helpful error messages
fn load_config(path: &Path) -> Result<Config, Error> {
    // implementation...
}

With good comments, semantic search can find this function when prompted for:

"configuration validation"
"handle missing config files"
"TOML parsing with error handling"

This encourages better documentation → better AI understanding → more motivation to document.

CLI Commands

Core Commands

Command	Description	Example
`codanna init`	Set up .codanna directory with default configuration	`codanna init --force`
`codanna index <PATH>`	Build searchable index from your codebase	`codanna index src --progress`
`codanna config`	Display active settings	`codanna config`
`codanna serve`	Start MCP server for AI assistants	`codanna serve --watch`

Retrieval Commands

All retrieve commands support --json flag for structured output (exit code 3 when not found).

Command	Description	Example
`retrieve symbol <NAME>`	Find a symbol by name	`codanna retrieve symbol main --json`
`retrieve calls <FUNCTION>`	Show what functions a given function calls	`codanna retrieve calls parse_file --json`
`retrieve callers <FUNCTION>`	Show what functions call a given function	`codanna retrieve callers main --json`
`retrieve implementations <TRAIT>`	Show what types implement a trait	`codanna retrieve implementations Parser --json`
`retrieve impact <SYMBOL>`	Show the impact radius of changing a symbol	`codanna retrieve impact main --depth 3 --json`
`retrieve search <QUERY>`	Search for symbols using full-text search	`codanna retrieve search "parse" --limit 5 --json`
`retrieve describe <SYMBOL>`	Show comprehensive information about a symbol	`codanna retrieve describe SimpleIndexer --json`

Testing and Utilities

Command	Description	Example
`codanna mcp-test`	Verify Claude can connect and list available tools	`codanna mcp-test`
`codanna mcp <TOOL>`	Execute MCP tools without spawning server	`codanna mcp find_symbol main --json`
`codanna benchmark`	Benchmark parser performance	`codanna benchmark rust --file my_code.rs`

Common Flags

--config, -c: Path to custom settings.toml file
--force, -f: Force operation (overwrite, re-index, etc.)
--progress, -p: Show progress during operations
--threads, -t: Number of threads to use
--dry-run: Show what would happen without executing

MCP Tools

Available tools when using the MCP server. All tools support --json flag for structured output.

Simple Tools (Positional Arguments)

Tool	Description	Example
`find_symbol`	Find a symbol by exact name	`codanna mcp find_symbol main --json`
`get_calls`	Show functions called by a given function	`codanna mcp get_calls process_file`
`find_callers`	Show functions that call a given function	`codanna mcp find_callers init`
`analyze_impact`	Analyze the impact radius of symbol changes	`codanna mcp analyze_impact Parser --json`
`get_index_info`	Get index statistics and metadata	`codanna mcp get_index_info --json`

Complex Tools (Key:Value Arguments)

Tool	Description	Example
`search_symbols`	Search symbols with full-text fuzzy matching	`codanna mcp search_symbols query:parse kind:function limit:10`
`semantic_search_docs`	Search using natural language queries	`codanna mcp semantic_search_docs query:"error handling" limit:5`
`semantic_search_with_context`	Search with enhanced context	`codanna mcp semantic_search_with_context query:"parse files" threshold:0.7`

Parameters Reference

Tool	Parameters
`find_symbol`	`name` (required)
`search_symbols`	`query`, `limit`, `kind`, `module`
`semantic_search_docs`	`query`, `limit`, `threshold`
`semantic_search_with_context`	`query`, `limit`, `threshold`
`get_calls`	`function_name`
`find_callers`	`function_name`
`analyze_impact`	`symbol_name`, `max_depth`
`get_index_info`	None

Performance

Parser benchmarks on a 750-symbol test file:

Language	Parsing Speed	vs. Target (10k/s)	Status
Rust	91,318 symbols/sec	9.1x faster ✅	Production
Python	75,047 symbols/sec	7.5x faster ✅	Production
JavaScript	-	-	Coming soon
TypeScript	-	-	Coming soon

Key achievements:

Zero-cost abstractions: All parsers use borrowed string slices with no allocations in hot paths
Parallel processing: Multi-threaded indexing that scales with CPU cores
Memory efficiency: Approximately 100 bytes per symbol including all metadata
Real-time capability: Fast enough for incremental parsing during editing
Optimized CLI startup: ~300ms for all operations (53x improvement from v0.2)
JSON output: Zero overhead - structured output adds <1ms to response time

Run performance benchmarks:

codanna benchmark all          # Test all parsers
codanna benchmark python       # Test specific language

Architecture Highlights

Memory-mapped storage: Two caches for different access patterns:

symbol_cache.bin - FNV-1a hashed symbol lookups, <10ms response time
segment_0.vec - 384-dimensional vectors, <1μs access after OS page cache warm-up

Embedding lifecycle management: Old embeddings deleted when files are re-indexed to prevent accumulation.

Lock-free concurrency: DashMap for concurrent symbol reads, write coordination via single writer lock.

Single-pass indexing: Symbols, relationships, and embeddings extracted in one AST traversal.

Hot reload: File watcher with 500ms debounce triggers re-indexing of changed files only.

Requirements

Rust 1.75+ (for development)
~150MB for model storage (downloaded on first use)
A few MB for index storage (varies by codebase size)

Current Limitations

Supports Rust and Python (JavaScript, TypeScript coming soon)
Semantic search requires English documentation/comments
Windows support is experimental

Roadmap

Versioning Strategy

0.2.x - Patches and fixes only (bug fixes, dependency updates, performance improvements)
0.3.x - Feature releases (JSON output, exit codes, new capabilities)
0.4.x - Major features (JavaScript/TypeScript support, advanced analysis)

Status Overview

Priority	Feature	Status	Target
1	JSON Output Support	✅ Completed	v0.3.0
2	Exit Codes for Common Conditions	✅ Completed	v0.3.0
3	Batch Symbol Operations	Planning	v0.3.1
4	Output Format Control	Planning	v0.3.1
5	Direct CLI Semantic Search	Partial	v0.3.1
6	Incremental Index Updates	✅ Completed	v0.2.0
7	Query Language for Complex Searches	Partial	--
8	Configuration Profiles	Pending	--
9	Machine-Readable Progress	Pending	--

1. Direct CLI Semantic Search

Partially Implemented: Simplified syntax available through MCP interface.

# NEW: Simplified syntax (no JSON escaping needed!)
codanna mcp semantic_search_docs query:authentication limit:10 --json

# Still TODO: Direct retrieve command
codanna retrieve semantic "authentication" --limit 10

Delivered:

✅ Simpler command syntax (key:value pairs)
✅ Better Unix integration (positional args)
✅ No JSON escaping needed

Remaining: Direct retrieve semantic command for consistency

2. JSON Output Support

Implemented in v0.3.0: All retrieve commands and MCP tools now support --json flag.

# All retrieve commands support --json
codanna retrieve symbol MyFunction --json
codanna retrieve calls process_file --json
codanna retrieve callers init --json

# All MCP tools support --json
codanna mcp find_symbol main --json
codanna mcp semantic_search_docs query:"error handling" --json

Delivered Benefits:

✅ Stable API for scripts and tools
✅ Zero performance overhead (<1ms)
✅ Consistent JsonResponse format across all commands
✅ Proper exit codes (3 for not found)

3. Batch Symbol Operations

Why: Reduce overhead when analyzing multiple symbols

# Current: Multiple invocations
for sym in func1 func2 func3; do
  codanna retrieve symbol "$sym"
done

# Wishlist: Single command
codanna retrieve symbols func1 func2 func3

Benefits:

One index load instead of N
Faster CI/CD pipelines
Better for parallel analysis

4. Output Format Control

Why: Different use cases need different detail levels

# Compact output for scripts
codanna retrieve callers MyFunc --format=compact
validate_input:src/validation.rs:45
process_request:src/handler.rs:120

# Full output for humans (current default)
codanna retrieve callers MyFunc --format=full

5. Exit Codes for Common Conditions

Implemented in v0.3.0: All commands now return appropriate exit codes.

# Exit codes implemented:
# 0 - Success
# 1 - General error
# 3 - Not found (symbol, function, etc.)

if codanna retrieve symbol MyFunc --json >/dev/null 2>&1; then
  echo "Symbol exists"
else
  if [ $? -eq 3 ]; then
    echo "Symbol not found"
  else
    echo "Error occurred"
  fi
fi

Actual JSON output:

$ codanna retrieve symbol NonExistent --json
{
  "status": "error",
  "code": "NOT_FOUND",
  "message": "Symbol 'NonExistent' not found",
  "error": {
    "suggestions": [
      "Check the spelling",
      "Ensure the index is up to date"
    ]
  },
  "exit_code": 3
}
# Exit code: 3

6. Query Language for Complex Searches

Partially Implemented: Key:value syntax available for MCP tools.

# NOW AVAILABLE: Key:value syntax for MCP tools
codanna mcp search_symbols query:Parser kind:method limit:20 --json
codanna mcp semantic_search_docs query:"error handling" limit:5 --json

# Still TODO: Advanced query combinations
codanna query "kind:method visibility:public calls:*Parser*"
codanna query "kind:function visibility:private callers:0"

Delivered: Basic key:value parameter parsing for MCP tools Remaining: Full query language with wildcards and combinations

7. Incremental Index Updates

Implemented: Watch mode with notification channels for coordinated updates.

# Watch mode auto-indexes changed files
codanna serve --watch --watch-interval 5

# Server output shows notification flow:
# Detected change in indexed file: src/main.rs
#   Re-indexing...
#   ✓ Re-indexed successfully (file updated)
# File watcher received IndexReloaded notification
#   Refreshing watched file list...
#   ✓ Now watching 60 files