Codanna
Semantic code search and relationship tracking via MCP and Unix CLI.
How It Works
- Parse - Tree-sitter AST parsing for Rust and Python (JavaScript/TypeScript coming)
- Extract - Symbols, call graphs, implementations, and type relationships
- Embed - 384-dimensional vectors from doc comments via AllMiniLML6V2
- Index - Tantivy for full-text search + memory-mapped symbol cache for <10ms lookups
- Serve - MCP protocol for AI assistants, ~300ms response time
Installation
# Install latest version
# Install with HTTP server (OAuth authentication)
# Install with HTTPS server (TLS + optional OAuth)
# Install from local path (development)
Quick Start
- Initialize and configure:
# Initialize codanna index space and create .codanna/settings.toml
# Enable semantic search in .codanna/settings.toml
- Enable semantic search in .codanna/settings.toml:
[]
= true
- Index your codebase:
# Index with progress display
# See what would be indexed (dry run)
# Index a specific file
- Search your code:
# Semantic search with new simplified syntax
# Find symbols with JSON output
# Analyze function relationships
|
# Legacy format still works
Claude Integration
MCP Server (Recommended)
Add to your .mcp.json:
HTTP/HTTPS Server
For persistent server with real-time file watching:
# HTTP server with OAuth authentication (requires http-server feature)
# HTTPS server with TLS encryption (requires https-server feature)
Configure in .mcp.json:
For HTTPS configuration, see the HTTPS Server Mode documentation.
Claude Sub Agent
We include a codanna-navigator sub agent at .claude/agents/codanna-navigator.md. This agent is optimized for using the codanna MCP server.
Unix-Style Integration
The Codanna CLI is Unix-friendly, with positional arguments and JSON output for easy command chaining:
# New simplified syntax - positional arguments for simple tools
# Key:value pairs for complex tools
# Powerful Unix piping with JSON output
| |
| | \
# Legacy format still supported for backward compatibility
All MCP tools support the --json flag for structured output, which makes it straightforward to integrate with other tools.
Configuration
Configure Codanna in .codanna/settings.toml:
[]
= true
= "AllMiniLML6V2"
= 0.6 # Similarity threshold (0-1)
[]
= 16 # Auto-detected by default
= true # Index test files
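The similarity threshold in the configuration can be read as a cutoff on a query-to-document score. The sketch below assumes cosine similarity between embedding vectors, which this README does not confirm. The names `cosine_similarity` and `filter_by_threshold` and the toy 3-dimensional vectors are illustrative only; the real embeddings are 384-dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def filter_by_threshold(query_vec, candidates, threshold=0.6):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical); Codanna's embeddings have 384 dimensions.
candidates = {
    "parse_config": [0.9, 0.1, 0.0],
    "render_html": [0.0, 0.2, 0.9],
}
matches = filter_by_threshold([1.0, 0.0, 0.0], candidates, threshold=0.6)
```

Under this reading, raising the threshold trades recall for precision: fewer results come back, but those that do sit closer to the query.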
Codanna respects .gitignore and adds its own .codannaignore:
# Created automatically by codanna init
Documentation Comments for Better Search
Semantic search works by understanding your documentation comments:
/// Parse configuration from a TOML file and validate required fields
/// This handles missing files gracefully and provides helpful error messages
With good comments, semantic search can find this function when prompted for:
- "configuration validation"
- "handle missing config files"
- "TOML parsing with error handling"
This encourages better documentation → better AI understanding → more motivation to document.
CLI Commands
Core Commands
| Command | Description | Example |
|---|---|---|
| codanna init | Set up .codanna directory with default configuration | codanna init --force |
| codanna index <PATH> | Build searchable index from your codebase | codanna index src --progress |
| codanna config | Display active settings | codanna config |
| codanna serve | Start MCP server for AI assistants | codanna serve --watch |
Retrieval Commands
All retrieve commands support the --json flag for structured output and return exit code 3 when nothing is found.
| Command | Description | Example |
|---|---|---|
| retrieve symbol <NAME> | Find a symbol by name | codanna retrieve symbol main --json |
| retrieve calls <FUNCTION> | Show what functions a given function calls | codanna retrieve calls parse_file --json |
| retrieve callers <FUNCTION> | Show what functions call a given function | codanna retrieve callers main --json |
| retrieve implementations <TRAIT> | Show what types implement a trait | codanna retrieve implementations Parser --json |
| retrieve impact <SYMBOL> | Show the impact radius of changing a symbol | codanna retrieve impact main --depth 3 --json |
| retrieve search <QUERY> | Search for symbols using full-text search | codanna retrieve search "parse" --limit 5 --json |
| retrieve describe <SYMBOL> | Show comprehensive information about a symbol | codanna retrieve describe SimpleIndexer --json |
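One way to picture the impact radius and its depth limit is a breadth-first walk over a reverse call graph. The sketch below assumes that "impact" means transitive callers, which this README does not spell out. The function `impact_radius` and the sample graph are hypothetical, not Codanna's implementation.

```python
from collections import deque


def impact_radius(callers, symbol, max_depth=3):
    """Map each transitive caller of `symbol` to its distance, up to `max_depth`."""
    distances = {}
    queue = deque([(symbol, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for caller in callers.get(current, ()):
            if caller != symbol and caller not in distances:
                distances[caller] = depth + 1
                queue.append((caller, depth + 1))
    return distances


# Hypothetical reverse call graph: callee -> functions that call it.
callers = {
    "parse_file": ["index_file"],
    "index_file": ["index_directory", "reindex_changed"],
    "index_directory": ["main"],
}
affected = impact_radius(callers, "parse_file", max_depth=2)
```

With `max_depth=2` the walk stops before reaching `main`; raising the depth to 3 would include it.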
Testing and Utilities
| Command | Description | Example |
|---|---|---|
| codanna mcp-test | Verify Claude can connect and list available tools | codanna mcp-test |
| codanna mcp <TOOL> | Execute MCP tools without spawning server | codanna mcp find_symbol main --json |
| codanna benchmark | Benchmark parser performance | codanna benchmark rust --file my_code.rs |
Common Flags
- --config, -c: Path to custom settings.toml file
- --force, -f: Force operation (overwrite, re-index, etc.)
- --progress, -p: Show progress during operations
- --threads, -t: Number of threads to use
- --dry-run: Show what would happen without executing
MCP Tools
These tools are available when using the MCP server. All tools support the --json flag for structured output.
Simple Tools (Positional Arguments)
| Tool | Description | Example |
|---|---|---|
| find_symbol | Find a symbol by exact name | codanna mcp find_symbol main --json |
| get_calls | Show functions called by a given function | codanna mcp get_calls process_file |
| find_callers | Show functions that call a given function | codanna mcp find_callers init |
| analyze_impact | Analyze the impact radius of symbol changes | codanna mcp analyze_impact Parser --json |
| get_index_info | Get index statistics and metadata | codanna mcp get_index_info --json |
Complex Tools (Key:Value Arguments)
| Tool | Description | Example |
|---|---|---|
| search_symbols | Search symbols with full-text fuzzy matching | codanna mcp search_symbols query:parse kind:function limit:10 |
| semantic_search_docs | Search using natural language queries | codanna mcp semantic_search_docs query:"error handling" limit:5 |
| semantic_search_with_context | Search with enhanced context | codanna mcp semantic_search_with_context query:"parse files" threshold:0.7 |
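The key:value convention can be illustrated with a small parser. This is a sketch of the idea only, not Codanna's actual argument handling; `parse_key_value_args` is a hypothetical name, and all values are kept as strings.

```python
def parse_key_value_args(tokens):
    """Split tokens like 'limit:10' into a dict, splitting on the first colon only."""
    params = {}
    for token in tokens:
        key, sep, value = token.partition(":")
        if not sep or not key:
            raise ValueError(f"expected key:value, got {token!r}")
        params[key] = value
    return params


# The shell removes the quotes, so query:"error handling" arrives as one token.
params = parse_key_value_args(["query:error handling", "limit:5"])
```

Splitting on the first colon only means a value may itself contain colons, for example a query such as `query:std::fmt`.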
Parameters Reference
| Tool | Parameters |
|---|---|
| find_symbol | name (required) |
| search_symbols | query, limit, kind, module |
| semantic_search_docs | query, limit, threshold |
| semantic_search_with_context | query, limit, threshold |
| get_calls | function_name |
| find_callers | function_name |
| analyze_impact | symbol_name, max_depth |
| get_index_info | None |
Performance
Parser benchmarks on a 750-symbol test file:
| Language | Parsing Speed | vs. Target (10k/s) | Status |
|---|---|---|---|
| Rust | 91,318 symbols/sec | 9.1x faster ✅ | Production |
| Python | 75,047 symbols/sec | 7.5x faster ✅ | Production |
| JavaScript | - | - | Coming soon |
| TypeScript | - | - | Coming soon |
Key achievements:
- Zero-cost abstractions: All parsers use borrowed string slices with no allocations in hot paths
- Parallel processing: Multi-threaded indexing that scales with CPU cores
- Memory efficiency: Approximately 100 bytes per symbol including all metadata
- Real-time capability: Fast enough for incremental parsing during editing
- Optimized CLI startup: ~300ms for all operations (53x improvement from v0.2)
- JSON output: Negligible overhead; structured output adds <1ms to response time
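The "vs. Target" column and the per-symbol memory figure reduce to simple arithmetic, sketched below. The helper names and the 1,000,000-symbol codebase are hypothetical; the throughput, target, and bytes-per-symbol numbers are the ones quoted in this section.

```python
TARGET_SYMBOLS_PER_SEC = 10_000   # target from the benchmark table
BYTES_PER_SYMBOL = 100            # approximate figure quoted in this section


def speedup(symbols_per_sec, target=TARGET_SYMBOLS_PER_SEC):
    """How many times faster than the target a parser runs."""
    return symbols_per_sec / target


def estimated_symbol_memory_mb(symbol_count, bytes_per_symbol=BYTES_PER_SYMBOL):
    """Rough symbol-storage estimate in megabytes (1 MB = 1,000,000 bytes)."""
    return symbol_count * bytes_per_symbol / 1_000_000


rust_speedup = round(speedup(91_318), 1)
python_speedup = round(speedup(75_047), 1)
one_million_symbols_mb = estimated_symbol_memory_mb(1_000_000)
```

At roughly 100 bytes per symbol, a hypothetical codebase with one million symbols would need on the order of 100 MB for symbol data, excluding embeddings and the full-text index.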
Run performance benchmarks:
Architecture Highlights
Memory-mapped storage: Two caches for different access patterns:
- symbol_cache.bin - FNV-1a hashed symbol lookups, <10ms response time
- segment_0.vec - 384-dimensional vectors, <1μs access after OS page cache warm-up
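For readers unfamiliar with FNV-1a, here is a minimal sketch of the hash itself. The 64-bit width and the `bucket_for` helper are assumptions made for illustration; this README does not say which width or table layout symbol_cache.bin uses.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF  # wrap to 64 bits
    return h


def bucket_for(symbol_name: str, bucket_count: int) -> int:
    """Pick a bucket for a symbol name (hypothetical table layout)."""
    return fnv1a_64(symbol_name.encode("utf-8")) % bucket_count
```

FNV-1a is a common choice for short keys such as symbol names because it needs only one XOR and one multiply per byte and no lookup tables.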
Embedding lifecycle management: Old embeddings deleted when files are re-indexed to prevent accumulation.
Lock-free concurrency: DashMap for concurrent symbol reads, write coordination via single writer lock.
Single-pass indexing: Symbols, relationships, and embeddings extracted in one AST traversal.
Hot reload: File watcher with 500ms debounce triggers re-indexing of changed files only.
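The debounce behaviour can be sketched as follows: a path is re-indexed only after it has been quiet for the debounce window, so a burst of saves results in a single re-index. The `Debouncer` class is hypothetical and takes timestamps as arguments so the example stays deterministic; a real watcher would read a monotonic clock.

```python
class Debouncer:
    """Collect file events and release a path once it has been quiet for `delay` seconds."""

    def __init__(self, delay=0.5):
        self.delay = delay
        self._last_event = {}  # path -> timestamp of the most recent event

    def record(self, path, now):
        """Note an event; repeated events for the same path restart its timer."""
        self._last_event[path] = now

    def ready(self, now):
        """Return and forget the paths whose last event is at least `delay` old."""
        due = sorted(p for p, t in self._last_event.items() if now - t >= self.delay)
        for path in due:
            del self._last_event[path]
        return due


debouncer = Debouncer(delay=0.5)
debouncer.record("src/main.rs", now=0.0)
debouncer.record("src/main.rs", now=0.3)   # a second save restarts the timer
early = debouncer.ready(now=0.6)           # only 0.3 s since the last event
late = debouncer.ready(now=0.9)            # 0.6 s of quiet, so the path is released
```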
Requirements
- Rust 1.75+ (for development)
- ~150MB for model storage (downloaded on first use)
- A few MB for index storage (varies by codebase size)
Current Limitations
- Supports Rust and Python (JavaScript, TypeScript coming soon)
- Semantic search requires English documentation/comments
- Windows support is experimental
Roadmap
Versioning Strategy
- 0.2.x - Patches and fixes only (bug fixes, dependency updates, performance improvements)
- 0.3.x - Feature releases (JSON output, exit codes, new capabilities)
- 0.4.x - Major features (JavaScript/TypeScript support, advanced analysis)
Status Overview
| Priority | Feature | Status | Target |
|---|---|---|---|
| 1 | JSON Output Support | ✅ Completed | v0.3.0 |
| 2 | Exit Codes for Common Conditions | ✅ Completed | v0.3.0 |
| 3 | Batch Symbol Operations | Planning | v0.3.1 |
| 4 | Output Format Control | Planning | v0.3.1 |
| 5 | Direct CLI Semantic Search | Partial | v0.3.1 |
| 6 | Incremental Index Updates | ✅ Completed | v0.2.0 |
| 7 | Query Language for Complex Searches | Partial | -- |
| 8 | Configuration Profiles | Pending | -- |
| 9 | Machine-Readable Progress | Pending | -- |
1. Direct CLI Semantic Search
Partially Implemented: Simplified syntax available through MCP interface.
# NEW: Simplified syntax (no JSON escaping needed!)
# Still TODO: Direct retrieve command
Delivered:
- ✅ Simpler command syntax (key:value pairs)
- ✅ Better Unix integration (positional args)
- ✅ No JSON escaping needed
Remaining: Direct retrieve semantic command for consistency
2. JSON Output Support
Implemented in v0.3.0: All retrieve commands and MCP tools now support --json flag.
# All retrieve commands support --json
# All MCP tools support --json
Delivered Benefits:
- ✅ Stable API for scripts and tools
- ✅ Negligible performance overhead (<1ms)
- ✅ Consistent JsonResponse format across all commands
- ✅ Proper exit codes (3 for not found)
3. Batch Symbol Operations
Why: Reduce overhead when analyzing multiple symbols
# Current: Multiple invocations
for; do
done
# Wishlist: Single command
Benefits:
- One index load instead of N
- Faster CI/CD pipelines
- Better for parallel analysis
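A back-of-envelope model of the "one index load instead of N" benefit, using the ~300ms invocation time and <10ms lookup bound quoted elsewhere in this README. It assumes a batched command would pay the invocation cost once and then at most one lookup per symbol; the batch command does not exist yet, and `sequential_ms` and `batched_ms` are hypothetical names.

```python
INVOCATION_MS = 300   # approximate time per CLI invocation quoted in this README
LOOKUP_MS = 10        # upper bound on a symbol lookup quoted in this README


def sequential_ms(symbol_count, invocation_ms=INVOCATION_MS):
    """Estimated cost of one CLI invocation per symbol."""
    return symbol_count * invocation_ms


def batched_ms(symbol_count, invocation_ms=INVOCATION_MS, lookup_ms=LOOKUP_MS):
    """Estimated cost of a single invocation that looks up every symbol."""
    if symbol_count == 0:
        return 0
    return invocation_ms + symbol_count * lookup_ms


estimate = (sequential_ms(50), batched_ms(50))
```

Under these assumptions, 50 symbols cost roughly 15 seconds when run one invocation at a time and under a second in a single batched call. Treat the numbers as an order-of-magnitude estimate rather than a measurement.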
4. Output Format Control
Why: Different use cases need different detail levels
# Compact output for scripts
# Full output for humans (current default)
5. Exit Codes for Common Conditions
Implemented in v0.3.0: All commands now return appropriate exit codes.
# Exit codes implemented:
# 0 - Success
# 1 - General error
# 3 - Not found (symbol, function, etc.)
if ; then
else
if [; then
else
fi
fi
Actual JSON output:
{
}
# Exit code: 3
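For scripts written in Python rather than shell, the same exit codes can be handled as sketched below. `classify_exit` and `run_and_classify` are hypothetical helpers. The mapping uses only the three codes documented above and treats anything else as unknown.

```python
import subprocess

EXIT_SUCCESS = 0
EXIT_GENERAL_ERROR = 1
EXIT_NOT_FOUND = 3


def classify_exit(returncode):
    """Translate a documented exit code into a short label."""
    labels = {
        EXIT_SUCCESS: "found",
        EXIT_GENERAL_ERROR: "error",
        EXIT_NOT_FOUND: "not found",
    }
    return labels.get(returncode, "unknown")


def run_and_classify(argv):
    """Run a command and return (label, stdout) without raising on non-zero exit."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    return classify_exit(completed.returncode), completed.stdout
```

With Codanna installed, a call such as `run_and_classify(["codanna", "retrieve", "symbol", "main", "--json"])` would return the label together with the raw JSON text, which the caller can parse only when the label is "found".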
6. Query Language for Complex Searches
Partially Implemented: Key:value syntax available for MCP tools.
# NOW AVAILABLE: Key:value syntax for MCP tools
# Still TODO: Advanced query combinations
Delivered: Basic key:value parameter parsing for MCP tools
Remaining: Full query language with wildcards and combinations
7. Incremental Index Updates
Implemented: Watch mode with notification channels for coordinated updates.
# Watch mode auto-indexes changed files
# Server output shows notification flow:
# Detected change in indexed file: src/main.rs
# Re-indexing...
# ✓ Re-indexed successfully (file updated)
# File watcher received IndexReloaded notification
# Refreshing watched file list...
# ✓ Now watching 60 files
Delivered:
- ✅ Automatic file watching with --watch flag
- ✅ Broadcast channels coordinate index and file watchers
- ✅ File deletions trigger index and cache cleanup
- ✅ Only changed files are re-indexed
- ✅ Event-driven with debouncing for efficiency
8. Configuration Profiles
Why: Different settings for different use cases
# .codanna/profiles.toml
# Use profile
9. Machine-Readable Progress
Why: Better CI/CD integration
# Current: Human-readable progress
# Wishlist: Machine-readable option
}
}
Implementation Priority
- JSON output - Enables everything else
- Exit codes - Minimal change, big impact
- Batch operations - Performance win
- Format control - Flexibility for users
- Rest - Nice to have
Contributing
This is an early release focused on core functionality. Contributions welcome! See CONTRIBUTING.md for guidelines.
License
Licensed under the Apache License, Version 2.0 - See LICENSE file for details.
Attribution required when using Codanna in your project. See NOTICE file.
Built with 🦀 by developers who wanted their AI assistants to actually understand their code.