indxr
Fast codebase indexer for AI agents. Tree-sitter AST parsing + regex extraction across 27 languages. Built in Rust.
Install
Usage
Output
Default format is Markdown at signatures detail level:
src/
main.rs
parser/
**src/main.rs**
- -
**Declarations:**
`pub fn main() -> Result<()>`
`pub struct App`
Three output formats (-f): markdown (default), json, yaml.
Three detail levels (-d):
| Level | Content |
|---|---|
| summary | Directory tree + file list |
| signatures (default) | + declarations, imports |
| full | + doc comments, line numbers, body line counts, metadata badges, relationships |
Languages
8 tree-sitter (full AST) + 19 regex (structural extraction):
| Parser | Languages |
|---|---|
| tree-sitter | Rust, Python, TypeScript/TSX, JavaScript/JSX, Go, Java, C, C++ |
| regex | Shell, TOML, YAML, JSON, SQL, Markdown, Protobuf, GraphQL, Ruby, Kotlin, Swift, C#, Objective-C, XML, HTML, CSS, Gradle, CMake, Properties |
Detection is by file extension. Full extraction details: docs/languages.md
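As a rough illustration of extension-based detection, the sketch below maps a path to one of the two parser tiers. The `ParserKind` enum, the `detect_parser` function, and the extension lists are assumptions made for this example; they are not taken from indxr's source, and the real mapping is documented in docs/languages.md.

```rust
use std::path::Path;

/// Which extraction tier handles a file (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ParserKind {
    TreeSitter,
    Regex,
}

/// Pick a parser tier from the file extension alone.
/// The extension lists are illustrative and incomplete.
fn detect_parser(path: &Path) -> Option<ParserKind> {
    // No extension, or a non-UTF-8 one, means the file is skipped.
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "rs" | "py" | "ts" | "tsx" | "js" | "jsx" | "go" | "java" | "c" | "h" | "cpp"
        | "hpp" => Some(ParserKind::TreeSitter),
        "sh" | "toml" | "yaml" | "yml" | "json" | "sql" | "md" | "proto" | "graphql"
        | "rb" | "kt" | "swift" | "cs" | "xml" | "html" | "css" | "gradle"
        | "properties" => Some(ParserKind::Regex),
        _ => None,
    }
}

fn main() {
    for name in ["src/main.rs", "db/schema.sql", "assets/logo.png"] {
        println!("{}: {:?}", name, detect_parser(Path::new(name)));
    }
}
```

Files with an unrecognized extension return `None` here, which a caller could treat as "do not index".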
Filtering
All filters compose. --kind accepts: function, struct, class, trait, enum, interface, module, method, constant, impl, type, namespace, macro, table, service, message, rpc, and more.
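One way to read "all filters compose" is that every active filter must accept a declaration (logical AND), and an unset filter accepts everything. The sketch below shows that interpretation. The `Decl` and `Filters` types and their fields are hypothetical names for this example, not indxr's internal types.

```rust
/// A declaration reduced to the fields the filters look at (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Decl {
    path: String,
    kind: String,
    name: String,
    public: bool,
}

/// Active filters; `None` / `false` means "not set".
#[derive(Debug, Default)]
struct Filters {
    path_prefix: Option<String>,
    kind: Option<String>,
    public_only: bool,
}

impl Filters {
    /// A declaration passes only if every active filter accepts it.
    fn matches(&self, d: &Decl) -> bool {
        self.path_prefix.as_deref().map_or(true, |p| d.path.starts_with(p))
            && self.kind.as_deref().map_or(true, |k| d.kind == k)
            && (!self.public_only || d.public)
    }
}

fn decl(path: &str, kind: &str, name: &str, public: bool) -> Decl {
    Decl { path: path.into(), kind: kind.into(), name: name.into(), public }
}

fn main() {
    let decls = vec![
        decl("src/parser/mod.rs", "function", "new_parser", true),
        decl("src/parser/mod.rs", "function", "old_helper", false),
        decl("src/main.rs", "struct", "App", true),
    ];
    let filters = Filters {
        path_prefix: Some("src/parser".into()),
        kind: Some("function".into()),
        public_only: true,
    };
    for d in decls.iter().filter(|d| filters.matches(d)) {
        println!("{} {} in {}", d.kind, d.name, d.path);
    }
}
```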
Git Structural Diffing
Declaration-level diffs against any git ref:
## Modified Files
### src/parser/mod.rs
+ `pub fn new_parser() -> Parser`
- `fn old_helper()`
~ `fn process(x: i32)` → `fn process(x: i32, y: i32)`
Markers: + added, - removed, ~ signature changed. Supports --filter-path, -l, --public-only, -f json.
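The three markers can be reproduced with a small set comparison. The sketch below assumes declarations are matched by name between the two revisions, which is a simplification chosen for this example; how indxr actually pairs declarations is not stated here. `Change`, `diff_declarations`, and `render` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// One declaration-level change between two revisions of a file.
#[derive(Debug, PartialEq)]
enum Change {
    Added(String),
    Removed(String),
    SignatureChanged { old: String, new: String },
}

/// Compare two name -> signature maps (assumption: names identify declarations).
fn diff_declarations(
    old: &BTreeMap<String, String>,
    new: &BTreeMap<String, String>,
) -> Vec<Change> {
    let mut changes = Vec::new();
    for (name, new_sig) in new {
        match old.get(name) {
            None => changes.push(Change::Added(new_sig.clone())),
            Some(old_sig) if old_sig != new_sig => changes.push(Change::SignatureChanged {
                old: old_sig.clone(),
                new: new_sig.clone(),
            }),
            Some(_) => {} // unchanged
        }
    }
    for (name, old_sig) in old {
        if !new.contains_key(name) {
            changes.push(Change::Removed(old_sig.clone()));
        }
    }
    changes
}

/// Render a change using the +, -, ~ markers.
fn render(change: &Change) -> String {
    match change {
        Change::Added(sig) => format!("+ `{}`", sig),
        Change::Removed(sig) => format!("- `{}`", sig),
        Change::SignatureChanged { old, new } => format!("~ `{}` → `{}`", old, new),
    }
}

fn decls(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs.iter().map(|(n, s)| (n.to_string(), s.to_string())).collect()
}

fn main() {
    let old = decls(&[("old_helper", "fn old_helper()"), ("process", "fn process(x: i32)")]);
    let new = decls(&[
        ("new_parser", "pub fn new_parser() -> Parser"),
        ("process", "fn process(x: i32, y: i32)"),
    ]);
    for change in diff_declarations(&old, &new) {
        println!("{}", render(&change));
    }
}
```

Matching by name alone would misreport overloaded or renamed declarations, so treat this as a model of the output format, not of the matching strategy.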
Token Budget
Progressive truncation to fit context windows:
Content is dropped in this order: doc comments → private declarations → children → least-important files. The directory tree and public API surface get the highest preservation priority.
File importance scoring: entry points (main.rs, lib.rs, index.ts) > root proximity > public declaration count.
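The importance ordering can be modeled as a lexicographic sort key. The sketch below assumes the three criteria are applied strictly in the order listed, and that root proximity means the number of `/` separators in the path; the real scoring may weight these differently. `FileInfo` and `rank_by_importance` are hypothetical names.

```rust
use std::cmp::Reverse;

/// Minimal per-file data needed for ranking (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct FileInfo {
    path: String,
    public_decls: usize,
}

fn file(path: &str, public_decls: usize) -> FileInfo {
    FileInfo { path: path.to_string(), public_decls }
}

/// Entry points named in the scoring rule above.
fn is_entry_point(path: &str) -> bool {
    let name = path.rsplit('/').next().unwrap_or(path);
    matches!(name, "main.rs" | "lib.rs" | "index.ts")
}

/// Root proximity, approximated as the number of path separators.
fn depth(path: &str) -> usize {
    path.matches('/').count()
}

/// Most important files first: entry points, then shallow paths,
/// then files with more public declarations.
fn rank_by_importance(files: &mut [FileInfo]) {
    files.sort_by_key(|f| {
        (
            Reverse(is_entry_point(&f.path)),
            depth(&f.path),
            Reverse(f.public_decls),
        )
    });
}

fn main() {
    let mut files = vec![
        file("src/parser/mod.rs", 5),
        file("src/cache.rs", 3),
        file("src/main.rs", 1),
        file("src/filter.rs", 7),
    ];
    rank_by_importance(&mut files);
    for f in &files {
        println!("{} ({} public)", f.path, f.public_decls);
    }
}
```

With a ranking like this, a budget pass would drop files from the end of the list first.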
MCP Server
JSON-RPC 2.0 over stdin/stdout, MCP spec 2024-11-05:
| Tool | Description |
|---|---|
| lookup_symbol | Find declarations by name (case-insensitive substring, default limit 50, max 200) |
| list_declarations | List declarations in a file, optional kind filter and shallow mode |
| search_signatures | Search signatures by substring (default limit 20, max 100) |
| get_tree | Directory/file tree, optional path prefix filter |
| get_imports | Import statements for a file |
| get_stats | File count, line count, language breakdown, duration |
| get_file_summary | Complete file overview: metadata, declarations, kind counts, public symbols |
| read_source | Read source code by symbol name or line range |
| get_file_context | File summary + reverse dependencies + related files |
| regenerate_index | Re-index codebase and write updated INDEX.md |
MCP config:
Setup guides: docs/mcp-server.md
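For orientation, this is roughly what a client writes to the server's stdin to invoke a tool. The `tools/call` method and the `name`/`arguments` parameter shape come from the MCP specification; the argument key `name` passed to lookup_symbol is an assumption for this example, so check docs/mcp-server.md for the real schema. The sketch uses only the standard library and a hand-written string escaper; a real client would more likely use a JSON library.

```rust
/// Escape a string for embedding in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request with one string argument.
/// `arg_key` is whatever the tool's input schema expects (assumed here).
fn tools_call_request(id: u64, tool: &str, arg_key: &str, arg_value: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"{}":"{}"}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(arg_key),
        json_escape(arg_value)
    )
}

fn main() {
    // Over the stdio transport, each message is a single line of JSON.
    println!("{}", tools_call_request(1, "lookup_symbol", "name", "Parser"));
}
```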
Caching
Incremental binary cache in .indxr-cache/cache.bin. Two-tier validation: mtime + file size (fast path), xxh3 content hash (fallback). The cache format is versioned, so the cache is rebuilt automatically after an indxr upgrade.
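The two-tier check can be sketched as follows. To stay dependency-free, this example substitutes the standard library's `DefaultHasher` for xxh3; that is a stand-in only, and its output is not guaranteed stable across Rust releases, so it would not suit a real on-disk cache. `CacheEntry`, `Validation`, and `validate` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// What the cache remembers about a file (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct CacheEntry {
    mtime_secs: u64,
    size: u64,
    content_hash: u64,
}

#[derive(Debug, PartialEq)]
enum Validation {
    FreshByMetadata, // fast path: mtime and size both match
    FreshByHash,     // fallback: metadata changed, content did not
    Stale,           // content changed, file must be re-parsed
}

/// Stand-in for xxh3 (assumption: any 64-bit content hash works for the sketch).
fn hash_content(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// `read_content` is only called when the fast path fails,
/// so unchanged files are never read from disk.
fn validate(
    entry: &CacheEntry,
    mtime_secs: u64,
    size: u64,
    read_content: impl FnOnce() -> Vec<u8>,
) -> Validation {
    if entry.mtime_secs == mtime_secs && entry.size == size {
        return Validation::FreshByMetadata;
    }
    if entry.content_hash == hash_content(&read_content()) {
        Validation::FreshByHash
    } else {
        Validation::Stale
    }
}

fn main() {
    let content = b"fn main() {}".to_vec();
    let entry = CacheEntry {
        mtime_secs: 100,
        size: content.len() as u64,
        content_hash: hash_content(&content),
    };
    // Same bytes, newer mtime (for example after `touch`): the hash fallback applies.
    let result = validate(&entry, 200, content.len() as u64, || content.clone());
    println!("{:?}", result);
}
```

Passing the file read as a closure keeps the expensive step lazy, which is the point of having a metadata fast path at all.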
Performance
Parallel parsing via rayon. Incremental caching via mtime + xxh3.
| Codebase | Files | Lines | Cold | Cached |
|---|---|---|---|---|
| Small (indxr) | 23 | 4.6K | 17ms | 5ms |
| Medium (atuin) | 132 | 22K | 20ms | 6ms |
| Large (cloud-hypervisor) | 243 | 124K | 73ms | ~10ms |
Architecture
- Walk directory tree (.gitignore-aware, via the `ignore` crate)
- Detect language by file extension
- Check cache and skip unchanged files (mtime + xxh3)
- Parse with tree-sitter or regex (parallel via rayon)
- Extract declarations, metadata, relationships
- Apply filters (path, kind, visibility, symbol)
- Apply token budget (progressive truncation)
- Format as Markdown, JSON, or YAML
- Update cache
Documentation
| Document | Description |
|---|---|
| CLI Reference | Complete flag and option reference |
| Languages | Per-language extraction details |
| Output Formats | Format and detail level reference |
| Filtering | Path, kind, symbol, visibility filters |
| Git Diffing | Structural diff since any git ref |
| Token Budget | Truncation strategy and scoring |
| Caching | Cache format and invalidation |
| MCP Server | MCP tools, protocol, and client setup |
| Agent Integration | Usage with Claude, Codex, Cursor, Copilot, etc. |
License
MIT