indxr

Fast codebase indexer for AI agents. Tree-sitter AST parsing + regex extraction across 27 languages. Built in Rust.

Install

cargo install --path .

Usage

indxr                                        # index cwd → stdout
indxr ./my-project -o INDEX.md               # index project → file
indxr -f json -l rust,python -o index.json   # JSON, filter by language
indxr serve ./my-project                     # start MCP server

Output

Default format is Markdown at signatures detail level:

# Codebase Index: my-project

> Generated: 2025-03-23 | Files: 42 | Lines: 8,234
> Languages: Rust (28), Python (10), TypeScript (4)

## Directory Structure
src/
  main.rs
  parser/
    mod.rs
    rust.rs

## Public API Surface

**src/main.rs**
- `pub fn main() -> Result<()>`
- `pub struct App`

---

## src/main.rs

**Language:** Rust | **Size:** 1.2 KB | **Lines:** 45

**Imports:**
- `use anyhow::Result`
- `use clap::Parser`

**Declarations:**

`pub fn main() -> Result<()>`

`pub struct App`
> Fields: `name: String`, `config: Config`

Three output formats (-f): markdown (default), json, yaml.

Three detail levels (-d):

Level	Content
`summary`	Directory tree + file list
`signatures` (default)	+ declarations, imports
`full`	+ doc comments, line numbers, body line counts, metadata badges, relationships

Languages

8 tree-sitter (full AST) + 19 regex (structural extraction):

Parser	Languages
tree-sitter	Rust, Python, TypeScript/TSX, JavaScript/JSX, Go, Java, C, C++
regex	Shell, TOML, YAML, JSON, SQL, Markdown, Protobuf, GraphQL, Ruby, Kotlin, Swift, C#, Objective-C, XML, HTML, CSS, Gradle, CMake, Properties

Detection is by file extension. Full extraction details: docs/languages.md
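A minimal sketch of extension-based detection, assuming a plain lookup table. The table is hypothetical and lists only a few unambiguous extensions. indxr's actual mapping may differ.

```python
from pathlib import Path

# Hypothetical extension table: (language, parser tier).
# Only a handful of obvious extensions are shown for illustration.
EXTENSION_TO_LANGUAGE = {
    ".rs": ("Rust", "tree-sitter"),
    ".py": ("Python", "tree-sitter"),
    ".go": ("Go", "tree-sitter"),
    ".java": ("Java", "tree-sitter"),
    ".toml": ("TOML", "regex"),
    ".sql": ("SQL", "regex"),
    ".proto": ("Protobuf", "regex"),
}


def detect_language(path):
    """Return (language, parser) for a path, or None if the extension is unknown."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
```

Files with no extension or an unlisted one return None in this sketch, which a caller would treat as "skip this file".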

Filtering

indxr --filter-path src/parser              # subtree
indxr --kind function --public-only         # public functions only
indxr --symbol "parse"                      # symbol name search (case-insensitive substring)
indxr --filter-path src/model --kind struct --public-only  # combine
indxr -l rust,python                        # language filter

All filters compose. --kind accepts: function, struct, class, trait, enum, interface, module, method, constant, impl, type, namespace, macro, table, service, message, rpc, and more.
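How composition could work, as a rough sketch: each filter that is supplied narrows the result, and a declaration must pass all of them. The `Declaration` record, its field names, and the path-prefix rule are assumptions for illustration, not indxr's internal types.

```python
from dataclasses import dataclass


@dataclass
class Declaration:
    # Hypothetical record; field names are illustrative, not indxr's schema.
    path: str
    kind: str
    name: str
    public: bool


def in_subtree(path, prefix):
    """True if path equals prefix or lies underneath it (assumed subtree rule)."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def apply_filters(decls, filter_path=None, kind=None, symbol=None, public_only=False):
    """Keep declarations that satisfy every filter that was supplied."""
    result = []
    for d in decls:
        if filter_path and not in_subtree(d.path, filter_path):
            continue
        if kind and d.kind != kind:
            continue
        # Symbol search is a case-insensitive substring match.
        if symbol and symbol.lower() not in d.name.lower():
            continue
        if public_only and not d.public:
            continue
        result.append(d)
    return result


decls = [
    Declaration("src/parser/mod.rs", "function", "parse_file", True),
    Declaration("src/parser/mod.rs", "function", "old_helper", False),
    Declaration("src/model/app.rs", "struct", "App", True),
    Declaration("src/main.rs", "function", "Parse_Args", True),
]
```

With this data, `apply_filters(decls, symbol="parse")` keeps `parse_file` and `Parse_Args`, while adding `filter_path="src/parser"` narrows the result to `parse_file` alone.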

Git Structural Diffing

Declaration-level diffs against any git ref:

indxr --since main
indxr --since v1.0.0
indxr --since HEAD~5

## Modified Files

### src/parser/mod.rs
+ `pub fn new_parser() -> Parser`
- `fn old_helper()`
~ `fn process(x: i32)` → `fn process(x: i32, y: i32)`

Markers: + added, - removed, ~ signature changed. --since can be combined with --filter-path, -l, --public-only, and -f json.
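The marker logic can be sketched as a comparison of two declaration maps. This assumes declarations are matched by name and that a signature change means "same name, different signature text". How indxr actually pairs declarations is not specified here.

```python
def structural_diff(old, new):
    """Compare two {name: signature} maps and return marker lines."""
    lines = []
    # Names present only in the new snapshot were added.
    for name in sorted(new.keys() - old.keys()):
        lines.append(f"+ `{new[name]}`")
    # Names present only in the old snapshot were removed.
    for name in sorted(old.keys() - new.keys()):
        lines.append(f"- `{old[name]}`")
    # Names present in both, with differing signatures, were changed.
    for name in sorted(old.keys() & new.keys()):
        if old[name] != new[name]:
            lines.append(f"~ `{old[name]}` → `{new[name]}`")
    return lines


old = {"old_helper": "fn old_helper()", "process": "fn process(x: i32)"}
new = {
    "new_parser": "pub fn new_parser() -> Parser",
    "process": "fn process(x: i32, y: i32)",
}
diff_lines = structural_diff(old, new)
```

For the two snapshots above, `diff_lines` holds one added, one removed, and one changed entry, in that order.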

Token Budget

Progressive truncation to fit context windows:

indxr --max-tokens 4000

Truncation order: doc comments → private declarations → children → least-important files. The directory tree and public API surface have the highest priority and are kept ahead of everything else.
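A rough sketch of staged truncation, under several assumptions: tokens are estimated at about 4 characters each, the index is a list of file dicts already sorted most-important-first, and each stage is applied only if the previous one was not enough. The data layout and estimator are hypothetical, and the directory tree and public API surface are left out for brevity.

```python
import copy


def estimate_tokens(index):
    # Assumption: roughly 4 characters per token; indxr's estimator may differ.
    chars = 0
    for f in index:
        chars += len(f["path"])
        for d in f["decls"]:
            chars += len(d["signature"]) + len(d.get("doc", ""))
            chars += sum(len(c) for c in d.get("children", []))
    return chars // 4


def drop_docs(index):
    for f in index:
        for d in f["decls"]:
            d["doc"] = ""


def drop_private(index):
    for f in index:
        f["decls"] = [d for d in f["decls"] if d["public"]]


def drop_children(index):
    for f in index:
        for d in f["decls"]:
            d["children"] = []


def fit_to_budget(index, max_tokens):
    """Apply each truncation stage in order until the estimate fits."""
    index = copy.deepcopy(index)  # leave the caller's data untouched
    for stage in (drop_docs, drop_private, drop_children):
        if estimate_tokens(index) <= max_tokens:
            return index
        stage(index)
    # Last resort: drop files from the end (least important), keeping at least one.
    while len(index) > 1 and estimate_tokens(index) > max_tokens:
        index.pop()
    return index
```

Each stage is coarser than the one before it, so a budget that is only slightly too small costs doc comments and nothing else.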

File importance scoring: entry points (main.rs, lib.rs, index.ts) > root proximity > public declaration count.
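One way to express this ordering is a tuple sort key. The sketch assumes the three criteria are applied as strict tie-breakers in the order listed; indxr may instead combine them into a weighted score. The sample paths and counts are made up.

```python
ENTRY_POINTS = {"main.rs", "lib.rs", "index.ts"}


def importance_key(path, public_decl_count):
    """Sort key: smaller tuples sort first, so the most important file comes first."""
    parts = path.split("/")
    is_entry = parts[-1] in ENTRY_POINTS
    depth = len(parts) - 1  # number of directories between the root and the file
    # Entry points first, then shallower files, then more public declarations.
    return (0 if is_entry else 1, depth, -public_decl_count)


# Hypothetical files mapped to their public declaration counts.
files = {
    "src/parser/rust.rs": 12,
    "src/main.rs": 2,
    "build.rs": 1,
    "src/config.rs": 5,
    "src/model.rs": 9,
}
ranked = sorted(files, key=lambda p: importance_key(p, files[p]))
```

Here `src/main.rs` ranks first despite having few public declarations, and `src/parser/rust.rs` ranks last because it is the deepest file.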

MCP Server

JSON-RPC 2.0 over stdin/stdout, MCP spec 2024-11-05:

indxr serve ./my-project

Tool	Description
`lookup_symbol`	Find declarations by name (case-insensitive substring, default limit 50, max 200)
`list_declarations`	List declarations in a file, optional `kind` filter and `shallow` mode
`search_signatures`	Search signatures by substring (default limit 20, max 100)
`get_tree`	Directory/file tree, optional path prefix filter
`get_imports`	Import statements for a file
`get_stats`	File count, line count, language breakdown, duration
`get_file_summary`	Complete file overview: metadata, declarations, kind counts, public symbols
`read_source`	Read source code by symbol name or line range
`get_file_context`	File summary + reverse dependencies + related files
`regenerate_index`	Re-index codebase and write updated INDEX.md

MCP config:

{
  "mcpServers": {
    "indxr": {
      "command": "indxr",
      "args": ["serve", "/path/to/project"]
    }
  }
}

Setup guides: docs/mcp-server.md
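For clients that talk to the server directly, the messages are ordinary JSON-RPC 2.0 objects, one per line on stdin. The sketch below only builds them. The `initialize` and `tools/call` methods and the `notifications/initialized` notification come from the MCP spec, while the `lookup_symbol` argument names (`name`, `limit`) and the client name are assumptions to be checked against docs/mcp-server.md.

```python
import json


def jsonrpc_request(request_id, method, params):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


# Handshake request defined by the MCP lifecycle.
initialize = jsonrpc_request(1, "initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.0.1"},  # hypothetical client
})

# Notification (no id) sent after the server answers the initialize request.
initialized = json.dumps({"jsonrpc": "2.0", "method": "notifications/initialized"}) + "\n"

# Tool invocation. The argument names are assumptions, not confirmed by this README.
lookup = jsonrpc_request(2, "tools/call", {
    "name": "lookup_symbol",
    "arguments": {"name": "parse", "limit": 50},
})
```

In practice these lines would be written, in order, to the stdin of a process started with `indxr serve ./my-project`, reading one JSON response line from stdout after each request.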

Caching

Incremental binary cache in .indxr-cache/cache.bin. Two-tier validation: mtime + file size (fast path), xxh3 content hash (fallback). The cache format is versioned, so the cache is rebuilt automatically after an indxr upgrade.

indxr --no-cache          # bypass cache
indxr --cache-dir /tmp/c  # custom location
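The two-tier check can be sketched as follows. This is an illustration, not indxr's code: the entry layout is hypothetical, and BLAKE2b from the Python standard library stands in for xxh3, which would need a third-party package.

```python
import hashlib
import os


def content_hash(path):
    # Stand-in hash: indxr uses xxh3; BLAKE2b is used here to stay stdlib-only.
    with open(path, "rb") as f:
        return hashlib.blake2b(f.read(), digest_size=8).hexdigest()


def make_entry(path):
    """Record the metadata and content hash for a freshly parsed file."""
    st = os.stat(path)
    return {"mtime_ns": st.st_mtime_ns, "size": st.st_size, "hash": content_hash(path)}


def is_unchanged(path, entry):
    """Return True if the cached entry is still valid for the file at path."""
    st = os.stat(path)
    # Fast path: identical mtime and size, so the file is not read at all.
    if st.st_mtime_ns == entry["mtime_ns"] and st.st_size == entry["size"]:
        return True
    # Fallback: metadata differs (e.g. after a checkout), so compare content.
    return content_hash(path) == entry["hash"]
```

The fallback matters when a file's mtime changes but its bytes do not, as can happen after `git checkout` or a `touch`. Such a file is still treated as unchanged and is not re-parsed.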

Performance

Parallel parsing via rayon. Incremental caching via mtime + xxh3.

Codebase	Files	Lines	Cold	Cached
Small (indxr)	23	4.6K	17ms	5ms
Medium (atuin)	132	22K	20ms	6ms
Large (cloud-hypervisor)	243	124K	73ms	~10ms

Architecture

1. Walk directory tree (.gitignore-aware, via ignore crate)
2. Detect language by file extension
3. Check cache: skip unchanged files (mtime + xxh3)
4. Parse with tree-sitter or regex (parallel via rayon)
5. Extract declarations, metadata, relationships
6. Apply filters (path, kind, visibility, symbol)
7. Apply token budget (progressive truncation)
8. Format as Markdown, JSON, or YAML
9. Update cache

Documentation

Document	Description
CLI Reference	Complete flag and option reference
Languages	Per-language extraction details
Output Formats	Format and detail level reference
Filtering	Path, kind, symbol, visibility filters
Git Diffing	Structural diff since any git ref
Token Budget	Truncation strategy and scoring
Caching	Cache format and invalidation
MCP Server	MCP tools, protocol, and client setup
Agent Integration	Usage with Claude, Codex, Cursor, Copilot, etc.

License

MIT

indxr 0.1.0