LeIndex
Understand large codebases instantly.
LeIndex is a semantic code search engine that lets you search code by meaning, not just keywords.
Instead of hunting through files with grep or hoping variable names match your query, you can ask things like:
- "Where is authentication enforced?"
- "Where are API tokens validated?"
- "How does session management work?"
LeIndex surfaces the actual implementation — even if the words you're searching for never appear in the code.
Built in Rust. Built for developers and AI coding tools.
Demo: finding logic that grep and LLMs miss
Imagine a codebase where authentication is implemented like this:
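As a hypothetical sketch of such a codebase, the Rust below reuses the function names from the search results in this demo (validate_session, verify_token, authorize_user). The Session type, the signatures, and the function bodies are assumptions made for illustration, not code from a real project.

```rust
use std::collections::HashMap;

/// Hypothetical session record; field names are illustrative only.
pub struct Session {
    pub user_id: u64,
    pub expires_at: u64, // seconds since the Unix epoch
}

/// Returns the session if the id is known and has not expired at `now`.
pub fn validate_session<'a>(
    sessions: &'a HashMap<String, Session>,
    session_id: &str,
    now: u64,
) -> Option<&'a Session> {
    sessions.get(session_id).filter(|s| s.expires_at > now)
}

/// Checks a bearer token against the expected value.
/// (A production verifier would use signed tokens and a constant-time compare.)
pub fn verify_token(token: &str, expected: &str) -> bool {
    !token.is_empty() && token == expected
}

/// Lets the request through only when both the session and the token check out.
pub fn authorize_user(
    sessions: &HashMap<String, Session>,
    session_id: &str,
    token: &str,
    expected_token: &str,
    now: u64,
) -> bool {
    validate_session(sessions, session_id, now).is_some() && verify_token(token, expected_token)
}
```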
None of these functions contain the word "authentication".
grep
# (no matches)
LeIndex
src/security/session_validator.rs validate_session (0.92)
src/auth/token_verifier.rs verify_token (0.87)
src/middleware/auth_gate.rs authorize_user (0.84)
LeIndex finds the correct logic because it searches by semantic intent, not string matches.
It works across multiple repositories too:
gateway/middleware/rate_limit.rs throttle_request (0.91)
api/server/request_throttle.go limit_handler (0.88)
auth/session_policy.rs enforce_policy (0.83)
81-97% Token Savings for AI Coding Tools
When an LLM reads your code with standard tools, it burns tokens on entire files just to understand one function. LeIndex returns only what matters — structured, context-aware results instead of raw file dumps.
| Task | Standard Tools | LeIndex | Savings |
|---|---|---|---|
| Understand a 500-line file | ~2,000 tokens | ~380 tokens | 81% |
| Find all callers of a function | ~5,800 tokens | ~420 tokens | 93% |
| Navigate project structure | ~8,500 tokens | ~650 tokens | 92% |
| Cross-file symbol rename | ~12,000 tokens | ~340 tokens | 97% |
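The Savings column appears to be the fraction of tokens avoided, rounded to the nearest whole percent. A minimal sketch of that arithmetic follows; the function name savings_percent is an assumption introduced here, and the token counts are the approximate figures from the table.

```rust
/// Percentage of tokens saved, rounded to the nearest whole percent.
/// `baseline` is the standard-tools token count; `optimized` is the LeIndex count.
pub fn savings_percent(baseline: u32, optimized: u32) -> u32 {
    assert!(baseline > 0, "baseline must be non-zero");
    let saved = baseline.saturating_sub(optimized) as f64;
    (saved / baseline as f64 * 100.0).round() as u32
}
```

For example, savings_percent(5800, 420) gives 93, matching the "Find all callers of a function" row.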
Every tool call is context-aware — not atomic. When you look up a symbol, you don't just get its definition. You get its callers, callees, data dependencies, and impact radius. When you summarize a file, you get cross-file relationships that Read can never provide at any token cost. One LeIndex call replaces chains of Grep → Read → Read → Read.
See full benchmarks for methodology and detailed comparisons.
Quick Start (2 minutes)
Install
Via cargo (recommended):
Via install script:
The install script builds and installs both leindex and leindex-embed (ONNX worker), plus bundled model assets.
Via PyPI bootstrap wrapper:
The PyPI package installs a small Python launcher. On first run it installs or updates
the real Rust leindex binary in ~/.cargo/bin via cargo install leindex. If Cargo
is missing, the launcher explains the requirement and prompts to install Rust/Cargo when
automatic setup is supported on the current platform.
Via npm MCP wrapper (recommended for AI tools):
The npm package downloads a platform-specific bundle containing the main binary, the ONNX worker (leindex-embed), and model assets.
Environment Variables:
| Name | Required | Description | Default |
|---|---|---|---|
| LEINDEX_HOME | No | Override storage/index home directory | ~/.leindex |
| LEINDEX_PORT | No | Override HTTP server port | 47268 |
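The override-or-default behavior in the table could be sketched as below. This is not LeIndex's actual resolution code: the function names are hypothetical, and falling back to the default on an empty or unparseable value is an assumption, since the table only documents the defaults.

```rust
use std::path::PathBuf;

/// Default HTTP port from the table above.
pub const DEFAULT_PORT: u16 = 47268;

/// Resolves the storage directory: a non-empty override wins,
/// otherwise `<user_home>/.leindex`.
pub fn resolve_home(override_dir: Option<&str>, user_home: &str) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(user_home).join(".leindex"),
    }
}

/// Resolves the HTTP port: a parseable override wins, otherwise the default.
/// (Assumption: invalid values fall back silently instead of erroring.)
pub fn resolve_port(override_port: Option<&str>) -> u16 {
    override_port
        .and_then(|p| p.trim().parse::<u16>().ok())
        .unwrap_or(DEFAULT_PORT)
}
```

In a real program the overrides would come from std::env::var("LEINDEX_HOME").ok() and std::env::var("LEINDEX_PORT").ok(), passed in with as_deref().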
Index and search
# Index your project
# Search by meaning
# Deep structural analysis
That's it. You're searching by meaning.
What LeIndex Is Useful For
- Understanding unfamiliar codebases — ask questions instead of reading every file
- Onboarding new engineers — find relevant code without tribal knowledge
- Exploring legacy systems — surface logic buried in decades of code
- AI coding assistants — give LLMs real structural context via MCP
- Cross-project search — query across multiple repositories simultaneously
Built for AI-Assisted Development
Modern AI coding tools struggle with large codebases because they lack global structural context.
LeIndex provides that missing layer.
It builds a semantic index of your repository that both developers and AI assistants can query to understand:
- where logic lives
- how components interact
- what code paths enforce behavior
LeIndex runs as an MCP server, allowing tools like Claude Code, Cursor, and other MCP-compatible agents to explore your codebase with semantic understanding.
# Start MCP stdio mode (for Claude Code / Cursor)
# Or run the HTTP MCP server
Claude: "Where is request validation implemented?"
LeIndex MCP → src/http/request_validator.rs
src/middleware/input_guard.rs
How It Works
LeIndex builds a semantic index of your codebase using embeddings and structural analysis (tree-sitter parsing + program dependence graphs).
This allows queries to match:
- code intent — what the code does, not what it's named
- related logic paths — follow data flow and control flow
- implementation patterns — structural similarity across files
Indexes can span multiple repositories, enabling cross-project search.
Codebase → Tree-sitter Parser → PDG Builder → Semantic Index → Query Engine → Results
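To make the query end of that pipeline concrete, here is a minimal sketch of embedding-based ranking with cosine similarity. It is a generic illustration, not LeIndex's query engine: the function names, the tiny two-dimensional vectors in the usage note, and the use of plain cosine scoring are all assumptions, and the real scoring also draws on structural (PDG) signals.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Ranks (symbol, embedding) pairs by similarity to the query, best match first.
pub fn rank<'a>(query: &[f32], index: &'a [(&'a str, Vec<f32>)]) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = index
        .iter()
        .map(|(name, emb)| (*name, cosine(query, emb)))
        .collect();
    // Sort descending by score; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Calling rank with a query vector and a slice of (symbol name, embedding) pairs returns the symbols ordered by score, which is the shape of the result lists shown in the demo. Because the match is made in vector space, a symbol can rank first without sharing any words with the query.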
Features
- Semantic search — find code by meaning, not keywords
- PDG analysis — program dependence graph for structural understanding
- 5-phase analysis — additive multi-pass codebase analysis pipeline
- Cross-project indexing — search across multiple repos at once
- 16 MCP tools — read, analyze, edit preview/apply, rename, impact analysis
- HTTP + WebSocket server — available through the unified leindex server modules and commands
- Dashboard — Bun + React operational UI with project metrics and graph telemetry
- Low resource mode — works on constrained hardware
- Built in Rust — fast indexing, low memory, safe concurrency
- Flexible embedding backends — choose between TF-IDF, local ONNX models, or remote cloud providers (OpenAI, Cohere)
Other Install Options
crates.io
PyPI
This package is a bootstrap wrapper for the Rust release. It keeps using the unified
leindex command, installs the binary into ~/.cargo/bin, and then forwards all CLI
arguments to the real Rust executable.
From source
This produces both target/release/leindex (main binary) and target/release/leindex-embed (ONNX worker). The worker must be discoverable alongside the main binary or in PATH for local ONNX inference.
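The lookup order described above (next to the main binary first, then PATH) might look like the following sketch. The function find_worker and its injected exists predicate are hypothetical and only illustrate the search order; this is not the discovery code LeIndex ships.

```rust
use std::path::{Path, PathBuf};

/// Looks for `worker_name` next to the main binary first, then in each PATH entry.
/// `exists` is injected so the lookup can be exercised without touching the disk.
pub fn find_worker(
    exe_dir: &Path,
    path_dirs: &[PathBuf],
    worker_name: &str,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    std::iter::once(exe_dir.to_path_buf())
        .chain(path_dirs.iter().cloned())
        .map(|dir| dir.join(worker_name))
        .find(|candidate| exists(candidate.as_path()))
}
```

In a real program, exe_dir would come from std::env::current_exe(), path_dirs from std::env::split_paths over the PATH variable, and exists would be a closure calling Path::is_file. On Windows the worker name would also need the .exe suffix.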
Feature flags: Use --features to customize the build:
- full (default) — Full library plus the leindex CLI binary
- minimal — Library-focused parse/search build slice; does not produce the leindex binary by itself
- cli — Required feature for the leindex binary target
- server — Enables the HTTP/WebSocket server library modules; combine with cli for a runnable binary
MCP Server Integration
For AI coding tools, the recommended integration path is the npm MCP wrapper so the client resolves the published MCP entrypoint directly:
If you intentionally installed the full Rust binary via cargo install leindex,
install.sh, or the PyPI bootstrapper, you can replace npx -y @leindex/mcp
with leindex mcp.
Every MCP tool is also available from the CLI bridge:
Add to ~/.config/zed/settings.json:
Add to Cursor settings (settings.json):
Requires the Model Context Protocol extension.
Configure in settings.json:
Add to ~/.claude/settings.json or project-local .claude/settings.json:
Optional guidance pack:
- Install the shared skill from integrations/skills/leindex-toolkit/ into ~/.claude/skills/leindex-toolkit/
- Merge integrations/claude-code/settings.example.json to add the LeIndex reminder hook
Add to ~/.config/amp/settings.json:
Add to ~/.config/opencode/opencode.json:
Add to ~/.qwen/settings.json:
Add to ~/.iflow/settings.json:
Add to ~/.factory/mcp.json (note: requires type: "stdio"):
Add to ~/.gemini/settings.json:
Agent guidance packs:
- Claude Code: shared skill plus reminder hook
- Codex: install integrations/skills/leindex-toolkit/ into ~/.codex/skills/leindex-toolkit/
- Gemini CLI, Amp, OpenCode, Qwen, and iFlow: reuse the shared skill text as project instructions or agent rules
- Full instructions: docs/AGENT_GUIDANCE.md
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Dashboard (optional)
Memory Measurement and Profiling
Plan 0 adds a lightweight memory measurement foundation so you can track LeIndex's RSS behavior without wiring up custom scripts.
- cargo xtask memcheck builds the release binary when needed, runs the canonical small_repo workload, compares the results against committed baselines and budget ceilings, and exits non-zero on regressions.
- The Linux CI workflow in .github/workflows/memory-budget.yml runs the same memcheck path and uploads the report artifact so baseline and budget enforcement stay consistent in automation.
- --memory-report PATH and LEINDEX_MEMORY_REPORT=PATH opt into a compact shutdown JSON with peak RSS and phase summaries; they stay off by default for normal runs.
- Build with --features memprof to enable the optional heap profiling surface for deeper memory investigations when the lightweight report is not enough.
CLI Reference
Output Behavior
LeIndex is designed for token-efficient operation when used with AI coding tools.
Clean Terminal Output
- Default log level: WARN — Routine operational chatter (storage paths, cache hits, PDG node counts, indexing progress) is suppressed. Only warnings and errors are shown.
- Enable verbose diagnostics: pass --verbose or set RUST_LOG=debug to see full DEBUG-level output for troubleshooting.
This keeps the terminal clean and minimizes token usage when LeIndex runs as a subprocess (e.g., via MCP stdio).
Structured MCP Responses
MCP responses are framed and structured — transport-level errors (connection drops, protocol issues) never leak into the JSON-RPC response stream. The leindex mcp stdio mode produces clean, parseable JSON-RPC responses suitable for LLM consumption.
Winit Event-Loop Coverage
leindex analyze and leindex context can expand results on demand even when symbol names differ from the query terms. If an exact lookup fails, LeIndex performs a fuzzy scan of the project's PDG to discover event-loop-heavy entrypoints (e.g., run_event_loop, EventLoop::run, main) using case-insensitive substring matching with complexity-aware scoring. This keeps framework-heavy codebases discoverable without requiring exact symbol names.
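The fallback described above can be illustrated with a small sketch: filter nodes by case-insensitive substring match, then order the hits by a complexity estimate. The Node type, the fuzzy_entrypoints function, and the choice to sort purely by descending complexity are assumptions for illustration; the actual scoring used by LeIndex is not specified here.

```rust
/// Hypothetical PDG node summary: a symbol name plus a complexity estimate.
pub struct Node {
    pub name: String,
    pub complexity: u32,
}

/// Returns nodes whose name contains any needle (case-insensitive),
/// ordered by descending complexity, then by name for stable output.
pub fn fuzzy_entrypoints<'a>(nodes: &'a [Node], needles: &[&str]) -> Vec<&'a Node> {
    let needles: Vec<String> = needles.iter().map(|n| n.to_lowercase()).collect();
    let mut hits: Vec<&Node> = nodes
        .iter()
        .filter(|node| {
            let name = node.name.to_lowercase();
            needles.iter().any(|n| name.contains(n.as_str()))
        })
        .collect();
    hits.sort_by(|a, b| {
        b.complexity
            .cmp(&a.complexity)
            .then_with(|| a.name.cmp(&b.name))
    });
    hits
}
```

With needles such as "eventloop", "event_loop", and "main", both EventLoop::run and run_event_loop are found even though neither matches the other's spelling exactly.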
Embedding Configuration
LeIndex supports multiple embedding backends for semantic search:
Local ONNX Models (default)
Build with the default features to use local Qwen3 embedding models via ONNX Runtime. LeIndex uses a worker-sidecar architecture — the main leindex process delegates ONNX inference to a separate leindex-embed worker process, keeping the main daemon lightweight.
Local models provide:
- Privacy (data never leaves your machine)
- No API costs
- Zero network latency
- Support for Qwen3-Embedding-0.6B and optional Qwen3-Reranker-0.6B
- Worker-sidecar ONNX inference keeps main process memory low
The worker binary (leindex-embed) is built alongside the main binary and is discovered automatically at runtime. Bundled model assets are shipped in the models/ directory next to the binaries.
Remote Cloud Providers
Build with the remote-embeddings feature to use cloud-based embedding services:
Supported providers:
- OpenAI (text-embedding-3-small, text-embedding-3-large)
- Cohere (embed-english-v3.0, embed-multilingual-v3.0)
- Custom (any OpenAI-compatible endpoint)
Configure via environment variables:
# OpenAI
# LeIndex will automatically use OpenAI embeddings
# Cohere
# LeIndex will automatically use Cohere embeddings
# Custom provider
Remote embeddings offer:
- Higher accuracy with state-of-the-art models
- No local resource requirements
- Automatic model updates
- Multi-language support (Cohere)
Note: Remote embeddings require network access and API keys from your provider.
TF-IDF Fallback
If no embedding backend is configured, LeIndex falls back to TF-IDF for keyword-based search. This is lightweight and works offline but lacks semantic understanding.
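For readers unfamiliar with TF-IDF, the sketch below scores documents against a query using term frequency times a smoothed inverse document frequency. It is a textbook-style illustration under assumed names (tokenize, tfidf_scores) and an assumed IDF formula, not the fallback implementation inside LeIndex.

```rust
use std::collections::{HashMap, HashSet};

/// Lowercases and splits on non-alphanumeric characters,
/// so `verify_token` becomes ["verify", "token"].
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Scores each document against the query with a basic TF-IDF sum.
/// Returns one score per document, in input order.
pub fn tfidf_scores(docs: &[&str], query: &str) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    // Document frequency: in how many documents each term appears.
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let unique: HashSet<&str> = doc.iter().map(String::as_str).collect();
        for term in unique {
            *df.entry(term).or_insert(0) += 1;
        }
    }
    let n = docs.len() as f64;
    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            query_terms
                .iter()
                .map(|q| {
                    let tf = doc.iter().filter(|t| *t == q).count() as f64;
                    // Terms absent from every document contribute nothing.
                    let idf = match df.get(q.as_str()) {
                        Some(&d) => (n / d as f64).ln() + 1.0,
                        None => 0.0,
                    };
                    tf * idf
                })
                .sum::<f64>()
        })
        .collect()
}
```

A query term that never appears in the indexed text scores zero for every document, which is the limitation noted above: keyword scoring cannot match by meaning.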
MCP Tools (16)
| Tool | Purpose |
|---|---|
| leindex_index | Index a project |
| leindex_search | Semantic code search |
| leindex_deep_analyze | Deep analysis with PDG traversal |
| leindex_context | Expand context around a symbol |
| leindex_phase_analysis | 5-phase additive analysis |
| leindex_file_summary | Structural file analysis |
| leindex_symbol_lookup | Symbol definition + callers/callees |
| leindex_project_map | Annotated project structure |
| leindex_grep_symbols | Structural symbol search |
| leindex_read_symbol | Read symbol source with deps |
| leindex_edit_preview | Preview edits with impact report |
| leindex_edit_apply | Apply code edits |
| leindex_rename_symbol | Rename across all references |
| leindex_impact_analysis | Blast radius analysis |
| leindex_diagnostics | Index health and stats |
| phase_analysis | Alias for phase analysis |
Unified Module Layout
LeIndex is now a single crate with feature-gated modules:
| Module | Role |
|---|---|
| parse | Language parsing and signature extraction |
| graph | Graph construction and traversal |
| search | Retrieval, scoring, vector search |
| storage | SQLite persistence + storage |
| phase | Additive phase analysis pipeline |
| cli | CLI + MCP protocol handlers |
| global | Cross-project discovery/registry |
| server | HTTP/WebSocket API server |
| edit | Edit preview/apply support |
| validation | Validation and guardrails |
Legacy crate-style aliases remain available from leindex::leparse, leindex::legraphe, and similar compatibility re-exports.
Security
Database discovery (LEINDEX_DISCOVERY_ROOTS) is opt-in only. Sensitive directories (.ssh, .aws, .gnupg, etc.) are automatically excluded. All SQL operations use parameterized queries. See ARCHITECTURE.md for details.
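As an illustration of the directory exclusion mentioned above, a path filter could check each path component against a deny list. This is a sketch under assumptions: the is_sensitive function is hypothetical, and the list contains only the three names given above, while the real list is longer (the "etc.") and is documented in ARCHITECTURE.md.

```rust
use std::path::Path;

/// Directory names treated as sensitive in this sketch.
/// Only the three names stated in the text are included.
const SENSITIVE_DIRS: [&str; 3] = [".ssh", ".aws", ".gnupg"];

/// True if any component of `path` is one of the sensitive directory names.
pub fn is_sensitive(path: &Path) -> bool {
    path.components().any(|c| {
        c.as_os_str()
            .to_str()
            .map_or(false, |name| SENSITIVE_DIRS.contains(&name))
    })
}
```

Matching whole components rather than substrings means a file such as ssh_notes.txt is not excluded by accident.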
Docs
- ARCHITECTURE.md — system design and internals
- API.md — HTTP API reference
- docs/MCP.md — MCP server documentation
- dashboard/README.md — dashboard setup
License
MIT