trusty-search 0.20.4

Machine-wide hybrid code search service: BM25 + vector + KG, zero cold-start, MCP server
Documentation

trusty-search

CI crates.io License: ELv2

Machine-wide, blazingly fast hybrid code search service. One install per machine, one always-on daemon, unlimited named indexes.

📚 Documentation

Full documentation lives at the workspace top level in docs/trusty-search/: the research and regression-testing indexes, engineering sessions, and the example multi-index config. This README and the rustdoc stay in-crate; everything else lives under docs/.

System requirements

  • Rust 1.75+ (for source builds)
  • 16 GB RAM minimum (default) — hard-checked at daemon startup. The daemon exits with an actionable error message on under-spec hosts. Set TRUSTY_SKIP_RAM_CHECK=1 in the daemon environment to bypass this check for small workloads where peak RAM is known to stay well under the memory limit. Bypass at your own risk on large corpora — the default exists because realistic indexing OOMs without it.
  • macOS 12+ or Linux (Windows: not yet supported)
  • ~2 GB disk for model cache (downloaded on first run to ~/Library/Caches/trusty-search/ on macOS or $XDG_DATA_HOME/trusty-search/ on Linux)

Install

From crates.io (recommended)

cargo install trusty-search

From source

git clone https://github.com/bobmatnyc/trusty-search
cd trusty-search
cargo install --path . --locked

Apple Silicon

CoreML GPU acceleration is enabled automatically on M1/M2/M3/M4. No flags or extra installs are needed. The startup log confirms the active provider:

embedder initialized: model=AllMiniLML6V2(Q) dim=384 provider=CoreML (Metal GPU / ANE)

NVIDIA GPU (CUDA)

cargo install trusty-search --features cuda

Requires CUDA toolkit installed on the host. See CLAUDE.md for ORT_DYLIB_PATH setup on Amazon Linux 2023 and other glibc 2.34 hosts.

Quick start

The following five steps take you from zero to a running search in under five minutes.

Step 1 — Start the daemon

trusty-search start

Expected output:

trusty-search daemon starting on http://127.0.0.1:<port>
embedder initialized: model=AllMiniLML6V2(Q) dim=384 provider=CoreML (Metal GPU / ANE)
daemon ready

The daemon auto-selects a free port and writes it to ~/Library/Application Support/trusty-search/port.lock.

Step 2 — Index a project

trusty-search index ~/Projects/myproj --name myproj

Expected output:

Registered index "myproj" at /Users/me/Projects/myproj
⟳ Indexing myproj [████████░░] 1204/1520 files — 12s remaining
✓ Indexed 14 823 chunks in 142s

Re-running is safe — unchanged files are skipped via content fingerprints. Use --force to rebuild from scratch.

Step 3 — Run a search

trusty-search query "fn authenticate" --index myproj

Expected output:

1. src/auth.rs:42 — authenticate (hybrid+kg, score=0.018)
   fn authenticate(ctx: &Context) -> Result<Token> {
2. src/middleware.rs:17 — verify_token (hybrid, score=0.011)
   ...

Add --json for machine-readable output.

Step 4 — Open the admin UI

trusty-search ui

Opens http://127.0.0.1:<port>/ui in your browser. The UI provides search, index management, and an OpenRouter-backed chat panel (requires OPENROUTER_API_KEY).

Step 5 — Check status at any time

trusty-search status   # daemon version, port, per-index chunk counts
trusty-search doctor   # 6-check diagnostic; add --fix to auto-repair

Using with Claude Code

Add trusty-search as an MCP server in your Claude Code config (~/.claude/claude_desktop_config.json or via claude mcp add):

stdio (recommended)

{
  "mcpServers": {
    "trusty-search": {
      "command": "trusty-search",
      "args": ["serve"]
    }
  }
}

HTTP/SSE

trusty-search serve --http 127.0.0.1:7879

Then add http://127.0.0.1:7879/sse as an SSE MCP endpoint in your Claude Code config.

Once connected, Claude Code can call search, index_file, list_indexes, and 15 other tools directly (18 total). The daemon must be running independently (trusty-search start) before Claude Code connects.

Features

  • Machine-wide daemon — single install (cargo install trusty-search), one process, unlimited registered indexes via DashMap<IndexId, IndexHandle>
  • Hybrid search — BM25 (lexical, zero-dep port with camelCase / snake_case splitting) + HNSW vector (usearch 2.25, all-MiniLM-L6-v2 INT8) + Knowledge Graph 1–2 hop expansion, fused via Reciprocal Rank Fusion (k = 60, always-on)
  • Query intent routing — sub-ms regex classifier routes every query to one of 5 intents and adjusts α / β weights and KG gating per query
  • Branch-aware search — pass branch_files (or just branch: "feature/foo") to POST /indexes/:id/search; chunks from your branch get a configurable score boost (default 1.5×) and every result carries on_branch: bool
  • KG symbol graph — petgraph-backed SymbolGraph derived from tree-sitter parses, with EdgeKind (CALLS / IMPORTS / INHERITS / CONTAINS) score multipliers; KG expansion is intent-gated (Usage only)
  • Auto-tuned memory tiers — 5 tiers (Tiny / Small / Medium / Large / XLarge) from < 8 GB up to 64+ GB; chunk caps, batch sizes, cache sizes, and BM25 / KG limits computed at daemon startup from detected RAM
  • macOS CoreML auto-detection — on Apple Silicon the ONNX session registers the CoreML execution provider automatically (no --features flag needed since v0.3.13)
  • Multi-index repo support — drop a trusty-search.yaml at the repo root to define per-directory named indexes; trusty-search index reads it automatically (see docs/trusty-search/examples/trusty-search.yaml)
  • Incremental reindex — sha2 content fingerprints skip unchanged files across daemon restarts; --force triggers a full rebuild
  • Zero cold-start queries — HNSW kept hot (Duration::MAX cool-after), LRU embedding cache (256+ entries) skips re-embedding on repeat queries
  • Native multi-requestArc<SearchAppState>, reader-priority RwLock, axum HTTP/2 — many concurrent searches against the same index never block
  • MCP server — stdio + HTTP/SSE transports, 18 tools (per src/mcp/tools.rs), drop-in for Claude Code
  • Embedded Svelte 5 admin UI — Collections, Search, Chat, Admin panels compiled into the binary via include_dir!; open with trusty-search ui
  • Migration pathtrusty-search convert reads mcp-vector-search configs and re-registers each project as a named index

Code quality analysis: Complexity hotspots, smell detection, and quality grades have moved to trusty-analyze. The complexity_hotspots, smells, and quality HTTP endpoints are not served from this binary as of v0.2.0.

Stage 1 IS a daemonized ripgrep

A lexical_only index skips embedding entirely. You get BM25 ranking plus grep-speed pattern matching via a persistent HTTP daemon — no ONNX, no GPU, no model download.

Certified performance on a 1,155-file Rust workspace (trusty-tools, May 2026):

Metric Value
Reindex time 5.3 s (5,289 ms)
Throughput 4,445 chunks/sec
Peak daemon RSS 698 MB
/grep P50 latency 8 ms (vs ripgrep 9 ms — parity)

Full measurement details: docs/trusty-search/regression-testing/v0.14.0-stage1-cert-2026-05-27.md

When to use lexical-only: when you want a daemonized BM25 + ripgrep with HTTP/MCP integration but do not need semantic similarity queries. Reindex is 63× faster than a full hybrid reindex (no embedding), and the daemon fits comfortably in 700 MB.

How to enable — pass lexical_only: true in the index create payload:

curl -s -X POST http://127.0.0.1:7878/indexes \
    -H 'Content-Type: application/json' \
    -d '{"id":"myproject","root_path":"/path/to/project","lexical_only":true}'

Or use the --lexical-only flag with the CLI:

trusty-search index /path/to/project --name myproject --lexical-only

Skip-KG mode (--no-kg) — issue #313

A skip_kg index runs Stages 1 and 2 (BM25 + vector embed) normally but permanently skips the Phase 3 Knowledge Graph rebuild (tree-sitter symbol extraction + petgraph construction). Useful for large documentation-heavy or generated-code sub-indexes in polyrepos where call-chain navigation is never needed.

Savings per index: ~50–100 MB heap (symbol graph not allocated), ~400 ms per reindex (tree-sitter extraction pass skipped).

503 contract: GET /indexes/:id/call_chain returns a structured 503 error when skip_kg=true:

{ "error": "kg_unavailable", "reason": "skipped_by_config", "index": "myproject" }

Callers must handle 503 and not assume 404 (index absent).

Three ways to enable:

CLI (--no-kg — orthogonal to --lexical-only):

trusty-search index /path/to/project --name myproject --no-kg

YAML (trusty-search.yaml):

version: 1
indexes:
  - name: docs
    paths: [docs/]
    skip_kg: true

HTTP API:

curl -s -X POST http://127.0.0.1:7878/indexes \
    -H 'Content-Type: application/json' \
    -d '{"id":"myproject","root_path":"/path/to/project","skip_kg":true}'

Machine-wide default (TRUSTY_NO_KG=1 env var applies to every new index):

export TRUSTY_NO_KG=1
trusty-search index /path/to/project --name myproject

skip_kg and lexical_only are orthogonal (D1) — setting both suppresses both the embedder (Stage 2) and the KG rebuild (Stage 3), leaving only BM25.

Memory tiers (auto-tuned at startup)

MEMORY_LIMIT_MB is computed dynamically as 25% of detected system RAM, clamped to 1–64 GB. It is not a fixed tier value. The env var TRUSTY_MEMORY_LIMIT_MB overrides it. All other limits below are tier-based.

Tier Total RAM MEMORY_LIMIT_MB MAX_CHUNKS EMBEDDING_CACHE MAX_BATCH_SIZE BM25_CORPUS_CAP MAX_KG_NODES
Tiny < 8 GB 25% of RAM (≥ 1 GB) 50 000 500 64 20 000 30 000
Small 8–15 GB 25% of RAM 100 000 1 000 128 50 000 75 000
Medium 16–31 GB 25% of RAM 200 000 5 000 256 100 000 150 000
Large 32–63 GB 25% of RAM 400 000 10 000 512 200 000 300 000
XLarge ≥ 64 GB 25% of RAM (≤ 64 GB) 800 000 20 000 512 400 000 500 000

Env vars (TRUSTY_MAX_CHUNKS, TRUSTY_EMBEDDING_CACHE, TRUSTY_MAX_BATCH_SIZE, TRUSTY_BM25_CORPUS_CAP, TRUSTY_MAX_KG_NODES, TRUSTY_MEMORY_LIMIT_MB, TRUSTY_COREML_BATCH_SIZE, TRUSTY_COREML_TRIPWIRE_MB) always override the tier default. Precedence: shell env > daemon.env > tier default. The resolved tier and all limits are logged at daemon startup.

Apple Silicon CoreML batch sizing

On Apple Silicon (M1–M4), the ONNX Runtime CoreML execution provider batches are optimised separately from CPU and GPU tiers:

  • DEFAULT_COREML_BATCH_SIZE = 32 — optimal for Apple Neural Engine (ANE). Benchmark results on a 19k-chunk corpus show that larger batches (64, 128) consume 7–10% more time and 1.2–9.7 GB additional peak RSS with zero throughput gain. The ANE has a fixed dispatch budget; batch size scales unified-memory allocation but not per-call throughput.
  • TRUSTY_COREML_TRIPWIRE_MB = 4096 — safety net for RSS spikes. If a single CoreML embedding batch increases RSS by >4 GB, the batch size is automatically halved (floor: 1) and a warning is logged. Fires once per reindex. Override with TRUSTY_COREML_TRIPWIRE_MB env var if your host has different memory pressure characteristics.
  • Non-fatal RSS probes: failure to read /proc/self/status returns 0, disabling the tripwire gracefully rather than crashing.

Query intent → routing weights

Intent α (vector) β (BM25) KG-first
Definition 0.3 0.7 false
Usage 0.5 0.5 true
Conceptual 0.8 0.2 false
BugDebt 0.1 0.9 false
Unknown 0.6 0.4 false

The classifier is a sub-ms regex over the query text. KG expansion is gated to Usage intent only — caller/callee chains are scored at 70% of the trigger chunk's RRF score.

CLI

trusty-search start                                  # start HTTP daemon (background)
trusty-search start --data-dir <PATH>                # start with custom data dir (TRUSTY_DATA_DIR)
                                                     # enables isolated daemon instances; each instance
                                                     # gets its own data dir, port, and index registry
trusty-search start --no-auto-discover               # skip startup auto-discovery scan
                                                     # (also: TRUSTY_NO_AUTO_DISCOVER=1)
                                                     # daemon serves only already-registered indexes
trusty-search stop                                   # stop daemon (SIGTERM via PID lockfile)
trusty-search index [path] [--name <id>] [--force]   # register + index (primary command)
                                                     # auto-detects ./trusty-search.yaml
trusty-search query <text> [--index <id>] [--top-k N] [--json]
trusty-search status                                 # daemon + index overview (alias: health)
trusty-search doctor [--fix]                         # 6-check diagnostic + auto-repair
trusty-search ui [--port N]                          # open web management UI in browser
trusty-search convert project|all [--dry-run]        # migrate from mcp-vector-search
trusty-search serve [--http <addr>]                  # MCP stdio (default) or HTTP/SSE
# Aliases preserved for backward compatibility:
trusty-search init [path]                            # alias for index
trusty-search reindex [path]                         # alias for index --force

MCP tools

The MCP server registers 18 tools (authoritative source: src/mcp/tools.rs tool_definitions):

Tool Description
search Hybrid search (BM25 + HNSW + KG, RRF-fused)
search_kg KG-first graph-walk search; accepts optional refine_query (see below)
search_semantic Vector-only semantic search lane
search_lexical BM25/token lexical search lane
search_all Fan-out search across every registered index
search_similar Code-to-code similarity from a seed file/function
index_file Add or replace a single file in the index
remove_file Remove a file and all its chunks
list_indexes Enumerate all registered indexes
create_index Register a new (empty) index
delete_index Drop an index from the registry
reindex Fire-and-forget full reindex (SSE progress)
index_status Per-index stats including walk diagnostics (see below)
list_chunks Paginated enumeration of chunks (file, start_line)
get_call_chain KG caller/callee chain for a symbol
grep Literal/regex grep fallback over the corpus
search_health Daemon liveness probe
chat OpenRouter Q&A with auto-injected search context

search_kgrefine_query parameter (issue #147)

search_kg performs a graph-walk expanding the KG neighbourhood of each top hit. When the seed chunk is a weak or wrong match, the unfiltered neighbourhood can compound the error with unrelated results.

Pass an optional refine_query string to describe the target concept in natural language. The daemon embeds both the refine_query and every KG-expanded neighbour, then discards neighbours whose cosine similarity against refine_query is below 0.4. Surviving neighbours are re-ranked by cosine score so the strongest semantic match appears first. Seeds from the primary fused list are never filtered.

{
  "tool": "search_kg",
  "index_id": "myproj",
  "query": "authenticate",
  "refine_query": "JWT token validation and expiry checking"
}

When refine_query is absent the behaviour is identical to the previous version (fully backward-compatible).

index_status — walk diagnostic fields (issue #280)

GET /indexes/:id/status (and the index_status MCP tool) now include four fields that let operators diagnose why a reindex produced zero chunks:

Field Type Description
last_walk_started_at string | null RFC 3339 timestamp of the most recent walk start
last_walk_files_seen number Files discovered by the walk (after gitignore/extension filtering)
last_walk_files_skipped number Directories skipped (gitignore, build artefacts, etc.)
last_walk_error string | null Set when the walk found zero indexable files; describes probable cause

These fields are populated every time a reindex task runs. On a healthy index with chunks you will see last_walk_error: null and last_walk_files_seen > 0.

Stack

Component Choice
Language Rust 2021
Async runtime tokio (full features)
HTTP axum 0.7 + tower-http (CORS, trace, gzip), HTTP/2
Vector store usearch 2.25 (HNSW, in-memory, Arc<RwLock<>>)
Embeddings fastembed 5.x (ONNX, all-MiniLM-L6-v2 INT8, 384-dim)
Lexical Custom BM25 (zero-dep port, camelCase splitting)
KV store redb 2.6
Knowledge graph petgraph 0.6 (SymbolGraph)
File watching notify 6 + notify-debouncer-mini 0.4 (500 ms)
Code parsing tree-sitter 0.26 (14 grammars)
Concurrency dashmap 5, lru 0.12, rayon 1
HTTP client reqwest 0.12 (rustls-tls)
CLI clap 4 (derive)
UI Svelte 5, embedded via include_dir!
Hashing sha2 (incremental reindex fingerprints)

Troubleshooting

Daemon won't start

Run trusty-search doctor for a 6-check diagnostic. Common causes:

  • Another daemon already running: trusty-search stop then trusty-search start
  • Stale PID lockfile: trusty-search doctor --fix removes it automatically
  • Less than 16 GB RAM: the daemon performs a hard RAM check and exits with an actionable error. Set TRUSTY_SKIP_RAM_CHECK=1 in the daemon environment to bypass for small workloads; not recommended on large corpora (risk of OOM during indexing)

Embedder stuck on "initializing"

The ONNX Runtime initializes the model on first start and may take 30–60 seconds on slower machines. If it hangs indefinitely, increase the timeout:

TRUSTY_EMBEDDER_INIT_TIMEOUT_SECS=120 trusty-search start

High memory usage during reindex

The daemon has a soft RSS ceiling (TRUSTY_MEMORY_LIMIT_MB). When hit, it skips remaining batches and logs a warning. Already-committed chunks stay searchable. To lower pressure:

TRUSTY_MEMORY_LIMIT_MB=2048 trusty-search start

Or wait for the soft cap to trip — the partial index is usable immediately.

Reindex produced zero chunks

If index_status shows chunk_count: 0 after a reindex, check the walk diagnostic fields:

# Via CLI (pipe through jq if available)
trusty-search status --index myproj

# Via HTTP
curl http://127.0.0.1:<port>/indexes/myproj/status | jq .

Look for last_walk_error. Common causes and fixes:

last_walk_error message Cause Fix
root path does not exist: /… Index was registered with a path that no longer exists Re-register with the correct path: trusty-search index /new/path --name myproj
walk produced zero files … check gitignore rules All discovered files were excluded by .gitignore, extension allow-list, or path_filter Check .gitignore for overly broad rules; ensure at least one supported extension (.rs, .py, .ts, etc.) exists under the root path

If last_walk_error is null but chunk_count is still 0, the walk found files but the chunker produced no output — this usually means all files are binary or exceed the size limit. Check RUST_LOG=debug trusty-search start for per-file warnings.

Port conflict

The daemon auto-selects a free port on each start. The live port is written to:

  • macOS: ~/Library/Application Support/trusty-search/port.lock
  • Linux: $XDG_DATA_HOME/trusty-search/port.lock

If trusty-search status reports the wrong port, stop and restart the daemon.

Device flag not persisting across restarts

Use trusty-search start --device cpu to force CPU mode. The flag is persisted to daemon.env so it survives daemon restarts.

Architecture and HTTP API

See CLAUDE.md for the full HTTP endpoint catalogue, query pipeline, multi-request design, memory tuning reference, and release process.

Documentation

License

Elastic License 2.0 (ELv2) — free for internal use; you may not provide trusty-search as a hosted or managed service to third parties without a commercial agreement. See LICENSE for the full terms.