trusty-search 0.20.4

Machine-wide hybrid code search service: BM25 + vector + KG, zero cold-start, MCP server
# trusty-search

[![CI](https://github.com/bobmatnyc/trusty-search/actions/workflows/ci.yml/badge.svg)](https://github.com/bobmatnyc/trusty-search/actions/workflows/ci.yml)
[![crates.io](https://img.shields.io/crates/v/trusty-search.svg)](https://crates.io/crates/trusty-search)
[![License: ELv2](https://img.shields.io/badge/License-Elastic%20License%202.0-blue.svg)](./LICENSE)

Machine-wide, blazingly fast hybrid code search service. One install per machine,
one always-on daemon, unlimited named indexes.

## 📚 Documentation

Full documentation lives at the workspace top level in
[`docs/trusty-search/`](../../docs/trusty-search/): the
[research](../../docs/trusty-search/research/) and
[regression-testing](../../docs/trusty-search/regression-testing/) indexes,
engineering [sessions](../../docs/trusty-search/sessions/), and the
[example multi-index config](../../docs/trusty-search/examples/trusty-search.yaml).
This README and the rustdoc stay in-crate; everything else lives under `docs/`.

## System requirements

- **Rust 1.75+** (for source builds)
- **16 GB RAM minimum (default)** — hard-checked at daemon startup. The daemon exits with an actionable error message on under-spec hosts. Set `TRUSTY_SKIP_RAM_CHECK=1` in the daemon environment to bypass this check for small workloads where peak RAM is known to stay well under the memory limit. Bypass at your own risk on large corpora — the default exists because realistic indexing OOMs without it.
- **macOS 12+ or Linux** (Windows: not yet supported)
- **~2 GB disk** for model cache (downloaded on first run to `~/Library/Caches/trusty-search/` on macOS or `$XDG_DATA_HOME/trusty-search/` on Linux)

## Install

### From crates.io (recommended)

```bash
cargo install trusty-search
```

### From source

```bash
git clone https://github.com/bobmatnyc/trusty-search
cd trusty-search
cargo install --path . --locked
```

### Apple Silicon

CoreML GPU acceleration is enabled automatically on M1/M2/M3/M4. No flags or extra installs are needed. The startup log confirms the active provider:

```
embedder initialized: model=AllMiniLML6V2(Q) dim=384 provider=CoreML (Metal GPU / ANE)
```

### NVIDIA GPU (CUDA)

```bash
cargo install trusty-search --features cuda
```

Requires CUDA toolkit installed on the host. See [CLAUDE.md](./CLAUDE.md) for `ORT_DYLIB_PATH` setup on Amazon Linux 2023 and other glibc 2.34 hosts.

## Quick start

The following five steps take you from zero to a running search in under five minutes.

**Step 1 — Start the daemon**

```bash
trusty-search start
```

Expected output:
```
trusty-search daemon starting on http://127.0.0.1:<port>
embedder initialized: model=AllMiniLML6V2(Q) dim=384 provider=CoreML (Metal GPU / ANE)
daemon ready
```

The daemon auto-selects a free port and writes it to `~/Library/Application Support/trusty-search/port.lock` on macOS or `$XDG_DATA_HOME/trusty-search/port.lock` on Linux.

**Step 2 — Index a project**

```bash
trusty-search index ~/Projects/myproj --name myproj
```

Expected output:
```
Registered index "myproj" at /Users/me/Projects/myproj
⟳ Indexing myproj [████████░░] 1204/1520 files — 12s remaining
✓ Indexed 14 823 chunks in 142s
```

Re-running is safe — unchanged files are skipped via content fingerprints. Use `--force` to rebuild from scratch.

**Step 3 — Run a search**

```bash
trusty-search query "fn authenticate" --index myproj
```

Expected output:
```
1. src/auth.rs:42 — authenticate (hybrid+kg, score=0.018)
   fn authenticate(ctx: &Context) -> Result<Token> {
2. src/middleware.rs:17 — verify_token (hybrid, score=0.011)
   ...
```

Add `--json` for machine-readable output.

**Step 4 — Open the admin UI**

```bash
trusty-search ui
```

Opens `http://127.0.0.1:<port>/ui` in your browser. The UI provides search, index management, and an OpenRouter-backed chat panel (requires `OPENROUTER_API_KEY`).

**Step 5 — Check status at any time**

```bash
trusty-search status   # daemon version, port, per-index chunk counts
trusty-search doctor   # 6-check diagnostic; add --fix to auto-repair
```

## Using with Claude Code

Add trusty-search as an MCP server in your Claude Code config (`~/.claude/claude_desktop_config.json` or via `claude mcp add`):

### stdio (recommended)

```json
{
  "mcpServers": {
    "trusty-search": {
      "command": "trusty-search",
      "args": ["serve"]
    }
  }
}
```

### HTTP/SSE

```bash
trusty-search serve --http 127.0.0.1:7879
```

Then add `http://127.0.0.1:7879/sse` as an SSE MCP endpoint in your Claude Code config.

Once connected, Claude Code can call `search`, `index_file`, `list_indexes`, and 15 other tools directly (18 total). The daemon must be running independently (`trusty-search start`) before Claude Code connects.

## Features

- **Machine-wide daemon** — single install (`cargo install trusty-search`),
  one process, unlimited registered indexes via `DashMap<IndexId, IndexHandle>`
- **Hybrid search** — BM25 (lexical, zero-dep port with camelCase / snake_case
  splitting) + HNSW vector (usearch 2.25, all-MiniLM-L6-v2 INT8) + Knowledge
  Graph 1–2 hop expansion, fused via Reciprocal Rank Fusion (k = 60, always-on)
- **Query intent routing** — sub-ms regex classifier routes every query to one
  of 5 intents and adjusts α / β weights and KG gating per query
- **Branch-aware search** — pass `branch_files` (or just `branch: "feature/foo"`) to
  `POST /indexes/:id/search`; chunks from your branch get a configurable score boost
  (default 1.5×) and every result carries `on_branch: bool`
- **KG symbol graph** — petgraph-backed `SymbolGraph` derived from tree-sitter
  parses, with `EdgeKind` (CALLS / IMPORTS / INHERITS / CONTAINS) score
  multipliers; KG expansion is intent-gated (Usage only)
- **Auto-tuned memory tiers** — 5 tiers (Tiny / Small / Medium / Large / XLarge)
  from < 8 GB up to 64+ GB; chunk caps, batch sizes, cache sizes, and BM25 /
  KG limits computed at daemon startup from detected RAM
- **macOS CoreML auto-detection** — on Apple Silicon the ONNX session
  registers the CoreML execution provider automatically (no `--features`
  flag needed since v0.3.13)
- **Multi-index repo support** — drop a `trusty-search.yaml` at the repo root
  to define per-directory named indexes; `trusty-search index` reads it
  automatically (see [`docs/trusty-search/examples/trusty-search.yaml`](../../docs/trusty-search/examples/trusty-search.yaml))
- **Incremental reindex** — sha2 content fingerprints skip unchanged files
  across daemon restarts; `--force` triggers a full rebuild
- **Zero cold-start queries** — HNSW kept hot (`Duration::MAX` cool-after),
  LRU embedding cache (256+ entries) skips re-embedding on repeat queries
- **Native multi-request** — `Arc<SearchAppState>`, reader-priority `RwLock`,
  axum HTTP/2 — many concurrent searches against the same index never block
- **MCP server** — stdio + HTTP/SSE transports, 18 tools (per `src/mcp/tools.rs`), drop-in for Claude Code
- **Embedded Svelte 5 admin UI** — Collections, Search, Chat, Admin panels
  compiled into the binary via `include_dir!`; open with `trusty-search ui`
- **Migration path** — `trusty-search convert` reads `mcp-vector-search`
  configs and re-registers each project as a named index
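
The fusion and branch-boost bullets above can be sketched in a few lines. This is a minimal illustration, not the crate's implementation: the function names, the use of plain string chunk ids, and the choice to apply the boost after fusion are all assumptions. Only `k = 60` and the default 1.5× boost come from the feature list.

```rust
use std::collections::HashMap;

/// RRF constant quoted in the feature list (k = 60).
const RRF_K: f64 = 60.0;

/// Fuse ranked lists (best-first, 1-based ranks) with Reciprocal Rank Fusion.
/// `rrf_fuse` is a hypothetical name used for illustration only.
fn rrf_fuse(lanes: &[Vec<&str>]) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for lane in lanes {
        for (i, id) in lane.iter().enumerate() {
            let rank = (i + 1) as f64;
            // Each lane contributes 1 / (k + rank) for every chunk it returns.
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (RRF_K + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

/// Multiply the score of on-branch chunks and re-sort.
/// Assumption: the boost is applied to the fused score.
fn apply_branch_boost(fused: &mut [(String, f64)], on_branch: &[&str], boost: f64) {
    for (id, score) in fused.iter_mut() {
        if on_branch.contains(&id.as_str()) {
            *score *= boost;
        }
    }
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
}
```

A chunk that appears in several lanes accumulates one term per lane, which is why agreement between BM25 and the vector lane tends to outrank a single strong hit.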

> **Code quality analysis:** Complexity hotspots, smell detection, and quality grades
> have moved to [trusty-analyze](../trusty-analyze).
> The `complexity_hotspots`, `smells`, and `quality` HTTP endpoints are not served
> from this binary as of v0.2.0.

## Stage 1 IS a daemonized ripgrep

A `lexical_only` index skips embedding entirely. You get BM25 ranking plus
grep-speed pattern matching via a persistent HTTP daemon — no ONNX, no GPU,
no model download.

**Certified performance on a 1,155-file Rust workspace (trusty-tools, May 2026):**

| Metric | Value |
|--------|-------|
| Reindex time | 5.3 s (5,289 ms) |
| Throughput | 4,445 chunks/sec |
| Peak daemon RSS | 698 MB |
| `/grep` P50 latency | 8 ms (vs ripgrep 9 ms — parity) |

Full measurement details: [`docs/trusty-search/regression-testing/v0.14.0-stage1-cert-2026-05-27.md`](../../docs/trusty-search/regression-testing/v0.14.0-stage1-cert-2026-05-27.md)

**When to use lexical-only**: when you want a daemonized BM25 + ripgrep with
HTTP/MCP integration but do not need semantic similarity queries. Reindex is
63× faster than a full hybrid reindex (no embedding), and the daemon fits
comfortably in 700 MB.
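
A lexical-only index depends entirely on how BM25 terms are produced. The feature list mentions camelCase / snake_case splitting; the sketch below shows one simple way such a splitter could work. It is an assumption-laden illustration (the function name and exact boundary rules are hypothetical), not the crate's tokenizer.

```rust
/// Split text into lowercase terms, breaking on non-alphanumeric characters
/// (which covers snake_case) and on lower-to-upper transitions (camelCase).
/// Runs of capitals such as "HTTP" are deliberately left unsplit here.
fn split_identifier_terms(text: &str) -> Vec<String> {
    let mut terms = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in text.chars() {
        if !ch.is_alphanumeric() {
            // Separators ('_', '.', whitespace, ...) end the current term.
            if !current.is_empty() {
                terms.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // A lowercase letter followed by an uppercase one is a camelCase boundary.
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            terms.push(std::mem::take(&mut current));
        }
        current.extend(ch.to_lowercase());
        prev_lower = ch.is_lowercase();
    }
    if !current.is_empty() {
        terms.push(current);
    }
    terms
}
```

With this scheme a query for `verify token` can match both `verifyToken` and `verify_token`, since all three produce the same two terms.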

**How to enable** — pass `lexical_only: true` in the index create payload:

```bash
curl -s -X POST http://127.0.0.1:7878/indexes \
    -H 'Content-Type: application/json' \
    -d '{"id":"myproject","root_path":"/path/to/project","lexical_only":true}'
```

Or use the `--lexical-only` flag with the CLI:

```bash
trusty-search index /path/to/project --name myproject --lexical-only
```

### Skip-KG mode (`--no-kg`) — issue #313

A `skip_kg` index runs Stages 1 and 2 (BM25 + vector embed) normally but
permanently skips the Stage 3 Knowledge Graph rebuild (tree-sitter symbol
extraction + petgraph construction). Useful for large documentation-heavy or
generated-code sub-indexes in polyrepos where call-chain navigation is never
needed.

**Savings per index:** ~50–100 MB heap (symbol graph not allocated), ~400 ms
per reindex (tree-sitter extraction pass skipped).

**503 contract:** `GET /indexes/:id/call_chain` returns a structured 503 error
when `skip_kg=true`:
```json
{ "error": "kg_unavailable", "reason": "skipped_by_config", "index": "myproject" }
```
Callers must handle 503 and not assume 404 (index absent).
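
A client-side sketch of that contract might look like the following. The enum and function names are hypothetical, and the substring check stands in for real JSON parsing to keep the example dependency-free; a production client would decode the body properly.

```rust
/// How a caller might interpret a `call_chain` response (hypothetical type).
#[derive(Debug, PartialEq)]
enum CallChainOutcome {
    Ok,
    /// The index exists but was built without a knowledge graph.
    KgUnavailable,
    /// The index is not registered.
    IndexNotFound,
    Other(u16),
}

/// Classify a response by status code and body.
fn classify_call_chain_response(status: u16, body: &str) -> CallChainOutcome {
    match status {
        200 => CallChainOutcome::Ok,
        // Only a 503 carrying the documented error code counts as "KG skipped".
        503 if body.contains("\"kg_unavailable\"") => CallChainOutcome::KgUnavailable,
        404 => CallChainOutcome::IndexNotFound,
        other => CallChainOutcome::Other(other),
    }
}
```

Keeping `KgUnavailable` distinct from `IndexNotFound` lets a caller fall back to plain search instead of trying to re-register an index that already exists.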

**Ways to enable (three per-index options, plus a machine-wide default):**

CLI (`--no-kg` — orthogonal to `--lexical-only`):
```bash
trusty-search index /path/to/project --name myproject --no-kg
```

YAML (`trusty-search.yaml`):
```yaml
version: 1
indexes:
  - name: docs
    paths: [docs/]
    skip_kg: true
```

HTTP API:
```bash
curl -s -X POST http://127.0.0.1:7878/indexes \
    -H 'Content-Type: application/json' \
    -d '{"id":"myproject","root_path":"/path/to/project","skip_kg":true}'
```

Machine-wide default (`TRUSTY_NO_KG=1` env var applies to every new index):
```bash
export TRUSTY_NO_KG=1
trusty-search index /path/to/project --name myproject
```

`skip_kg` and `lexical_only` are orthogonal (D1) — setting both suppresses
both the embedder (Stage 2) and the KG rebuild (Stage 3), leaving only BM25.
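
The resulting flag matrix can be written down as a small function. This is a sketch under the assumption that each flag disables exactly one stage and nothing else, which is what "orthogonal" suggests; the type and function names are illustrative.

```rust
/// Indexing stages named in this README.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Stage {
    Bm25,
    VectorEmbed,
    KnowledgeGraph,
}

/// Stages that run for a given flag combination (hypothetical helper).
fn active_stages(lexical_only: bool, skip_kg: bool) -> Vec<Stage> {
    // BM25 always runs.
    let mut stages = vec![Stage::Bm25];
    if !lexical_only {
        stages.push(Stage::VectorEmbed);
    }
    if !skip_kg {
        stages.push(Stage::KnowledgeGraph);
    }
    stages
}
```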

## Memory tiers (auto-tuned at startup)

`MEMORY_LIMIT_MB` is computed dynamically as **25% of detected system RAM, clamped to 1–64 GB**. It is not a fixed tier value. The env var `TRUSTY_MEMORY_LIMIT_MB` overrides it. All other limits below are tier-based.

| Tier   | Total RAM  | `MEMORY_LIMIT_MB`     | `MAX_CHUNKS` | `EMBEDDING_CACHE` | `MAX_BATCH_SIZE` | `BM25_CORPUS_CAP` | `MAX_KG_NODES` |
|--------|------------|-----------------------|--------------|-------------------|------------------|-------------------|----------------|
| Tiny   | < 8 GB     | 25% of RAM (≥ 1 GB)   | 50 000       | 500               | 64               | 20 000            | 30 000         |
| Small  | 8–15 GB    | 25% of RAM            | 100 000      | 1 000             | 128              | 50 000            | 75 000         |
| Medium | 16–31 GB   | 25% of RAM            | 200 000      | 5 000             | 256              | 100 000           | 150 000        |
| Large  | 32–63 GB   | 25% of RAM            | 400 000      | 10 000            | 512              | 200 000           | 300 000        |
| XLarge | ≥ 64 GB    | 25% of RAM (≤ 64 GB)  | 800 000      | 20 000            | 512              | 400 000           | 500 000        |

Env vars (`TRUSTY_MAX_CHUNKS`, `TRUSTY_EMBEDDING_CACHE`, `TRUSTY_MAX_BATCH_SIZE`,
`TRUSTY_BM25_CORPUS_CAP`, `TRUSTY_MAX_KG_NODES`, `TRUSTY_MEMORY_LIMIT_MB`,
`TRUSTY_COREML_BATCH_SIZE`, `TRUSTY_COREML_TRIPWIRE_MB`)
always override the tier default. Precedence: shell env > `daemon.env` >
tier default. The resolved tier and all limits are logged at daemon startup.
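
The sizing rules above reduce to a little arithmetic. The following sketch mirrors the table and the stated precedence; the function names are hypothetical, and it assumes 1 GB = 1024 MB and that RAM is rounded down to whole gigabytes before tier selection.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Tier {
    Tiny,
    Small,
    Medium,
    Large,
    XLarge,
}

/// Pick a tier from total RAM in whole GB, following the table boundaries.
fn tier_for_ram_gb(total_gb: u64) -> Tier {
    match total_gb {
        0..=7 => Tier::Tiny,
        8..=15 => Tier::Small,
        16..=31 => Tier::Medium,
        32..=63 => Tier::Large,
        _ => Tier::XLarge,
    }
}

/// 25% of detected RAM, clamped to the 1 GB to 64 GB range (values in MB).
fn default_memory_limit_mb(total_ram_mb: u64) -> u64 {
    (total_ram_mb / 4).clamp(1024, 64 * 1024)
}

/// Precedence: shell env, then `daemon.env`, then the computed default.
fn resolve_limit(shell_env: Option<u64>, daemon_env: Option<u64>, default: u64) -> u64 {
    shell_env.or(daemon_env).unwrap_or(default)
}
```

For example, a 16 GB host lands in the Medium tier with a 4096 MB default limit, and exporting `TRUSTY_MEMORY_LIMIT_MB=2048` in the shell would take priority over both `daemon.env` and that default.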

### Apple Silicon CoreML batch sizing

On Apple Silicon (M1–M4), batch sizes for the ONNX Runtime CoreML execution
provider are tuned separately from the CPU and GPU tiers:

- **`DEFAULT_COREML_BATCH_SIZE = 32`** — optimal for Apple Neural Engine (ANE).
  Benchmark results on a 19k-chunk corpus show that larger batches (64, 128)
  consume 7–10% more time and 1.2–9.7 GB additional peak RSS with zero
  throughput gain. The ANE has a fixed dispatch budget; batch size scales
  unified-memory allocation but not per-call throughput.
- **`TRUSTY_COREML_TRIPWIRE_MB = 4096`** — safety net for RSS spikes. If a single
  CoreML embedding batch increases RSS by >4 GB, the batch size is automatically
  halved (floor: 1) and a warning is logged. Fires once per reindex.
  Override with `TRUSTY_COREML_TRIPWIRE_MB` env var if your host has different
  memory pressure characteristics.
- Non-fatal RSS probes: failure to read `/proc/self/status` returns 0, disabling
  the tripwire gracefully rather than crashing.
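
The tripwire behaviour described above can be modelled as a small state machine. This is a sketch only: the struct and method names are hypothetical, and it assumes RSS is sampled in MB before and after each batch, with a reading of 0 meaning the probe failed.

```rust
/// Tracks the CoreML batch size and the once-per-reindex tripwire.
struct CoreMlBatcher {
    batch_size: usize,
    tripwire_mb: u64,
    tripped: bool,
}

impl CoreMlBatcher {
    /// Defaults quoted above: batch size 32, tripwire 4096 MB.
    fn new() -> Self {
        Self { batch_size: 32, tripwire_mb: 4096, tripped: false }
    }

    /// Record RSS around one batch; returns true if the tripwire fired.
    fn observe(&mut self, rss_before_mb: u64, rss_after_mb: u64) -> bool {
        // A failed probe (0) or an earlier trip disables the check.
        if self.tripped || rss_before_mb == 0 || rss_after_mb == 0 {
            return false;
        }
        let delta = rss_after_mb.saturating_sub(rss_before_mb);
        if delta > self.tripwire_mb {
            // Halve the batch size, never going below 1.
            self.batch_size = (self.batch_size / 2).max(1);
            self.tripped = true;
            return true;
        }
        false
    }
}
```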

## Query intent → routing weights

| Intent     | α (vector) | β (BM25) | KG-first |
|------------|------------|----------|----------|
| Definition | 0.3        | 0.7      | false    |
| Usage      | 0.5        | 0.5      | **true** |
| Conceptual | 0.8        | 0.2      | false    |
| BugDebt    | 0.1        | 0.9      | false    |
| Unknown    | 0.6        | 0.4      | false    |

The classifier is a sub-ms regex over the query text. KG expansion is gated
to `Usage` intent only — caller/callee chains are scored at 70% of the
trigger chunk's RRF score.
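
Expressed as code, the routing table and the KG scoring rule look roughly like this. The types and function names are illustrative assumptions; the weights and the 70% factor are taken from the table and paragraph above, and the regex classifier itself is not reproduced because its patterns are not documented here.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Intent {
    Definition,
    Usage,
    Conceptual,
    BugDebt,
    Unknown,
}

/// Per-intent weights: alpha for the vector lane, beta for BM25.
struct Routing {
    alpha: f64,
    beta: f64,
    kg_first: bool,
}

/// Look up the routing weights for a classified intent.
fn routing_for(intent: Intent) -> Routing {
    let (alpha, beta, kg_first) = match intent {
        Intent::Definition => (0.3, 0.7, false),
        Intent::Usage => (0.5, 0.5, true),
        Intent::Conceptual => (0.8, 0.2, false),
        Intent::BugDebt => (0.1, 0.9, false),
        Intent::Unknown => (0.6, 0.4, false),
    };
    Routing { alpha, beta, kg_first }
}

/// Caller/callee chunks score at 70% of the trigger chunk's RRF score.
fn kg_neighbor_score(trigger_rrf_score: f64) -> f64 {
    trigger_rrf_score * 0.7
}
```

Because expanded neighbours always score below the chunk that triggered them, KG expansion adds context without displacing the direct hit.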

## CLI

```bash
trusty-search start                                  # start HTTP daemon (background)
trusty-search start --data-dir <PATH>                # start with custom data dir (TRUSTY_DATA_DIR)
                                                     # enables isolated daemon instances; each instance
                                                     # gets its own data dir, port, and index registry
trusty-search start --no-auto-discover               # skip startup auto-discovery scan
                                                     # (also: TRUSTY_NO_AUTO_DISCOVER=1)
                                                     # daemon serves only already-registered indexes
trusty-search stop                                   # stop daemon (SIGTERM via PID lockfile)
trusty-search index [path] [--name <id>] [--force]   # register + index (primary command)
                                                     # auto-detects ./trusty-search.yaml
trusty-search query <text> [--index <id>] [--top-k N] [--json]
trusty-search status                                 # daemon + index overview (alias: health)
trusty-search doctor [--fix]                         # 6-check diagnostic + auto-repair
trusty-search ui [--port N]                          # open web management UI in browser
trusty-search convert project|all [--dry-run]        # migrate from mcp-vector-search
trusty-search serve [--http <addr>]                  # MCP stdio (default) or HTTP/SSE
# Aliases preserved for backward compatibility:
trusty-search init [path]                            # alias for index
trusty-search reindex [path]                         # alias for index --force
```

## MCP tools

The MCP server registers **18 tools** (authoritative source: `src/mcp/tools.rs`
`tool_definitions`):

| Tool            | Description                                          |
|-----------------|------------------------------------------------------|
| `search`        | Hybrid search (BM25 + HNSW + KG, RRF-fused)          |
| `search_kg`     | KG-first graph-walk search; accepts optional `refine_query` (see below) |
| `search_semantic` | Vector-only semantic search lane                   |
| `search_lexical`| BM25/token lexical search lane                       |
| `search_all`    | Fan-out search across every registered index         |
| `search_similar`| Code-to-code similarity from a seed file/function    |
| `index_file`    | Add or replace a single file in the index            |
| `remove_file`   | Remove a file and all its chunks                     |
| `list_indexes`  | Enumerate all registered indexes                     |
| `create_index`  | Register a new (empty) index                         |
| `delete_index`  | Drop an index from the registry                      |
| `reindex`       | Fire-and-forget full reindex (SSE progress)          |
| `index_status`  | Per-index stats including walk diagnostics (see below) |
| `list_chunks`   | Paginated enumeration of chunks `(file, start_line)` |
| `get_call_chain`| KG caller/callee chain for a symbol                  |
| `grep`          | Literal/regex grep fallback over the corpus          |
| `search_health` | Daemon liveness probe                                |
| `chat`          | OpenRouter Q&A with auto-injected search context     |

### `search_kg` — `refine_query` parameter (issue #147)

`search_kg` performs a graph-walk expanding the KG neighbourhood of each top
hit. When the seed chunk is a weak or wrong match, the unfiltered neighbourhood
can compound the error with unrelated results.

Pass an optional `refine_query` string to describe the target concept in
natural language. The daemon embeds both the `refine_query` and every
KG-expanded neighbour, then discards neighbours whose cosine similarity against
`refine_query` is below **0.4**. Surviving neighbours are re-ranked by cosine
score so the strongest semantic match appears first. Seeds from the primary
fused list are never filtered.

```json
{
  "tool": "search_kg",
  "index_id": "myproj",
  "query": "authenticate",
  "refine_query": "JWT token validation and expiry checking"
}
```

When `refine_query` is absent the behaviour is identical to the previous version
(fully backward-compatible).
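
The filter-and-rerank step can be sketched as follows. This is an illustration under stated assumptions: embeddings are passed in as plain `f32` slices, the function names are hypothetical, and only the 0.4 threshold and the "re-rank by cosine" rule come from the description above. Seeds are not passed through this function, matching the note that they are never filtered.

```rust
/// Similarity threshold quoted above.
const REFINE_THRESHOLD: f32 = 0.4;

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Keep KG neighbours whose similarity to the refine query is at least the
/// threshold, ordered from most to least similar.
fn refine_neighbors<'a>(
    refine_vec: &[f32],
    neighbors: &[(&'a str, Vec<f32>)],
) -> Vec<(&'a str, f32)> {
    let mut kept: Vec<(&'a str, f32)> = neighbors
        .iter()
        .map(|(id, v)| (*id, cosine(refine_vec, v)))
        .filter(|(_, score)| *score >= REFINE_THRESHOLD)
        .collect();
    kept.sort_by(|a, b| b.1.total_cmp(&a.1));
    kept
}
```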

### `index_status` — walk diagnostic fields (issue #280)

`GET /indexes/:id/status` (and the `index_status` MCP tool) now include four
fields that let operators diagnose why a reindex produced zero chunks:

| Field | Type | Description |
|-------|------|-------------|
| `last_walk_started_at` | `string \| null` | RFC 3339 timestamp of the most recent walk start |
| `last_walk_files_seen` | `number` | Files discovered by the walk (after gitignore/extension filtering) |
| `last_walk_files_skipped` | `number` | Directories skipped (gitignore, build artefacts, etc.) |
| `last_walk_error` | `string \| null` | Set when the walk found zero indexable files; describes probable cause |

These fields are populated every time a reindex task runs. On a healthy index
with chunks you will see `last_walk_error: null` and `last_walk_files_seen > 0`.

## Stack

| Component       | Choice                                              |
|-----------------|-----------------------------------------------------|
| Language        | Rust 2021                                           |
| Async runtime   | tokio (full features)                               |
| HTTP            | axum 0.7 + tower-http (CORS, trace, gzip), HTTP/2   |
| Vector store    | usearch 2.25 (HNSW, in-memory, `Arc<RwLock<>>`)     |
| Embeddings      | fastembed 5.x (ONNX, all-MiniLM-L6-v2 INT8, 384-dim)|
| Lexical         | Custom BM25 (zero-dep port, camelCase splitting)    |
| KV store        | redb 2.6                                            |
| Knowledge graph | petgraph 0.6 (`SymbolGraph`)                        |
| File watching   | notify 6 + notify-debouncer-mini 0.4 (500 ms)       |
| Code parsing    | tree-sitter 0.26 (14 grammars)                      |
| Concurrency     | dashmap 5, lru 0.12, rayon 1                        |
| HTTP client     | reqwest 0.12 (rustls-tls)                           |
| CLI             | clap 4 (derive)                                     |
| UI              | Svelte 5, embedded via `include_dir!`               |
| Hashing         | sha2 (incremental reindex fingerprints)             |

## Troubleshooting

**Daemon won't start**

Run `trusty-search doctor` for a 6-check diagnostic. Common causes:
- Another daemon already running: `trusty-search stop` then `trusty-search start`
- Stale PID lockfile: `trusty-search doctor --fix` removes it automatically
- Less than 16 GB RAM: the daemon performs a hard RAM check and exits with an actionable error. Set `TRUSTY_SKIP_RAM_CHECK=1` in the daemon environment to bypass for small workloads; not recommended on large corpora (risk of OOM during indexing)

**Embedder stuck on "initializing"**

The ONNX Runtime initializes the model on first start and may take 30–60 seconds on slower machines. If it hangs indefinitely, increase the timeout:

```bash
TRUSTY_EMBEDDER_INIT_TIMEOUT_SECS=120 trusty-search start
```

**High memory usage during reindex**

The daemon has a soft RSS ceiling (`TRUSTY_MEMORY_LIMIT_MB`). When hit, it skips remaining batches and logs a warning. Already-committed chunks stay searchable. To lower pressure:

```bash
TRUSTY_MEMORY_LIMIT_MB=2048 trusty-search start
```

Or wait for the soft cap to trip — the partial index is usable immediately.

**Reindex produced zero chunks**

If `index_status` shows `chunk_count: 0` after a reindex, check the walk
diagnostic fields:

```bash
# Via CLI (pipe through jq if available)
trusty-search status --index myproj

# Via HTTP
curl http://127.0.0.1:<port>/indexes/myproj/status | jq .
```

Look for `last_walk_error`. Common causes and fixes:

| `last_walk_error` message | Cause | Fix |
|--------------------------|-------|-----|
| `root path does not exist: /…` | Index was registered with a path that no longer exists | Re-register with the correct path: `trusty-search index /new/path --name myproj` |
| `walk produced zero files … check gitignore rules` | All discovered files were excluded by `.gitignore`, extension allow-list, or `path_filter` | Check `.gitignore` for overly broad rules; ensure at least one supported extension (`.rs`, `.py`, `.ts`, etc.) exists under the root path |

If `last_walk_error` is `null` but `chunk_count` is still 0, the walk found
files but the chunker produced no output — this usually means all files are
binary or exceed the size limit. Check `RUST_LOG=debug trusty-search start` for
per-file warnings.
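
The decision logic in this section can be summarised as a small triage function. It is a sketch for operators scripting against the status endpoint; the enum and function names are hypothetical, and it assumes `chunk_count` and `last_walk_error` have already been extracted from the status response.

```rust
/// Probable state of an index after a reindex (hypothetical type).
#[derive(Debug, PartialEq)]
enum ReindexDiagnosis {
    Healthy,
    /// The walk itself failed; the message is `last_walk_error`.
    WalkFailed(String),
    /// Files were found but the chunker produced no output.
    ChunkerProducedNothing,
}

/// Triage a zero-chunk index from two status fields.
fn diagnose(chunk_count: u64, last_walk_error: Option<&str>) -> ReindexDiagnosis {
    if chunk_count > 0 {
        return ReindexDiagnosis::Healthy;
    }
    match last_walk_error {
        Some(msg) => ReindexDiagnosis::WalkFailed(msg.to_string()),
        // No walk error but no chunks: check debug logs for per-file warnings.
        None => ReindexDiagnosis::ChunkerProducedNothing,
    }
}
```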

**Port conflict**

The daemon auto-selects a free port on each start. The live port is written to:
- macOS: `~/Library/Application Support/trusty-search/port.lock`
- Linux: `$XDG_DATA_HOME/trusty-search/port.lock`

If `trusty-search status` reports the wrong port, stop and restart the daemon.

**Device flag not persisting across restarts**

Use `trusty-search start --device cpu` to force CPU mode. The flag is persisted to `daemon.env` so it survives daemon restarts.

## Architecture and HTTP API

See [CLAUDE.md](./CLAUDE.md) for the full HTTP endpoint catalogue, query
pipeline, multi-request design, memory tuning reference, and release process.

## Documentation

- [CLAUDE.md](./CLAUDE.md) — full architecture + HTTP API reference
- [CHANGELOG.md](./CHANGELOG.md) — release history
- [docs/trusty-search/examples/trusty-search.yaml](../../docs/trusty-search/examples/trusty-search.yaml) — multi-index repo config
- [docs/trusty-search/research/](../../docs/trusty-search/research/) — design + comparison documents

## License

[Elastic License 2.0 (ELv2)](./LICENSE) — free for internal use; you may not
provide trusty-search as a hosted or managed service to third parties without
a commercial agreement. See [LICENSE](./LICENSE) for the full terms.