# Argyph
> The local-first MCP server for serious codebases.
> Zero config, zero cloud, full context.
[](https://github.com/Ezzy1630/argyph/actions/workflows/ci.yml)
[](https://crates.io/crates/argyph)
[](https://www.npmjs.com/package/argyph)
[](#license)
Argyph is a single MCP server that gives any AI coding agent fast, structured, and semantic context over a codebase. It runs entirely on your machine, indexes incrementally, and is ready in under a second on previously-indexed repos.
The name is a portmanteau of **Argus** (the hundred-eyed watcher of Greek myth) and **Glyph** (a carved symbol with bound meaning) — a server that watches a codebase and gives the agent its symbols.
---
## What it does
Argyph replaces the half-dozen MCP servers most developers stitch together (one for grep, one for embeddings, one for symbol search, one for repo packing) with a single tool. It exposes three pillars of context behind one MCP endpoint:
0. **Ask-first retrieval** — `ask` is the primary agent-facing lookup tool. It routes bare identifiers to symbol search, structured locators to `locate`, and natural-language questions to hybrid search, returning bounded `Span` results instead of whole files.
1. **File and symbol intelligence** — a tree-sitter-driven symbol graph with `find_definition`, `find_references`, callers, callees, imports, and outline tools. Structural queries return in milliseconds.
2. **Semantic search** — hybrid (BM25 + vector) search over AST-aware chunks, backed by an embedded LanceDB store. Bundled local embedding model means no API key is required for full functionality.
3. **Repo packing** — token-budgeted, repomix-style flattening of a repo or subset for agents that need to absorb a codebase quickly.
Everything is read-only. Argyph never edits, commits, or executes code.
---
## Why local-first
Most existing context servers depend on a cloud vector database (Milvus, Pinecone) and a remote embedding API. That's a non-starter for proprietary code at most companies, and a tax on cold starts everywhere else. Argyph runs entirely on the developer's machine: a single binary, an embedded vector store, an optional bundled embedding model, no daemon, no account, no key required to get full functionality.
---
## Install
> The first tagged release is `v1.0.0-rc.1`. If you don't see a published
> package on a given channel yet, use the **Build from source** path below —
> it works on any platform with a current Rust toolchain.
### Claude Code
```bash
claude mcp add argyph -- npx argyph@latest
```
### npm / npx
```bash
npx argyph
```
### Homebrew (macOS / Linux)
```bash
brew install Ezzy1630/argyph/argyph
```
### Universal installer
```bash
> **Intel Mac (`x86_64-apple-darwin`) users:** the ONNX Runtime backend
> doesn't ship a prebuilt for Intel macOS, so no prebuilt binary is
> produced for that target. Use the **Cargo** path below — it builds
> in a couple of minutes and works fine on Intel Macs.
### Cargo
`cargo install argyph` builds Argyph from source. The bundled local
embedder uses [`ort`](https://crates.io/crates/ort) (ONNX Runtime); on
most platforms `ort` ships a prebuilt dynamic library and there is
nothing to do. On Linux you may need `libssl-dev` and a working C
toolchain; on Windows the MSVC build tools are required. If the build
fails on `ort-sys` linkage, set `ORT_STRATEGY=download` before
re-running.
```bash
cargo install argyph --locked
```
### Build from source
```bash
git clone https://github.com/Ezzy1630/argyph.git
cd argyph
cargo build --release
./target/release/argyph serve
```
### Claude Desktop (DXT)
Download `argyph.dxt` from the [latest release](https://github.com/Ezzy1630/argyph/releases/latest) and double-click.
---
## Quick start
```bash
# In any repo
cd ~/code/your-repo
argyph init
claude mcp add argyph -- npx argyph@latest
claude
```
In the chat:
> What does this codebase do, and where is session expiration controlled?
Argyph indexes Tier 0 in under a second on first run, Tier 1 (symbol graph) in seconds, and Tier 2 (embeddings) in the background. You can query immediately — tools return what's available now plus an `index_coverage` field so the agent knows.
---
## How it works

Argyph builds the index in three tiers, each useful before the next completes:
| 0 | File inventory, hashes, .gitignore-aware tree | <1 s | Tree views, ripgrep, packing |
| 1 | Symbol graph (defs, refs, calls, imports) | ~30 s | Go-to-def, find-references, call graphs |
| 1.5 | Structural index (non-code: md, json, yaml…) | seconds (after Tier 1) | `locate` for structured non-code files |
| 2 | Embeddings + hybrid index | minutes (background) | Fuzzy semantic queries |
The 80/20 insight: most agent queries are structural (`where is parseConfig defined?`, `what calls validateUser?`) and don't need embeddings. Argyph serves those from Tier 1 in milliseconds, even while Tier 2 is still building.
After the first index, the on-disk `.argyph/` directory persists and only changed files are reprocessed on subsequent runs. A filesystem watcher keeps everything live.
---
## Tools
| `ask` | Primary lookup router returning bounded spans | 0/1/1.5/2 |
| `get_index_status` | Tier readiness, embedding progress, watcher state | 0 |
| `get_repo_overview` | Languages, entry points, README excerpt, tree | 0 |
| `search_text` | Ripgrep-style regex / literal search | 0 |
| `find_definition` | Locate the definition of a named symbol | 1 |
| `find_references` | Reference sites with surrounding context | 1 |
| `get_callers` | Functions that call a given function | 1 |
| `get_callees` | Functions a given function calls | 1 |
| `get_imports` | Imports of a file, and files that import it | 1 |
| `get_symbol_outline` | Hierarchical outline of a file | 1 |
| `search_semantic` | Hybrid BM25 + vector over AST-aware chunks | 2 |
| `pack_repo` | Token-budgeted repo flattening (XML or markdown) | 0+1 |
| `locate` | Smallest natural span containing the target | 1.5 |
| `locate_smart` | Retrieval subagent (opt-in; needs provider config) | 1.5 |
| `expand_span` | Expand one truncated span from a session handle | 0 |
| `memory_save` | Persist a memory entry under a scope | 0 |
| `memory_search` | FTS5 search over persistent memories | 0 |
| `memory_list` | List memories in a scope | 0 |
| `memory_forget` | Delete a memory entry by id | 0 |
Full schema reference: [docs/tools-reference.md](docs/tools-reference.md).
---
## Optional features
### `locate_smart` — retrieval subagent
`locate_smart` is an opt-in tool that runs a bounded multi-step retrieval loop using an LLM provider. It's **disabled by default**; enable it in `.argyph/config.toml`:
```toml
[locate_smart]
enabled = true
# endpoint = "http://localhost:11434" # only for local providers
```
Or via env:
```bash
ARGYPH_LOCATE_SMART_ENABLED=1
ARGYPH_LOCATE_SMART_PROVIDER=anthropic
ARGYPH_LOCATE_SMART_MODEL=claude-haiku-4-5
```
When disabled, calls to `locate_smart` return `LOCATE_SMART_DISABLED` immediately and no provider keys are required.
Build with the smart feature:
```bash
cargo install argyph --features smart
```
---
## Configuration
Config is layered (highest priority first): env vars, `.argyph/config.toml` in the repo, built-in defaults. A config file is never required.
```bash
ARGYPH_LOG=info
ARGYPH_DISABLE_WATCHER=true # for sandboxed environments
```
Install agent lookup instructions in `CLAUDE.md`, `AGENTS.md`, or `GEMINI.md`:
```bash
argyph init
```
---
## Why Argyph (vs alternatives)
| claude-context (Zilliz) | | yes | | yes | yes |
| GitNexus / CodeGraphContext| yes | | | | |
| repomix | | | yes | yes | |
| Serena | yes | | yes | yes | |
| **Argyph** | **yes** | **yes** | **yes** | **yes** | **yes** |
---
## Benchmarks
Reproducible numbers, methodology in [`docs/benchmarks.md`](docs/benchmarks.md). Reproduce locally with `cargo bench --workspace`. Numbers below are the median of three runs on the reference hardware tagged `m3-pro` (Apple M3 Pro, 36 GB RAM, macOS 15) unless stated otherwise.
### Hot paths (criterion, mean times)
| `locate_parse_path_bare` | ~15 ns | Parsing a bare path locator |
| `locate_parse_path_heading` | ~18 ns | Parsing `path > Heading` locators |
| `locate_strategy_path_only` | ~19 ns | Strategy dispatch for path-only locator |
| `locate_strategy_scoped` | ~32 ns | Strategy dispatch for scoped query locator |
| `token_count_rust_file` | ~4.4 µs | `cl100k_base` tokenization of one source file |
| `walk_project_root` | ~244 ms | Full `walkdir` traversal of this repo |
### System-level
End-to-end timings on an Apple M-series laptop, macOS 15. Reproduce via:
```bash
cargo run --release -p argyph-benches --bin system_bench -- /path/to/repo
```
| `BurntSushi/ripgrep` | 215 | ~52K | **71 ms**| **6.8 s** |
| `microsoft/TypeScript` (`src/`) | 709 | ~452K | **30 ms**| **8.2 s** |
| `microsoft/TypeScript` (whole repo) | 81,310 | ~2M | **2.1 s** | see note ↓ |
**Tier 0 — the "useful immediately" gate — scales linearly and fast**
(81K files in 2.1 s). `search_text` and `pack_repo` are available the
moment Tier 0 is ready.
**Tier 1 (the symbol graph) is fast on normal-to-large repos.** Two
optimizations landed for v1.0: the within-file reference resolver was
rewritten from an O(symbols²) substring scan to one-pass hash-indexed
lookups (4.2× — TypeScript compiler source: 34.6 s → 8.2 s), and the
per-file symbol/chunk SQL writes are now batched. On *very* large
monorepos (the full 81K-file TypeScript repo, ~2M LOC) Tier 1 still
does not finish within 30 min — the residual cost is raw tree-sitter
parse volume across 79K files, which needs streaming/parallel indexing
(tracked in [ROADMAP.md](ROADMAP.md)). The server never blocks at any
scale: Tier-1 tools return `INDEX_NOT_READY` with a retry hint until
the graph is ready. Full data: [`docs/benchmarks.md`](docs/benchmarks.md).
---
## Architecture
Argyph is a Rust workspace of twelve focused crates with strict module ownership. The full architecture, including the Supervisor lifecycle, the three-tier indexing model, and per-crate responsibility boundaries, is documented in [ARCHITECTURE.md](ARCHITECTURE.md).
---
## Project status
**Release candidate.** `v1.0.0-rc.1` is published; `v1.0.0` will cut once the system-level benchmark targets in [`docs/SPEC.md`](docs/SPEC.md) § 6 are verified on the reference hardware. The build plan and milestones are in [`docs/BUILD_PLAN.md`](docs/BUILD_PLAN.md); the current milestone is tracked at the top of [ROADMAP.md](ROADMAP.md).
---
## Contributing
Argyph is built with substantial AI assistance, but human-architected and human-reviewed. Contribution guide and the (strict) AI agent rules are in [CONTRIBUTING.md](CONTRIBUTING.md). Commit conventions, including the project's attribution policy, are in [`docs/COMMIT_CONVENTIONS.md`](docs/COMMIT_CONVENTIONS.md).
---
## Author
Built by [Ezzy1630](https://github.com/Ezzy1630). See [AUTHORS.md](AUTHORS.md).
---
## License
Dual-licensed under either of:
- MIT License ([LICENSE-MIT](LICENSE-MIT))
- Apache License 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
at your option.