Argyph
The local-first MCP server for serious codebases. Zero config, zero cloud, full context.
Argyph is a single MCP server that gives any AI coding agent fast, structured, and semantic context over a codebase. It runs entirely on your machine, indexes incrementally, and is ready in under a second on previously-indexed repos.
The name is a portmanteau of Argus (the hundred-eyed watcher of Greek myth) and Glyph (a carved symbol with bound meaning) — a server that watches a codebase and gives the agent its symbols.
What it does
Argyph replaces the half-dozen MCP servers most developers stitch together (one for grep, one for embeddings, one for symbol search, one for repo packing) with a single tool. It exposes four pillars of context behind one MCP endpoint:
- Ask-first retrieval — `ask` is the primary agent-facing lookup tool. It routes bare identifiers to symbol search, structured locators to `locate`, and natural-language questions to hybrid search, returning bounded `Span` results instead of whole files.
- File and symbol intelligence — a tree-sitter-driven symbol graph with `find_definition`, `find_references`, callers, callees, imports, and outline tools. Structural queries return in milliseconds.
- Semantic search — hybrid (BM25 + vector) search over AST-aware chunks, backed by an embedded LanceDB store. A bundled local embedding model means no API key is required for full functionality.
- Repo packing — token-budgeted, repomix-style flattening of a repo or subset, for agents that need to absorb a codebase quickly.
Everything is read-only. Argyph never edits, commits, or executes code.
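The ask-first routing above can be sketched as a small classifier. This is an illustrative sketch of the idea only — the `Route` enum and the heuristics are invented for this example, not Argyph's actual API:

```rust
// Illustrative sketch of ask-style routing; types and heuristics are
// invented for this example, not Argyph's real implementation.
#[derive(Debug, PartialEq)]
enum Route {
    Symbol, // bare identifier -> symbol search
    Locate, // structured locator -> locate
    Hybrid, // natural language -> hybrid BM25 + vector search
}

fn route(query: &str) -> Route {
    let bare_identifier = !query.is_empty()
        && query.chars().all(|c| c.is_alphanumeric() || c == '_');
    // Locators look like paths, optionally scoped: "src/auth.rs > Session".
    let locator = query.contains('/') || query.contains(" > ");
    if bare_identifier {
        Route::Symbol
    } else if locator {
        Route::Locate
    } else {
        Route::Hybrid
    }
}

fn main() {
    assert_eq!(route("parseConfig"), Route::Symbol);
    assert_eq!(route("src/auth.rs > Session"), Route::Locate);
    assert_eq!(route("where is session expiration controlled?"), Route::Hybrid);
    println!("ok");
}
```

Whatever the real heuristics are, the payoff is the same: identifier-shaped queries never pay the cost of a semantic search.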
Why local-first
Most existing context servers depend on a cloud vector database (Milvus, Pinecone) and a remote embedding API. That's a non-starter for proprietary code at most companies, and a tax on cold starts everywhere else. Argyph runs entirely on the developer's machine: a single binary, an embedded vector store, an optional bundled embedding model, no daemon, no account, no key required to get full functionality.
Install
The first tagged release is `v1.0.0-rc.1`. If you don't see a published package on a given channel yet, use the Build from source path below — it works on any platform with a current Rust toolchain.
Claude Code
npm / npx
Homebrew (macOS / Linux)
Universal installer
> Intel Mac (`x86_64-apple-darwin`) users: the ONNX Runtime backend doesn't ship a prebuilt for Intel macOS, so no prebuilt binary is produced for that target. Use the Cargo path below — it builds in a couple of minutes and works fine on Intel Macs.
Cargo
`cargo install argyph` builds Argyph from source. The bundled local embedder uses `ort` (ONNX Runtime); on most platforms `ort` ships a prebuilt dynamic library and there is nothing more to do. On Linux you may need `libssl-dev` and a working C toolchain; on Windows the MSVC build tools are required. If the build fails on `ort-sys` linkage, set `ORT_STRATEGY=download` before re-running.
Build from source
Claude Desktop (DXT)
Download argyph.dxt from the latest release and double-click.
Quick start
With Argyph registered as an MCP server, start your agent in any repo and ask in the chat:

> What does this codebase do, and where is session expiration controlled?
Argyph indexes Tier 0 in under a second on first run, Tier 1 (the symbol graph) in seconds, and Tier 2 (embeddings) in the background. You can query immediately — tools return what's available now, plus an `index_coverage` field so the agent knows what is still being built.
How it works
Argyph builds the index in three tiers, each useful before the next completes:
| Tier | What it builds | Time on a 1M-LOC repo | Useful for |
|---|---|---|---|
| 0 | File inventory, hashes, .gitignore-aware tree | <1 s | Tree views, ripgrep, packing |
| 1 | Symbol graph (defs, refs, calls, imports) | ~30 s | Go-to-def, find-references, call graphs |
| 1.5 | Structural index (non-code: md, json, yaml…) | seconds (after Tier 1) | locate for structured non-code files |
| 2 | Embeddings + hybrid index | minutes (background) | Fuzzy semantic queries |
The 80/20 insight: most agent queries are structural (where is parseConfig defined?, what calls validateUser?) and don't need embeddings. Argyph serves those from Tier 1 in milliseconds, even while Tier 2 is still building.
After the first index, the on-disk .argyph/ directory persists and only changed files are reprocessed on subsequent runs. A filesystem watcher keeps everything live.
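The hash-based incremental step can be illustrated in miniature. This is an assumed design based on the description above, not Argyph's actual code — a file is reprocessed only when its content hash differs from the one stored in the previous index:

```rust
// Illustrative sketch of hash-based incremental reindexing (an assumed
// design, not Argyph's actual code).
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

fn content_hash(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Paths whose current hash is missing from, or differs from, the stored index.
fn files_to_reindex<'a>(
    current: &'a HashMap<String, u64>,
    stored: &HashMap<String, u64>,
) -> Vec<&'a str> {
    current
        .iter()
        .filter(|(path, hash)| stored.get(*path) != Some(*hash))
        .map(|(path, _)| path.as_str())
        .collect()
}

fn main() {
    let stored = HashMap::from([("a.rs".to_string(), content_hash(b"fn a() {}"))]);
    let current = HashMap::from([
        ("a.rs".to_string(), content_hash(b"fn a() {}")), // unchanged: skipped
        ("b.rs".to_string(), content_hash(b"fn b() {}")), // new: reindexed
    ]);
    assert_eq!(files_to_reindex(&current, &stored), vec!["b.rs"]);
    println!("ok");
}
```

The filesystem watcher then only has to feed changed paths through the same comparison instead of rescanning the tree.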
Tools
| Tool | Description | Tier required |
|---|---|---|
| `ask` | Primary lookup router returning bounded spans | 0/1/1.5/2 |
| `get_index_status` | Tier readiness, embedding progress, watcher state | 0 |
| `get_repo_overview` | Languages, entry points, README excerpt, tree | 0 |
| `search_text` | Ripgrep-style regex / literal search | 0 |
| `find_definition` | Locate the definition of a named symbol | 1 |
| `find_references` | Reference sites with surrounding context | 1 |
| `get_callers` | Functions that call a given function | 1 |
| `get_callees` | Functions a given function calls | 1 |
| `get_imports` | Imports of a file, and files that import it | 1 |
| `get_symbol_outline` | Hierarchical outline of a file | 1 |
| `search_semantic` | Hybrid BM25 + vector over AST-aware chunks | 2 |
| `pack_repo` | Token-budgeted repo flattening (XML or markdown) | 0+1 |
| `locate` | Smallest natural span containing the target | 1.5 |
| `locate_smart` | Retrieval subagent (opt-in; needs provider config) | 1.5 |
| `expand_span` | Expand one truncated span from a session handle | 0 |
| `memory_save` | Persist a memory entry under a scope | 0 |
| `memory_search` | FTS5 search over persistent memories | 0 |
| `memory_list` | List memories in a scope | 0 |
| `memory_forget` | Delete a memory entry by id | 0 |
Full schema reference: docs/tools-reference.md.
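As a rough illustration of what "token-budgeted" packing means, here is a hypothetical sketch — not `pack_repo`'s actual algorithm — that greedily includes files until an estimated budget runs out (the ~4-characters-per-token estimate is a common heuristic, not Argyph's tokenizer):

```rust
// Hypothetical sketch of token-budgeted packing: greedily include files
// until the estimated token budget runs out. Not Argyph's real algorithm.
fn pack(files: &[(&str, &str)], token_budget: usize) -> Vec<String> {
    let mut used = 0;
    let mut out = Vec::new();
    for (path, body) in files {
        // Crude estimate: roughly 4 characters per token.
        let cost = body.len() / 4 + 1;
        if used + cost > token_budget {
            break; // budget exhausted; stop packing
        }
        used += cost;
        out.push(format!("<file path=\"{path}\">\n{body}\n</file>"));
    }
    out
}

fn main() {
    let big = "x".repeat(400);
    let files = [("a.rs", "fn a() {}"), ("b.rs", big.as_str())];
    // a.rs fits (~3 tokens); b.rs (~101 tokens) would blow the 50-token budget.
    let packed = pack(&files, 50);
    assert_eq!(packed.len(), 1);
    println!("ok");
}
```

The real tool additionally ranks and selects which subset of the repo to include; the budget check is the part this sketch shows.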
Optional features
locate_smart — retrieval subagent
locate_smart is an opt-in tool that runs a bounded multi-step retrieval loop using an LLM provider. It's disabled by default; enable it in .argyph/config.toml:
```toml
[locate_smart]
enabled = true
provider = "openai"   # or "anthropic" | "ollama"
model = "gpt-4o-mini"
# endpoint = "http://localhost:11434"  # only for local providers
```
Or via environment variables:

```
ARGYPH_LOCATE_SMART_ENABLED=1
ARGYPH_LOCATE_SMART_PROVIDER=anthropic
ARGYPH_LOCATE_SMART_MODEL=claude-haiku-4-5
```
When disabled, calls to locate_smart return LOCATE_SMART_DISABLED immediately and no provider keys are required.
Build with the `smart` Cargo feature enabled.
Configuration
Config is layered (highest priority first): env vars, .argyph/config.toml in the repo, built-in defaults. A config file is never required.
```
ARGYPH_LOG=info
ARGYPH_EMBED_PROVIDER=local   # local | openai | voyage
OPENAI_API_KEY=...            # standard provider env vars
ARGYPH_DISABLE_WATCHER=true   # for sandboxed environments
```
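The layering rule above — env var beats the repo config file, which beats the built-in default — reduces to a one-line resolution chain. A minimal, illustrative sketch (not Argyph's actual config code):

```rust
// Minimal sketch of layered config resolution: env var, then the value
// from .argyph/config.toml, then the built-in default. Illustrative only.
fn resolve(env_value: Option<&str>, file_value: Option<&str>, default: &str) -> String {
    env_value.or(file_value).unwrap_or(default).to_string()
}

fn main() {
    // Nothing set: built-in default wins.
    assert_eq!(resolve(None, None, "info"), "info");
    // Config-file value overrides the default.
    assert_eq!(resolve(None, Some("debug"), "info"), "debug");
    // Environment variable overrides both.
    assert_eq!(resolve(Some("trace"), Some("debug"), "info"), "trace");
    println!("ok");
}
```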
Agent lookup instructions can be installed in CLAUDE.md, AGENTS.md, or GEMINI.md.
Why Argyph (vs alternatives)
| Tool | Symbol graph | Semantic search | Local-first | Single install | Incremental |
|---|---|---|---|---|---|
| claude-context (Zilliz) | | yes | | yes | yes |
| GitNexus / CodeGraphContext | yes | | | | |
| repomix | | | yes | yes | |
| Serena | yes | | yes | yes | |
| Argyph | yes | yes | yes | yes | yes |
Benchmarks
Methodology and full results are in docs/benchmarks.md; reproduce locally with `cargo bench --workspace`. Numbers below are the median of three runs on the reference hardware tagged `m3-pro` (Apple M3 Pro, 36 GB RAM, macOS 15) unless stated otherwise.
Hot paths (criterion, mean times)
| Bench | Mean | What it measures |
|---|---|---|
| `locate_parse_path_bare` | ~15 ns | Parsing a bare path locator |
| `locate_parse_path_heading` | ~18 ns | Parsing `path > Heading` locators |
| `locate_strategy_path_only` | ~19 ns | Strategy dispatch for a path-only locator |
| `locate_strategy_scoped` | ~32 ns | Strategy dispatch for a scoped query locator |
| `token_count_rust_file` | ~4.4 µs | cl100k_base tokenization of one source file |
| `walk_project_root` | ~244 ms | Full walkdir traversal of this repo |
System-level
End-to-end timings on an Apple M-series laptop, macOS 15; reproduction steps are in docs/benchmarks.md.
| Fixture | Files | LOC | Tier 0 cold | Tier 1 full |
|---|---|---|---|---|
| `BurntSushi/ripgrep` | 215 | ~52K | 71 ms | 6.8 s |
| `microsoft/TypeScript` (src/) | 709 | ~452K | 30 ms | 8.2 s |
| `microsoft/TypeScript` (whole repo) | 81,310 | ~2M | 2.1 s | see note ↓ |
Tier 0 — the "useful immediately" gate — scales linearly and stays fast (81K files in 2.1 s). `search_text` and `pack_repo` are available the moment Tier 0 is ready.
Tier 1 (the symbol graph) is fast on normal-to-large repos. Two
optimizations landed for v1.0: the within-file reference resolver was
rewritten from an O(symbols²) substring scan to one-pass hash-indexed
lookups (4.2× — TypeScript compiler source: 34.6 s → 8.2 s), and the
per-file symbol/chunk SQL writes are now batched. On very large
monorepos (the full 81K-file TypeScript repo, ~2M LOC) Tier 1 still
does not finish within 30 min — the residual cost is raw tree-sitter
parse volume across 79K files, which needs streaming/parallel indexing
(tracked in ROADMAP.md). The server never blocks at any scale: Tier-1 tools return `INDEX_NOT_READY` with a retry hint until the graph is ready. Full data: docs/benchmarks.md.
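The resolver rewrite described above can be shown in miniature. This is a sketch of the general technique — not the actual Argyph code — contrasting a per-candidate substring scan with a one-pass hash index: build the set of known symbol names once, then resolve each identifier with a constant-time lookup.

```rust
// Sketch of one-pass hash-indexed reference resolution. Illustrative only;
// the real resolver works on a tree-sitter symbol graph, not string slices.
use std::collections::HashSet;

fn resolve_refs<'a>(defined: &[&'a str], idents: &[&'a str]) -> Vec<&'a str> {
    // One pass to build the index of known symbol names...
    let index: HashSet<&str> = defined.iter().copied().collect();
    // ...then one pass over identifiers, each a constant-time lookup,
    // instead of scanning every symbol name for every identifier.
    idents.iter().copied().filter(|i| index.contains(i)).collect()
}

fn main() {
    let defined = ["parse_config", "validate_user"];
    let idents = ["validate_user", "unknown_fn", "parse_config"];
    assert_eq!(resolve_refs(&defined, &idents), ["validate_user", "parse_config"]);
    println!("ok");
}
```

The asymptotic change (O(symbols²) to O(symbols)) is what turns a 34.6 s pass into an 8.2 s one on a large input.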
Architecture
Argyph is a Rust workspace of twelve focused crates with strict module ownership. The full architecture, including the Supervisor lifecycle, the three-tier indexing model, and per-crate responsibility boundaries, is documented in ARCHITECTURE.md.
Project status
Release candidate. v1.0.0-rc.1 is published; v1.0.0 will cut once the system-level benchmark targets in docs/SPEC.md § 6 are verified on the reference hardware. The build plan and milestones are in docs/BUILD_PLAN.md; the current milestone is tracked at the top of ROADMAP.md.
Contributing
Argyph is built with substantial AI assistance, but human-architected and human-reviewed. Contribution guide and the (strict) AI agent rules are in CONTRIBUTING.md. Commit conventions, including the project's attribution policy, are in docs/COMMIT_CONVENTIONS.md.
Author
Built by Ezzy1630. See AUTHORS.md.
License
Dual-licensed under either of:
- MIT License (LICENSE-MIT)
- Apache License 2.0 (LICENSE-APACHE)
at your option.