docs.rs failed to build basemind-0.5.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.

basemind

The context and communication layer for coding agents. basemind is the shared brain a team of AI coding agents works from. It turns any repository into an always-current understanding of the code, documents, history, and memory an agent needs — and gives multiple agents a way to talk to each other and coordinate while they work.

The payoff is twofold: each agent reasons from structure and search instead of burning its limited context window on grep and file reads, and a team of agents stays in sync instead of stepping on each other's work. One server does both.

Code map & search across 300+ languages · document processing for 90+ file formats · semantic + full-text search · git history & blame · shared agent memory · on-demand web crawl · agent-to-agent comms

Capabilities · Architecture · Tools · Quickstart · Performance · Install

Capabilities

Four pillars give an agent context; a fifth lets agents coordinate.

Code — Tree-sitter outlines, symbol search, reference + caller + implementation graphs, call chains, git history per symbol, blame at symbol-level resolution.

Documents — Ingest + semantic search over PDFs, Office (Word/Excel/iWork), HTML, email, archives. Built-in OCR, layout detection, keyword + NER extraction, cross-encoder reranking. All ONNX bundled — no system install needed.

Memory — Per-repo scoped key-value + semantic vector storage, split into a shared group tier and a per-agent individual tier. Clones of the same git origin automatically share memory; unrelated repos isolated.

Web — On-demand HTTP scrape + follow-link crawl. Pages chunk, embed, and land in the documents store under scope web:<host> for unified search.

Coordination — A user-global broker daemon hosts scoped chat rooms and a per-agent inbox, so multiple agents working the same code (across harnesses and repos) leave each other status, ask questions, and avoid collisions. See Agent coordination.

Context economy

basemind tools return paths, line numbers, and signatures — not file bodies — so a structural answer costs a fraction of the tokens of reading source. The plugin ships this as the agent's default operating discipline (carried in the MCP server instructions, the basemind skill, and the SessionStart hook):

outline a file before opening it — then read only the span you need.
search_symbols instead of grep/rg for a definition.
find_references / find_callers instead of grepping call sites.
workspace_grep instead of shelling out to ripgrep for regex over content.
rescan after edits instead of reconnecting the server.
Don't re-read a file basemind already mapped.

The plugin also ships a PreToolUse guard hook that reaches the agent at the moment it reaches for search: by default (BASEMIND_GUARD=nudge) it points Grep/Glob calls at the matching basemind tool, once per session. Set BASEMIND_GUARD=redirect to enforce it (the call is blocked with a pointer to the basemind tool) or BASEMIND_GUARD=off to disable.

The live statusline surfaces the payoff: estimated tokens saved vs a grep + read baseline.

Architecture

One basemind scan walks the working tree in parallel (rayon), extracts structure with tree-sitter and documents with the kreuzberg pipeline, and writes everything into a content-addressed store under .basemind/: msgpack blobs (deduped by content hash), a Fjall LSM inverted index for symbol/reference/caller lookups, and a LanceDB vector store for document + memory search. basemind serve preloads the outlines into RAM and answers MCP/CLI tool calls straight from the index — no disk scan per query.

flowchart LR
  AGENT(["Coding agent"])
  subgraph repo["Your repository"]
    SRC["Source<br/>300+ languages"]
    DOC["Documents<br/>90+ formats"]
    GIT["Git history"]
  end
  subgraph scan["basemind scan · rayon-parallel"]
    EXT["tree-sitter extract<br/>L1 outline · L2 calls · L3 hash"]
    KZ["kreuzberg<br/>OCR · NER · chunk · embed"]
  end
  subgraph store[".basemind/ · content-addressed"]
    BLOB["msgpack blob store"]
    IDX["Fjall LSM<br/>inverted index"]
    VEC["LanceDB vectors<br/>documents · memory"]
  end
  subgraph serve["basemind serve · MCP + CLI"]
    T1["code + git tools"]
    T2["document + memory search"]
    T3["web crawl"]
  end
  SRC --> EXT
  DOC --> KZ
  EXT --> BLOB
  EXT --> IDX
  KZ --> VEC
  T3 --> VEC
  BLOB --> T1
  IDX --> T1
  GIT --> T1
  VEC --> T2
  AGENT <-->|tool calls| serve

Agent coordination

basemind is also a communication substrate for multiple agents working the same code at once — across harnesses and across repos in a shared workspace. A singleton, user-global broker daemon (its own Fjall store over a Unix socket, independent of any repo's exclusive index lock) hosts scoped rooms: an agent auto-joins every room whose scope covers it — the repo's git remote, a path prefix, or global. Messages are two-tier — a front-matter envelope (subject · from · id) that room_history / inbox_read scan cheaply, and a body fetched on demand by message_get — so scanning a busy room costs almost nothing. The broker excludes an agent's own posts from its inbox, so notifications never echo back.

The plugin delivers comms three ways, so an agent notices traffic without being asked: the MCP instructions + basemind-comms skill tell it the tools exist and to post status as it works; SessionStart / UserPromptSubmit hooks inject unread front-matter on boot and per turn; and a background monitor (~15 s) surfaces new messages while the agent is working or idle.

flowchart TB
  subgraph agents["Coding agents · multiple harnesses · multiple repos"]
    A["Agent A<br/>Claude Code · repo X"]
    B["Agent B<br/>Cursor · repo Y · same workspace"]
  end
  subgraph delivery["Per-session delivery (plugin)"]
    INSTR["MCP instructions +<br/>basemind-comms skill"]
    HOOKS["SessionStart + UserPromptSubmit hooks"]
    MON["Background monitor · ~15s"]
  end
  subgraph daemon["Broker daemon · singleton · user-global"]
    BR["Broker<br/>scope auto-join · fan-out · self-exclude"]
    REG["Room registry<br/>scope: remote · path · global"]
    CS["CommsStore · Fjall over UDS<br/>rooms · front-matter · bodies · cursors"]
  end
  A --> delivery
  B --> delivery
  delivery -->|room_post · room_history<br/>inbox_read · message_get| BR
  BR --> REG
  BR --> CS
  CS -. unread .-> HOOKS
  CS -. new messages .-> MON
  HOOKS -. inject .-> A
  MON -. notify .-> A

Feature table

Pillar	What it does	MCP tools	Backend
Code intelligence	Outlines, symbol search (substring), call-site lookup (substring), call graphs, impl lookup (substring), dependents, in-tree regex	`outline`, `search_symbols`, `workspace_grep`, `find_references`, `find_callers`, `call_graph`, `find_implementations`, `dependents`, `list_files`, `status`, `repo_info`	tree-sitter × 300+ langs · Fjall LSM index · content-addressed blob store
Git intelligence	Symbol-level history, blame, churn, recent changes, structural diffs across revs	`symbol_history`, `blame_file`, `blame_symbol`, `hot_files`, `recent_changes`, `commits_touching`, `find_commits_by_path`, `diff_outline`, `diff_file`, `working_tree_status`	gix + sha-keyed disk cache
Document RAG	Ingest + semantic search over 90+ file formats — PDFs, Office (Excel/Word/HWP/iWork), HTML, XML, email, archives, images. Adds OCR (Tesseract + PaddleOCR), cross-encoder reranker, keyword extraction (YAKE/RAKE), NER (gline-rs ONNX + LLM), extractive + abstractive summarization, layout detection, page auto-rotate, redaction, language detection. All ONNX models bundled — no system install needed.	`search_documents`	kreuzberg + LanceDB
Shared memory	Per-repo scoped key-value + semantic memory. Clones of the same git origin URL automatically share memory; unrelated repos isolated.	`memory_put`, `memory_get`, `memory_list`, `memory_search`, `memory_delete`	LanceDB + Fjall, scope-keyed
Web crawl	On-demand HTTP scrape + link-following crawl. Crawled pages route through the documents pipeline (chunk → embed → LanceDB) under scope `web:<host>`.	`web_scrape`, `web_crawl`, `web_map`	kreuzcrawl (native HTTP, no chromium)
Agent comms	Multi-agent messaging via a user-global broker daemon: scope-auto-joined rooms (git remote / path / global), per-agent inbox, two-tier messages (front-matter scan + lazy body fetch), self-posts excluded from inbox. Delivered across harnesses via MCP instructions + the `basemind-comms` skill, SessionStart / per-turn hooks, and a ~15 s background monitor.	`agent_register`, `agent_list`, `room_create`, `room_list`, `room_join`, `room_leave`, `room_post`, `room_history`, `message_get`, `inbox_read`	Fjall broker over a Unix socket
Admin	Live rescan, telemetry dashboard, cache introspection + GC + cleanup	`rescan`, `telemetry_summary`, `cache_stats`, `cache_gc`, `cache_clear`	—

Quickstart

Choose the path that fits your workflow. Both paths use the same on-disk index at .basemind/.

Path A: MCP plugin (Claude Code and other harnesses)

MCP (Model Context Protocol) runs the basemind server in-process and exposes all tools as in-session function calls. Zero config — install and start using tools immediately.

Claude Code

Run these two commands in order:

/plugin marketplace add Goldziher/basemind   # 1. register the marketplace
/plugin install basemind@basemind            # 2. install the plugin

Restart the session after installing. The basemind binary installs automatically on first use (via npx, uvx, or direct download with verified checksums) — no manual cargo install needed. Prebuilt binaries ship with the full feature set enabled (96 document formats, OCR, embeddings, reranker, semantic search, web crawl, shared memory), so first use downloads ML models over the network; binaries are larger as a result.

To enable the optional live statusline (showing context % and per-capability metrics), run /bm-statusline once. This is a one-time opt-in because Claude Code plugins cannot set the main statusline — it is a platform limitation. See the Statusline section for details.

Any MCP client (Cursor, Codex, Gemini, OpenCode, Continue, Cline, etc.)

cargo install basemind --features full --locked

Add to your MCP config:

{
  "mcpServers": {
    "basemind": {
      "command": "basemind",
      "args": ["serve"]
    }
  }
}

Each harness has setup instructions in the Installation section.

Path B: CLI + skill (scriptable, headless, CI)

Use the standalone basemind CLI binary and the basemind-cli skill for query-driven exploration. Same index, same tools, different interface — faster for scripting and batch operations.

# Install the binary
npm install -g basemind    # or: pip install basemind, cargo install basemind, brew install Goldziher/tap/basemind
basemind scan               # index the working tree once

Then use the CLI:

basemind query outline path/file.rs           # inspect file structure
basemind query symbol "parseQuery"            # find symbol by name
basemind query references "processFile"       # find all call sites
basemind git blame-file src/main.rs           # show per-line blame
basemind cache stats                          # cache stats
basemind cache gc                             # reclaim orphaned blobs
basemind watch --no-serve                     # live re-index on file change (no MCP server)

Add the basemind-cli skill to route CLI commands efficiently. See the CLI command reference below for the full command surface.

MCP vs CLI

Both paths share the same .basemind/ index and are safe to run alongside each other (the CLI opens the index read-only; basemind serve watches and incrementally updates in the background).

MCP: Wired as in-session tool calls. Zero config. Best for interactive agent workflows.
CLI: Scriptable, headless, CI-friendly. Best for batch queries, integration into non-MCP harnesses, and when you want to control the tool routing explicitly.

The choice is not binary — use MCP for interactive sessions and CLI for scripting in the same repo.

Statusline

To enable the live statusline in Claude Code (MCP only), run /bm-statusline once. This is a one-time opt-in because Claude Code plugins cannot set the main statusline — it is a platform limitation, not a basemind choice:

The plugin manifest has no statusLine field.
A plugin-shipped settings.json honors only agent and subagentStatusLine; any statusLine key is ignored.
Hooks communicate via stdout/stderr only — they cannot write to ~/.claude/settings.json.

/bm-statusline works because Claude (the agent) performs the settings edit on your behalf, writing an absolute path into ~/.claude/settings.json. After that it persists across sessions.

It renders two lines — a context line (model · output-style · dir · branch · context%) and the basemind line below it:

Opus · basemind · ⎇ main · 12% ctx
◆ basemind  ●  1,247 files · 23m ago  │  312 calls · 180 srch · 44 git · 12 docs  │  1.4M saved  │  ✉ 3 @reviewer

The state dot is green (serve active / scan < 1 h), amber (idle or scan 1–24 h), or red (no serve and stale index). The second segment breaks activity down per capability — searches, git, docs, memory, web — showing only the buckets with calls today; then estimated tokens saved. When the agent-comms broker is running, a final ✉ segment shows your unread message count (bright when non-zero) and, in the full tier, your agent identity. Layout adapts to terminal width ($COLUMNS): the per-capability breakdown drops on narrow terminals. Override with BASEMIND_STATUSLINE=full|compact|minimal (default auto) or hide the context line with BASEMIND_STATUSLINE_CONTEXT=0.

Why basemind, specifically

vs grep / ripgrep

What ripgrep does well: blazing-fast line matching. What it misses:

Grep returns 50+ hits in docs, tests, comments, variable names — agent wastes context filtering noise.
No scope awareness: parseQuery() and parseQuery string both match; semantic signals lost.
Every query re-scans the disk; no pre-computed structures to leverage.

basemind: semantic-quality answers at grep speed via tree-sitter + indexed call sites.

vs vector-only RAG (LangChain / LlamaIndex DIY stacks)

What vector RAG does well: fuzzy document semantic search. What it misses:

Pure embeddings lose exact structure — which function calls which, which class implements which interface.
No line/column resolution — agent can't map vector hits back to code symbols.
No git history integration — "what changed recently?" and "who wrote this?" require separate systems.

basemind: code structure + git history + vector memory + document search all in one, unified scope.

vs context7 / openai-codex / Aider's repo-map

What these do well: generate code-map summaries. What they miss:

Static snapshots — stale after the first edit.
No semantic indexing — every lookup re-parses or re-scans.
Human-focused output (markdown) instead of agent-facing structure (JSON tools).

basemind: live-updated index with sub-millisecond MCP tools, built for agents not humans.

vs GitHub native search

What GitHub does well: repository-wide fuzzy text search. What it misses:

Cloud-only — your code leaves the machine, latency is network-bound.
No local-editor integration — agent can't query in-progress edits before commit.
No cross-language polyglot support — each language's search tuned separately.

basemind: local-only, always-fresh index of your working tree, 300+ languages in one sweep.

Performance

Measured on Apple Silicon, release build, --features full, default eager_l2 = true. Cold filesystem cache adds ~50% to first scan; numbers below are warm steady-state.

Scan throughput

Repo	Files	Language mix	Time
tokio	859	Rust	0.2 s
react	7 061	TS / JSX	2.2 s
django	7 061	Python	2.5 s
requests	2 195	Python	0.7 s
gin	1 217	Go	1.0 s
ripgrep	12 851	Rust	4.0 s
ripgrep-shallow	12 851	Rust	0.16 s
TypeScript compiler	81 324	TS / JS / JSON	~22 s

The TypeScript compiler is the worst case — 81k files scanned in 22 seconds. Most real repos sit between tokio and ripgrep. Re-scans skip unchanged content hashes, so warm rescans on edited working trees are typically dominated by the changed-set size, not repo size.

Per-tool MCP latency

Against the 81k-file TypeScript index:

Latency	Tools
< 1 ms	`outline`, `list_files`, `find_references`, `find_callers`, `find_implementations`, `hot_files`, `repo_info`
3–6 ms	`search_symbols`, `call_graph`
4–10 ms	`recent_changes`, `commits_touching`, `find_commits_by_path`, `symbol_history`, `diff_outline`, `diff_file`
20–25 ms	`status`
30–40 ms	`blame_file`, `blame_symbol`
40–200 ms	`workspace_grep`
~200 ms	`search_documents`
350–600 ms	`working_tree_status`

basemind preloads L1 outlines into RAM on serve start, so code-map queries hit no disk. The Fjall LSM inverted index handles ref/caller/impl lookups without scanning blobs. Git tools track gix walk cost; Fjall-backed tools dominate only on enormous histories.

Configuration

Full config lives at schema/basemind-config-v1.schema.json. Minimal example:

# .basemind/basemind.toml
file_watch_glob = "**/*.{rs,ts,tsx,py,go}"
eager_l2 = true

[documents]
enabled = true

Per-query MCP overrides:

{
  "query": "what does kreuzberg do?",
  "reranker_enabled": true,
  "reranker_preset": "bge-reranker-base"
}

Environment variables map mechanically: --llm-api-key ↔ BASEMIND_LLM_API_KEY. Every MCP tool accepts per-query overrides that win over file/env/CLI layers.

CLI command reference

CLI commands mirror MCP tools, grouped by capability. Run with --json for machine-readable output.

Query commands (`basemind query`)

Command	Purpose
`outline <path> [--l2]`	Full per-file structure: symbols + line/col + signatures. `--l2` includes calls + docs.
`symbol <needle> [--kind]`	Substring symbol lookup. Optional kind filter (`function`, `class`, etc.).
`search <needle>`	Full-text regex search over indexed files.
`references <name>`	Substring call-site lookup: all identifiers matching name. Case-sensitive.
`callers <path> <name> [--kind]`	Callers of a specific definition (path + name + optional kind).
`implementations <trait>`	Substring implementation lookup: types implementing/inheriting from names matching trait.
`call-graph <name> [--direction --max-depth]`	BFS call graph (up or down).
`grep <pattern> [--language --path-contains]`	Regex search with optional language / path filter.
`list-files [--path-contains --language]`	Enumerate indexed paths. Optional filters.
`status`	Repository overview: file count, language breakdown, cache directory.
`repo-info`	Git info: current branch, HEAD, origin URL.
`dependents <module>`	Modules that import a given module.

Git commands (`basemind git`)

Command	Purpose
`working-tree-status`	`git status` summary with staged / unstaged classification.
`recent-changes [--limit]`	Recent commits with paths + summaries.
`commits-touching <path>`	Commits that modified a given path.
`find-commits-by-path <pattern>`	Path-filtered commit log.
`hot-files [--limit]`	Churn-ranked files (most frequently modified).
`diff-file <path> <old> <new>`	File diff across revisions.
`diff-outline <path> [--rev]`	Outline diff across revisions.
`blame-file <path>`	Per-line blame (author, commit, message).
`blame-symbol <path> <name>`	Per-symbol blame (when symbol last changed).
`symbol-history <path> <name>`	Cross-commit structural hash of symbol (when body changed).

Memory commands (`basemind memory`, requires `--features memory`)

Command	Purpose
`put <key> <value>`	Store a value (scoped to repo origin).
`get <key>`	Retrieve exact key.
`list [--prefix]`	List all keys or keys matching prefix.
`search <query>`	Vector similarity search over stored values.
`delete <key>`	Delete a key.
`search-documents <query>`	Semantic search over documents + memory (scoped to repo).

Cache commands (`basemind cache`)

Command	Purpose
`stats`	On-disk cache size + orphan accounting (blob store + index + git cache).
`gc`	Reclaim orphaned blobs (safe to run while serve is running).
`clear --component <blobs\|views\|lance\|git-cache\|telemetry\|all>`	Selective or full cache clear. Destructive to `views` and `all` — use CLI, not MCP.

Web commands (`basemind web`, requires `--features crawl`)

Command	Purpose
`scrape <url>`	Ingest a single page (chunk → embed → LanceDB).
`crawl <seed-url>`	Link-following crawl from a seed URL.
`map <url>`	Sitemap + link discovery (no bodies).

Comms commands (`basemind comms`)

Command	Purpose
`rooms`	List joined + joinable rooms (MCP `room_list`).
`join <room>` / `leave <room>`	Join / leave a room.
`room-create <room>`	Create a new room.
`post <room> <subject> [--body … --reply-to … --tag …]`	Post a message.
`history <room>`	Front-matter of recent messages (subject / from / id).
`inbox [--mark-read]`	Front-matter of your inbox (MCP `inbox_read`).
`read <id>`	Fetch one message body by id (MCP `message_get`).
`register --name <handle>` / `agents`	Record your handle / list active agents.
`status` / `start` / `stop`	Broker daemon: report status / ensure running / drain.

Other commands

Command	Purpose
`scan`	Full index scan.
`watch [--no-serve]`	Live re-index on file change. Run `--no-serve` for continuous background watch without the MCP server.
`serve [--no-watch]`	Start the MCP server. By default, watches and incrementally refreshes the index in the background. Run `--no-watch` to disable for very large repos or CI.
`init`	Initialize a `.basemind/` directory (optional — `scan` creates it).
`telemetry`	Show per-tool telemetry histogram + estimated tokens saved.

Installation

Channel	Command	Platforms	Features
Homebrew	`brew install Goldziher/tap/basemind`	macOS, Linux	documents + memory + crawl
npm	`npm install -g basemind`	any Node 14+ platform	documents + memory + crawl
pip	`pip install basemind`	any Python 3.8+ platform	documents + memory + crawl
cargo	`cargo install basemind --locked`	any Rust platform	base
cargo (full)	`cargo install basemind --features full --locked`	any Rust platform	documents + memory + crawl
GH releases	Download binary from releases	macOS · Linux · Windows	documents + memory + crawl

Harness	Install command
Claude Code	`/plugin marketplace add Goldziher/basemind` then `/plugin install basemind@basemind`
Cursor	See Cursor docs for plugin install flow; `basemind` manifest at `.cursor-plugin/plugin.json`
Codex CLI	`/plugins` then search for `basemind`
Codex App	Plugins panel → Coding category → basemind → `+`
Gemini CLI	`gemini extensions install https://github.com/Goldziher/basemind`
OpenCode	Add `{ "plugin": ["basemind-opencode@latest"] }` to `opencode.json`
Factory Droid	`droid plugin --help` (manifest at `.claude-plugin/marketplace.json`)
GitHub Copilot CLI	`copilot plugin --help` (same manifest)
Generic MCP	See "Any MCP client" section above

basemind 0.5.0