basemind 0.0.1

Code-map MCP server + scanner — content-addressed, Fjall-backed inverted index over tree-sitter outlines

basemind

Give your AI coding agent a brain for your repo.

basemind is a code-map MCP server: it indexes your codebase into a queryable map so AI coding agents — Claude Code, Cursor, Continue, anything that speaks MCP — get instant semantic answers about your code. Where is this defined? Who calls it? When did it change? What's churning?

Queries in single-digit milliseconds. 300+ languages out of the box. Local-only. Built in Rust.



Why your agent needs this

Today, agents read code by grepping blind. Ask Claude "who calls parseQuery?" and it ripgreps the string — you get hits in docs, tests, comments, and 14 unrelated files. The agent burns context filtering noise, then guesses.

LSPs are the semantic answer, but they're single-language, slow to start, and useless across a polyglot monorepo.

basemind is the missing layer. One index, every language, semantic-quality answers at grep speed — exposed to the agent over MCP as concrete tools (find_callers, find_references, outline, symbol_history, blame_symbol, hot_files, …) instead of "go grep again."


30-second setup

Install (pick one):

brew install Goldziher/tap/basemind     # macOS, Linux
npm install -g basemind                 # any Node 14+ platform
pip install basemind                    # any Python 3.8+ platform
cargo install basemind --locked         # build from source

Index your repo:

cd /path/to/your/repo
basemind scan

Wire it into Claude Code — drop this into ~/.claude.json (or your project's .mcp.json):

{
  "mcpServers": {
    "basemind": {
      "command": "basemind",
      "args": ["serve"],
      "cwd": "/abs/path/to/your/repo"
    }
  }
}

Done. Restart Claude Code, and your agent has the eight code-map tools and nine git-aware tools listed below at its fingertips.

Same JSON shape works for Cursor, Continue, Cline, and any other MCP client.
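
If you would rather script this step than edit the file by hand, the merge can be sketched in a few lines of Python. This is a minimal sketch and not part of basemind: the helper name `add_basemind_server` is hypothetical, and the only thing taken from this README is the JSON shape shown above.

```python
import json


def add_basemind_server(config: dict, repo_path: str) -> dict:
    """Return a copy of an MCP client config with a basemind entry added.

    Hypothetical helper: only the JSON shape comes from the README.
    """
    merged = dict(config)
    # Copy the server table so the caller's dict is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["basemind"] = {
        "command": "basemind",
        "args": ["serve"],
        "cwd": repo_path,
    }
    merged["mcpServers"] = servers
    return merged


# An existing config that already registers some other (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
updated = add_basemind_server(existing, "/abs/path/to/your/repo")
print(json.dumps(updated, indent=2))
```

Existing server entries are preserved; only the `basemind` key is added or overwritten.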


What your agent gets

Code-map tools

| Tool | What the agent can finally do |
| --- | --- |
| outline | "Give me this file's structure": symbols, line/col, signatures, imports. One call replaces five Reads. |
| search_symbols | "Find anything named useAuth": substring match across every indexed symbol, kind-filterable. |
| find_references | "Where is parseQuery called?" Indexed call-site lookup. No regex noise. |
| find_callers | "Who calls User.save()?" Resolves the definition first, then scans. |
| dependents | "What imports this module?" Reverse import lookup. |
| list_files | "What files are in src/auth/?" Indexed path + language filters. |
| status | "What languages does this repo use?" File count + language breakdown. |
| repo_info | Branch, HEAD, workdir at a glance. |
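
To make "substring match, kind-filterable" concrete, here is a toy version of that lookup in Python. It is a sketch under stated assumptions, not basemind's implementation: the `Symbol` record, the sample data, and the case-sensitive matching are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Symbol:
    # Hypothetical record shape; the real outline format is not shown here.
    name: str
    kind: str
    path: str
    line: int


def search_symbols(symbols: List[Symbol], needle: str,
                   kind: Optional[str] = None, limit: int = 100) -> List[Symbol]:
    """Substring match on the symbol name, optionally filtered by kind."""
    hits = [
        s for s in symbols
        if needle in s.name and (kind is None or s.kind == kind)
    ]
    return hits[:limit]


# Made-up sample data.
SYMBOLS = [
    Symbol("useAuth", "function", "src/hooks/auth.ts", 12),
    Symbol("useAuthToken", "function", "src/hooks/token.ts", 4),
    Symbol("AuthProvider", "class", "src/auth/provider.tsx", 20),
]

functions = search_symbols(SYMBOLS, "useAuth", kind="function")
print([s.name for s in functions])
```

Because the match is a substring match, `useAuth` also finds `useAuthToken`; the kind filter is what narrows the result.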

Git-aware tools

| Tool | What the agent can finally do |
| --- | --- |
| symbol_history | "When did validateToken actually change?" Tree-sitter × git, comment/format-stable diffs. |
| blame_file / blame_symbol | "Who wrote this and why?" Line-range or symbol-scoped blame. |
| hot_files | "What's been churning?" Top-K most-changed files in the last N commits. |
| recent_changes | "What changed recently on this branch?" |
| commits_touching | "Show me every commit that touched auth.rs." |
| diff_outline | "What symbols differ between main and HEAD?" Structural diff. |
| diff_file | "Give me the unified diff for auth.rs across these revs." |
| working_tree_status | "What's staged / unstaged / untracked right now?" |

Every tool returns JSON. Responses are capped (limit, default 100, max 1000) so the agent's context doesn't explode.
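
For a feel of what a capped call looks like on the wire, the sketch below builds an MCP `tools/call` request and clamps `limit` to the documented default and maximum. Treat it as an assumption-laden sketch: the argument key `name` is a guess at the tool's schema, the floor of 1 is invented, and whether the server clamps or rejects an oversized `limit` is not stated here.

```python
import json
from typing import Optional

DEFAULT_LIMIT = 100   # documented default
MAX_LIMIT = 1000      # documented maximum


def clamp_limit(requested: Optional[int] = None) -> int:
    """Apply the documented default and cap (the floor of 1 is an assumption)."""
    if requested is None:
        return DEFAULT_LIMIT
    return max(1, min(int(requested), MAX_LIMIT))


def tools_call_request(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tools_call_request(
    1,
    "find_references",
    # "name" is a hypothetical argument key; "limit" is named in the README.
    {"name": "parseQuery", "limit": clamp_limit(5000)},
)
print(json.dumps(request))
```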


Performance

Measured on a 39,270-file TypeScript repo (Apple Silicon, release build):

| What | Time |
| --- | --- |
| Cold scan (full index) | 12.4 s |
| Cached scan (no changes) | 1.6 s |
| MCP server startup | 3.1 s, 77 MB RSS |
| status query | 1.2 ms |
| outline (1571 symbols) | 1.9 ms |
| search_symbols | 1–3 ms |
| find_references("spawn") (tokio) | < 5 ms |

basemind preloads L1 outlines into RAM on serve start, so cross-file queries return in single-digit milliseconds, as the table shows. The Fjall LSM inverted index handles ref/caller lookups without scanning blobs.
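
To see why an inverted index avoids rescanning blobs, here is a toy version in Python. It is only a sketch of the general technique: a plain dict stands in for the Fjall keyspaces, and the outline shape and sample data are invented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

CallSite = Tuple[str, int]  # (path, line)


def build_reference_index(outlines: Dict[str, List[Tuple[str, int]]]) -> Dict[str, List[CallSite]]:
    """Invert {path: [(callee, line)]} into {callee: [(path, line)]}."""
    index = defaultdict(list)
    for path in sorted(outlines):
        for callee, line in outlines[path]:
            index[callee].append((path, line))
    return dict(index)


def find_references(index: Dict[str, List[CallSite]], name: str,
                    limit: int = 100) -> List[CallSite]:
    """One dictionary lookup; no per-file data is touched at query time."""
    return index.get(name, [])[:limit]


# Made-up per-file call lists, as a scanner might extract them.
OUTLINES = {
    "src/server.rs": [("spawn", 10), ("bind", 14)],
    "src/worker.rs": [("spawn", 3)],
}

index = build_reference_index(OUTLINES)
print(find_references(index, "spawn"))
```

The work of walking every file happens once, at build time; each query afterwards is a keyed lookup.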


Languages

300+ tree-sitter grammars ship via tree-sitter-language-pack. basemind dynamically loads them on first use and caches them locally.

First-class outlines — full signatures, kinds, decorators, calls, imports, docstrings — ship for:

Rust · Python · TypeScript · TSX · JavaScript · Go

Best-effort outlines via the TSLP tags.scm fallback — covers ~100 grammars including Kotlin, C#, Swift, C++, Scala, Solidity, Lua, Ruby, PHP, Java, …

Languages without an upstream tags.scm (JSON, YAML, TOML) still parse and appear in list_files; they just don't expose symbols.


Why basemind, specifically

  • Built for agents, not humans. Every tool exists because an agent needs it, not because it makes a cute terminal demo.
  • Semantic quality, grep speed. Tree-sitter parses → content-addressed blobs → Fjall LSM inverted index → MCP responses in a few milliseconds.
  • Polyglot by default. One index, every language. No LSP-per-language zoo. No "we don't support that yet."
  • Local-only. No SaaS. No telemetry. No cloud round-trip. Your code never leaves the machine.
  • Deterministic. Content-addressed blobs (blake3), stable hashes, reproducible across machines.
  • Pure Rust. One static binary. No Python runtime, no Node runtime, no JVM. basemind serve adds < 80 MB to your agent's stack.

CLI

basemind is also a CLI — useful for piping into shell tools, CI checks, or just inspecting a repo without spinning up an MCP server.

basemind init                              # write .basemind/basemind.toml with defaults
basemind scan                              # index the working tree
basemind scan --staged                     # index what's in git's staging area
basemind scan --rev <REV>                  # index a commit / branch / sha
basemind watch                             # long-running watcher; index on file change
basemind serve [--view <name>]             # MCP stdio server for agents
basemind query outline <path> [--l2]       # symbols, imports (+ docs/calls with --l2)
basemind query symbol <needle> [--kind K]  # substring search across symbols
basemind query dependents <module>         # reverse-lookup via imports
basemind hook install                      # install pre-commit hook (--staged scan)
basemind lang {list, install, clean}       # manage downloaded tree-sitter grammars
basemind cache clear                       # drop .basemind/git-cache/

Global flags: -q/--quiet, -v/--verbose, --no-color (NO_COLOR honored).
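
If you drive the CLI from a script, building the argument list explicitly keeps quoting problems away. The wrapper below is a hedged sketch: the function names are hypothetical, the `function` kind value is a guess at what `--kind` accepts, and since the output format is not described here the text is returned unparsed.

```python
import shutil
import subprocess
from typing import List, Optional


def query_symbol_argv(needle: str, kind: Optional[str] = None) -> List[str]:
    """Build the argv for `basemind query symbol <needle> [--kind K]`."""
    argv = ["basemind", "query", "symbol", needle]
    if kind is not None:
        argv += ["--kind", kind]
    return argv


def run_query_symbol(needle: str, kind: Optional[str] = None) -> Optional[str]:
    """Run the query and return raw stdout, or None if basemind is not on PATH."""
    if shutil.which("basemind") is None:
        return None
    result = subprocess.run(
        query_symbol_argv(needle, kind),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


argv = query_symbol_argv("useAuth", kind="function")
print(argv)
```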


Architecture

A short tour. See docs/ARCHITECTURE.md for the long version.

  • Scanner (src/scanner.rs) — rayon-parallel walker over the gitignore-aware file set. Extracts L1 (symbols + imports), L2 (calls + docs), L3 (structural hashes) per file.
  • Content-addressed blobs (src/store.rs) — msgpack at .basemind/blobs/<blake3>.{l1,l2,l3}.msgpack. Two files with identical content share the same blob. Re-scan skips unchanged hashes.
  • Inverted index (src/index/) — pure-Rust Fjall LSM keyspace at .basemind/views/<view>/index.fjall/. Six keyspaces drive symbol search, reference lookup, dependents.
  • MCP surface (src/mcp/) — stdio JSON-RPC via rmcp. Tool descriptions are the routing surface for agents; semantics (substring vs prefix, scope-aware vs name-only, capped) are stated honestly.
  • Git layer (src/git.rs, src/git_cache.rs) — gix-backed blame, log, diff, status. Sha-keyed disk cache (.basemind/git-cache/) makes warm queries free.
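
The content-addressed blob idea can be sketched in a few lines. Everything here is a stand-in chosen so the example runs on the Python standard library alone: `hashlib.blake2b` replaces blake3, JSON replaces msgpack, an in-memory dict replaces .basemind/blobs/, and `toy_outline` is a made-up extractor.

```python
import hashlib
import json
from typing import Callable, Dict


class BlobStore:
    """Toy content-addressed store: the key is a hash of the file content."""

    def __init__(self) -> None:
        self.blobs: Dict[str, str] = {}
        self.extractions = 0  # how many times we actually parsed something

    def put(self, content: bytes, extract: Callable[[bytes], dict]) -> str:
        # blake2b is a stand-in; the README says basemind uses blake3.
        digest = hashlib.blake2b(content, digest_size=32).hexdigest()
        key = f"{digest}.l1.json"
        if key not in self.blobs:  # unchanged content is skipped
            self.blobs[key] = json.dumps(extract(content))
            self.extractions += 1
        return key


def toy_outline(content: bytes) -> dict:
    """Made-up extractor: collect names from lines starting with 'fn '."""
    names = []
    for line in content.decode().splitlines():
        if line.startswith("fn "):
            names.append(line[3:].split("(")[0])
    return {"symbols": names}


store = BlobStore()
first = store.put(b"fn main() {}\n", toy_outline)
again = store.put(b"fn main() {}\n", toy_outline)   # identical content
other = store.put(b"fn helper() {}\n", toy_outline)
print(first == again, store.extractions)
```

Two files with identical bytes map to the same key, which is why a re-scan of unchanged content does no parsing work.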

Views

A view is a code map for a snapshot of the repo. Each view has its own index under .basemind/views/<view>/; blobs are shared in .basemind/blobs/.

  • working (default) — the on-disk working tree
  • staged — git staging area; what's about to be committed
  • rev-<sha7> — whatever you scanned with basemind scan --rev <REV>

They coexist: scanning one doesn't clobber the others. The pre-commit hook installed by basemind hook install indexes the staged view, so the index reflects exactly what's being committed.
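
The on-disk layout described above can be written down as a couple of path helpers. This is a sketch with hypothetical function names, and it assumes that `<sha7>` means the first seven hex characters of the resolved commit sha.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath(".basemind")


def view_name_for_rev(sha: str) -> str:
    """Name a rev view; assumes <sha7> is the first 7 characters of the sha."""
    return f"rev-{sha[:7]}"


def index_path(view: str = "working") -> PurePosixPath:
    """Each view gets its own index directory."""
    return BASE / "views" / view / "index.fjall"


def blob_dir() -> PurePosixPath:
    """Blobs are shared across all views."""
    return BASE / "blobs"


rev_view = view_name_for_rev("0123456789abcdef0123456789abcdef01234567")
for view in ("working", "staged", rev_view):
    print(index_path(view))
print(blob_dir())
```

Separate index directories are what let views coexist, while the single blob directory means content shared between views is stored once.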

Live refresh

Run basemind watch in one terminal and basemind serve in another: the server watches the index, rebuilds its in-RAM map off-thread, and atomically swaps the new map in. Queries reflect filesystem changes within ~150 ms with no serve restart.
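
The rebuild-then-swap pattern looks roughly like this. It is a Python sketch of the general idea only: basemind itself is written in Rust, and the `LiveMap` class and its methods are invented for illustration.

```python
import threading
from typing import Callable, Dict, List

CodeMap = Dict[str, List[str]]  # path -> symbol names (made-up shape)


class LiveMap:
    """Readers always see a complete map; rebuilds happen off the query path."""

    def __init__(self, initial: CodeMap) -> None:
        self._current = initial
        self._lock = threading.Lock()

    def snapshot(self) -> CodeMap:
        # A reader grabs the current reference; it is never half-built.
        return self._current

    def refresh(self, build: Callable[[], CodeMap]) -> threading.Thread:
        def worker() -> None:
            fresh = build()          # slow part, done on a background thread
            with self._lock:         # serialize concurrent refreshes
                self._current = fresh  # the swap is a single reference assignment

        thread = threading.Thread(target=worker)
        thread.start()
        return thread


live = LiveMap({"a.rs": ["main"]})
before = live.snapshot()
live.refresh(lambda: {"a.rs": ["main", "helper"]}).join()
after = live.snapshot()
print(before, after)
```

A query that took its snapshot before the swap keeps using the old map, so no reader ever observes a partially rebuilt state.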


Hardening

basemind ships with a real-OSS hardening harness — 8 upstream repos (ripgrep, tokio, microsoft/TypeScript, facebook/react, django, requests, gin, plus a shallow ripgrep variant) cloned, scanned, and MCP-swept on every release. Canary assertions catch regressions before they ship:

./scripts/harden.sh    # ~10 minutes; produces /tmp/basemind-harden/results.ndjson

The harness is #[ignore]-gated, so a normal cargo test skips it. CI invokes it nightly and on dispatch.


Development

git clone https://github.com/Goldziher/basemind && cd basemind
task setup     # cargo fetch + prek install
task check     # lint + test
task build     # release binary

Pre-commit hooks via prek cover Rust (cargo fmt/clippy/sort/machete/deny/rustdoc-lint), markdown, shell, JSON/YAML/TOML, file-safety basics, and commit-message linting via gitfluff.

Contributing guidelines: see CONTRIBUTING.md.


License

MIT.