seekr-code
A semantic code search engine, smarter than grep.
Supports text regex + semantic vector + AST pattern search — 100% local, no data leaves your machine.
Features
- 🔍 Text Search — High-performance regex matching across code
- 🧠 Semantic Search — Local ONNX-based embedding with HuggingFace WordPiece tokenizer + HNSW ANN index, find code by meaning
- 🌳 AST Pattern Search — Match function signatures, structs, classes via Tree-sitter (e.g., `fn(*) -> Result`)
- ⚡ Hybrid Mode — Combine all three via 3-way Reciprocal Rank Fusion (RRF) for best results
- 📡 MCP Server — Model Context Protocol support for AI editor integration
- 🌐 HTTP API — REST API for integration with other tools
- 🔄 Incremental Indexing — Only re-process changed files
- 👁️ Watch Daemon — Real-time file monitoring with automatic incremental re-indexing
- 🗂️ 15 Languages — Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Ruby, Bash, HTML, CSS, JSON, TOML, YAML
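
Hybrid Mode relies on Reciprocal Rank Fusion, which merges ranked lists by summing `1 / (k + rank)` for every list a result appears in. The following is a minimal sketch of that formula, not seekr-code's actual implementation: the function name `rrf_fuse`, the string result IDs, and the constant `k = 60` (a value commonly used with RRF) are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of result IDs with Reciprocal Rank Fusion.
/// Each list is ordered best-first; `k` dampens the influence of top ranks.
/// (Hypothetical helper; names and types are illustrative only.)
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit of a list contributes 1 / (k + 1).
            let rank = (i as f64) + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by ID so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf_fuse(&[text_hits, semantic_hits, ast_hits], 60.0)` with three best-first lists returns the union of all IDs, ordered so that results ranked highly by several searchers come first. A result missing from a list simply receives no contribution from it.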
Installation
From crates.io
From source
After installation, the seekr-code binary will be available in your $PATH.
Requirements
- Rust 1.85.0 or later
- A C/C++ compiler (for building tree-sitter grammars)
Quick Start
1. Build an index
# Index the current project
# Index a specific project path
# Force a full rebuild (ignore incremental state)
2. Search code
# Hybrid search (default — combines text + semantic + AST)
# Text regex search
# Semantic search (search by meaning)
# AST pattern search
3. Check index status
4. JSON output
All commands support --json for machine-readable output:
Server Mode
HTTP API
# Start the HTTP API server (default: 127.0.0.1:7720)
# Custom host and port
# Start with watch daemon — auto re-index on file changes
Endpoints:
| Method | Path | Description |
|---|---|---|
| POST | /search | Search code |
| POST | /index | Trigger index build |
| GET | /status | Query index status |
| GET | /health | Health check |
Example:
MCP Server (AI Editor Integration)
# Start as MCP server over stdio
MCP Tools:
- `seekr_search` — Search code (text, semantic, AST, hybrid modes)
- `seekr_index` — Build/rebuild the search index
- `seekr_status` — Get index status
Example MCP configuration (e.g., for Claude Desktop, CodeBuddy, etc.):
AST Pattern Syntax
[async] [pub] fn [name]([param_types, ...]) [-> return_type]
class ClassName
struct StructName
enum EnumName
trait TraitName
Examples:
| Pattern | Matches |
|---|---|
| `fn(string) -> number` | Functions taking a string, returning number |
| `fn(*) -> Result` | Any function returning Result |
| `async fn(*)` | Any async function |
| `fn authenticate(*)` | Functions named "authenticate" |
| `struct *Config` | Structs ending with "Config" |
| `class *Service` | Classes ending with "Service" |
| `enum *Error` | Enums ending with "Error" |
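
In name patterns such as `struct *Config`, the `*` acts as a wildcard. The sketch below shows one way such matching could work, assuming `*` matches any (possibly empty) run of characters and that comparison is case-sensitive. `wildcard_match` is a hypothetical helper for illustration; the matcher seekr-code actually uses may behave differently.

```rust
/// Match `name` against `pattern`, where `*` stands for any run of characters
/// (including none). Hypothetical helper, case-sensitive by assumption.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Pattern index of the last `*` seen, and the name index it was tried at.
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try matching it against nothing.
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Mismatch: backtrack and let the last `*` absorb one more character.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // The name is consumed; only trailing `*` may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

Under these assumptions, `wildcard_match("*Config", "ServerConfig")` holds while `wildcard_match("*Config", "ConfigLoader")` does not, which mirrors the "ending with" rows in the table.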
Configuration
Configuration file: ~/.seekr/config.toml
# Index storage directory
= "~/.seekr/indexes"
# ONNX model directory
= "~/.seekr/models"
# Embedding model name
= "all-MiniLM-L6-v2"
# Maximum file size to index (bytes)
= 10485760
[]
= "127.0.0.1"
= 7720
[]
= 2
= 20
= 60
[]
= 32
How It Works
- Scanner — Walks the project directory, respects `.gitignore`, filters by file type/size
- Parser — Uses Tree-sitter to parse source files into semantic code chunks (functions, classes, structs, etc.)
- Embedder — Generates vector embeddings using ONNX Runtime + all-MiniLM-L6-v2 with HuggingFace WordPiece tokenizer
- Index — Builds inverted text index + HNSW vector index, persisted to disk via bincode binary format
- Search — Text regex, semantic HNSW ANN (with brute-force KNN fallback), AST pattern matching, fused via 3-way RRF
- Watch — Optional file system monitoring with debounced incremental re-indexing
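
The brute-force KNN fallback mentioned in the Search step can be pictured as scoring every stored embedding against the query and keeping the best few. The sketch below assumes cosine similarity as the metric (it appears in the benchmark list) and plain `Vec<f32>` embeddings. The function names are hypothetical and this is not the project's actual code.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Exhaustive k-nearest-neighbour search: score every chunk embedding,
/// sort by similarity (highest first), and keep the top `k`.
/// Returns (chunk index, similarity) pairs. Hypothetical helper.
fn knn_brute_force(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

This exhaustive scan costs time linear in the number of chunks per query, which is why an approximate index such as HNSW is the primary path and a scan like this is only a fallback.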
Benchmarks
Run the benchmark suite with:
Benchmarks cover:
- Index construction (100 / 500 / 1000 chunks)
- Vector search latency (500 / 1000 / 5000 chunks)
- Text search latency (inverted index)
- Cosine similarity computation (384d)
- Index save/load throughput (bincode)
Environment Variables
| Variable | Description |
|---|---|
| `SEEKR_LOG` | Log level filter (e.g., `seekr_code=debug`) |
| `RUST_LOG` | Fallback log level if `SEEKR_LOG` is not set |
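
The precedence between the two variables can be sketched as follows. This is an illustration only: the function names are hypothetical, and the `"info"` default used when neither variable is set is an assumption, not documented behaviour.

```rust
use std::env;

/// Pick the log filter: `SEEKR_LOG` wins, `RUST_LOG` is the fallback.
/// The "info" default is a hypothetical choice for this sketch.
fn resolve_log_filter(seekr_log: Option<String>, rust_log: Option<String>) -> String {
    seekr_log
        .or(rust_log)
        .unwrap_or_else(|| "info".to_string())
}

/// Read both variables from the process environment and resolve them.
fn log_filter_from_env() -> String {
    resolve_log_filter(env::var("SEEKR_LOG").ok(), env::var("RUST_LOG").ok())
}
```

Keeping the precedence logic in a pure function separate from the environment lookup makes it easy to check without mutating process-wide state.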