Natural-language search that works like grep. Fast, local, GPU-accelerated, and built for coding agents.
- Semantic: Finds concepts ("where do transactions get created?"), not just strings.
- GPU-Accelerated: CUDA support via candle for fast embeddings on NVIDIA GPUs.
- Local & Private: 100% local embeddings. No API keys required.
- Auto-Isolated: Each repository gets its own index automatically.
- On-Demand Grammars: Tree-sitter WASM grammars download automatically as needed.
- Agent-Ready: Native MCP server and Claude Code integration.
Quick Start
1. Install

   Or build from source; CPU-only builds (no CUDA) are also supported.

2. Setup (Recommended)

   Downloads the embedding models (~500 MB) and tree-sitter grammars up front. If you skip this step, the models download automatically on first use.

3. Search

   Your first search automatically indexes the repository. Each repository gets its own isolated index, so switching between repos just works.
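As a sketch, a first session might look like this (the crates.io package name is an assumption; the query string is illustrative):

```shell
# Install (assumes the crate is published to crates.io as `smgrep`)
cargo install smgrep

# First search: the current repository is indexed automatically
smgrep "where do transactions get created?"
```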
Coding Agent Integration
Claude Code
1. Run `smgrep claude-install`.
2. Open Claude Code (`claude`) and ask questions about your codebase.
3. The plugin auto-starts the `smgrep serve` daemon and provides semantic search.
MCP Server
smgrep includes a built-in MCP (Model Context Protocol) server. It exposes a `sem_search` tool that agents can use for semantic code search, and it auto-starts the background daemon if needed.
Commands
smgrep [query]
The default command. Searches the current directory using semantic meaning.
Options:
| Flag | Description | Default |
|---|---|---|
| `-m <n>` | Max total results to return | 10 |
| `--per-file <n>` | Max matches per file | 1 |
| `-c, --content` | Show full chunk content | false |
| `--compact` | Show file paths only | false |
| `--scores` | Show relevance scores | false |
| `-s, --sync` | Force re-index before search | false |
| `--dry-run` | Show what would be indexed | false |
| `--json` | JSON output format | false |
| `--no-rerank` | Skip ColBERT reranking | false |
| `--plain` | Disable ANSI colors | false |
Examples: a broad concept search, a deeper dive with more matches per file, file paths only, or JSON output for scripting.
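Putting the flags above together (query strings are illustrative):

```shell
# General concept search
smgrep "where do we validate user input?"

# Deep dive (more matches per file, with full chunk content)
smgrep --per-file 5 --content "error handling"

# Just the file paths
smgrep --compact "database connection setup"

# JSON for scripting
smgrep --json "retry logic"
```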
smgrep index
Manually indexes the repository.
smgrep serve
Runs a background daemon with file watching for instant searches.
- Keeps LanceDB and embedding models resident for fast responses
- Watches the repo and incrementally re-indexes on change
- Communicates via Unix socket
smgrep stop / smgrep stop-all
Stop running daemons.
smgrep status
Show status of running daemons.
smgrep list
Lists all indexed repositories and their metadata.
smgrep doctor
Checks installation health, model availability, and grammar status.
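A typical daemon workflow using the commands above:

```shell
smgrep serve      # start the background daemon with file watching
smgrep status     # show status of running daemons
smgrep list       # list indexed repositories and their metadata
smgrep stop       # stop the current daemon
smgrep stop-all   # or stop every running daemon
```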
GPU Acceleration
smgrep uses candle for ML inference with optional CUDA support.
With CUDA (default):
Requires the CUDA toolkit to be installed, with the environment pointed at your CUDA installation path.
Embedding speed is significantly faster on NVIDIA GPUs.
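A typical environment setup might look like this (paths are illustrative; adjust them to your CUDA installation):

```shell
export CUDA_HOME=/usr/local/cuda   # or your CUDA installation path
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
```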
CPU-only:

Build without CUDA, or force CPU inference at runtime.

Environment variables:

- `SMGREP_DISABLE_GPU=1` - Force CPU even when CUDA is available
- `SMGREP_BATCH_SIZE=N` - Override batch size (auto-adapts on OOM)
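For example, to force CPU inference or cap memory use for a single invocation (query strings are illustrative):

```shell
# Force CPU even when CUDA is available
SMGREP_DISABLE_GPU=1 smgrep "where is retry logic implemented?"

# Override the embedding batch size (auto-adapts on OOM)
SMGREP_BATCH_SIZE=16 smgrep index
```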
Architecture
smgrep combines several techniques for high-quality semantic search:
1. Smart Chunking: Tree-sitter parses code by function/class boundaries, ensuring embeddings capture complete logical blocks. Grammars download on-demand as WASM modules.

2. Hybrid Search: Dense embeddings (sentence-transformers) for broad recall, ColBERT reranking for precision.

3. Quantized Storage: ColBERT embeddings are quantized to int8 for efficient storage in LanceDB.

4. Automatic Repository Isolation: Stores are named by git remote URL or directory hash.

5. Incremental Indexing: File watcher detects changes and updates only affected chunks.
Supported languages: TypeScript, JavaScript, Python, Go, Rust, C, C++, Java, Ruby, PHP, Swift, HTML, CSS, Bash, JSON, YAML
Configuration
smgrep uses a TOML config file at ~/.smgrep/config.toml. All options can also be set via environment variables with the SMGREP_ prefix.
Config File
# ~/.smgrep/config.toml
# ============================================================================
# Models
# ============================================================================
# Dense embedding model (HuggingFace model ID)
# Used for initial semantic similarity search
= "ibm-granite/granite-embedding-30m-english"
# ColBERT reranking model (HuggingFace model ID)
# Used for precise reranking of search results
= "answerdotai/answerai-colbert-small-v1"
# Model dimensions (must match the models above)
= 384
= 96
# Query prefix (some models require a prefix like "query: ")
= ""
# Maximum sequence lengths for tokenization
= 256
= 256
# ============================================================================
# Performance
# ============================================================================
# Batch size for embedding computation
# Higher = faster but more memory. Auto-reduces on OOM.
default_batch_size = 48
= 96
# Maximum threads for parallel processing
= 32
# Force CPU inference even when CUDA is available
disable_gpu = false
# Low-impact mode: reduces resource usage for background indexing
low_impact = false
# Fast mode: skip ColBERT reranking for quicker (but less precise) results
fast_mode = false
# ============================================================================
# Server
# ============================================================================
# Idle timeout: shutdown daemon after this many seconds of inactivity
= 1800 # 30 minutes
# How often to check for idle timeout
= 60
# Timeout for embedding worker operations (milliseconds)
= 60000
# ============================================================================
# Debug
# ============================================================================
# Enable model loading debug output
= false
# Enable embedding debug output
= false
# Enable profiling
= false
# Skip saving metadata (for testing)
= false
Environment Variables
Any config option can be set via environment variable with the SMGREP_ prefix:
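For instance (the boolean value syntax `1` is an assumption; the query string is illustrative):

```shell
# Skip reranking for faster, less precise results
export SMGREP_FAST_MODE=1

# Keep background indexing light
export SMGREP_LOW_IMPACT=1

smgrep "how is caching invalidated?"
```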
| Variable | Description | Default |
|---|---|---|
| `SMGREP_STORE` | Override store name | auto-detected |
| `SMGREP_DISABLE_GPU` | Force CPU inference | false |
| `SMGREP_DEFAULT_BATCH_SIZE` | Embedding batch size | 48 |
| `SMGREP_LOW_IMPACT` | Reduce resource usage | false |
| `SMGREP_FAST_MODE` | Skip reranking | false |
Ignoring Files
smgrep respects .gitignore and .smgrepignore files.
Create .smgrepignore in your repository root:
# Ignore generated files
dist/
*.min.js
# Ignore test fixtures
test/fixtures/
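To check the effect of your ignore rules, the `--dry-run` flag from the search options can be used to preview what would be indexed (the query string here is a placeholder):

```shell
# Show which files would be indexed, without searching
smgrep --dry-run "anything"
```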
Manual Store Management
- View all stores: `smgrep list`
- Override auto-detection: `smgrep --store custom-name "query"`
- Data location: `~/.smgrep/`
Troubleshooting
- Index feels stale? Run `smgrep index` to refresh.
- Weird results? Run `smgrep doctor` to verify models and grammars.
- Need a fresh start? Run `smgrep index --reset` or delete `~/.smgrep/`.
- GPU OOM? Batch size auto-reduces, or set `SMGREP_DISABLE_GPU=1`.
Building from Source
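Assuming a standard cargo workflow from a source checkout:

```shell
# Build an optimized binary
cargo build --release

# Run tests
cargo test
```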
Acknowledgments
smgrep is inspired by osgrep and mgrep by MixedBread.
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.