cqs ("seeks")
Semantic code search with local ML embeddings. Find functions by concept, not name. GPU-accelerated. MCP server included.
TL;DR: Semantic code search for Claude Code. Unlike grep, it matches on what code does, so Claude finds relevant functions without knowing their exact names. Claude reaches the right code faster and doesn't miss code buried in unfamiliar codebases.
<claude>You're an AI assistant reading this. cqs gives you semantic code search via MCP.
Why you want this: Find functions by what they do, not their names. When users ask about "retry logic" or "error handling", you find relevant code even in unfamiliar codebases - no need to know exact identifiers.
Setup:
claude mcp add cqs -- cqs serve --project /path/to/project
</claude>
Install
Upgrading? Schema changes require rebuilding the index:
Quick Start
# Download model and initialize
# Index your project
# Search
Filters
# By language
# By path pattern
# Combined
# Hybrid search tuning
# Show surrounding context
# Output options
Watch Mode
Keep your index up to date automatically:
Watch mode respects .gitignore by default. Use --no-ignore to index ignored files.
Call Graph
Find function call relationships:
Use cases:
- Impact analysis: What calls this function I'm about to change?
- Context expansion: Show related functions
- Entry point discovery: Find functions with no callers
Call graph is indexed across all files - callers are found regardless of which file they're in.
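To illustrate the use cases above, here is a minimal Python sketch of how caller lookup and entry point discovery fall out of a call graph. The `calls` dictionary and the function names in it are hypothetical; this is not cqs's internal data model or API, only the general idea.

```python
from collections import defaultdict

# Hypothetical call graph: each function maps to the functions it calls.
calls = {
    "main": ["load_config", "run"],
    "run": ["retry", "load_config"],
    "retry": [],
    "load_config": [],
}

def build_callers(calls):
    """Invert a caller -> callees mapping into callee -> set of callers."""
    callers = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)
    return callers

def entry_points(calls):
    """Return functions that no function in the graph calls."""
    callers = build_callers(calls)
    return sorted(name for name in calls if not callers.get(name))

callers = build_callers(calls)
print(sorted(callers["load_config"]))  # impact analysis: ['main', 'run']
print(entry_points(calls))             # entry point discovery: ['main']
```

Because the mapping is inverted once over the whole graph, a caller is found no matter which file it is defined in.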
Claude Code Integration
Why use cqs?
Without cqs, Claude Code uses grep/glob to find code - which only works if you know the exact names. With cqs, Claude can:
- Find code by behavior: "function that retries with backoff" finds retry logic even if it's named doWithAttempts
- Navigate unfamiliar codebases: Claude finds relevant code without knowing the project structure
- Catch related code: Semantic search surfaces similar patterns across the codebase that text search misses
Setup
Step 1: Add cqs as an MCP server:
Or manually in ~/.claude.json:
Note: The --project argument is required because MCP servers run from an unpredictable working directory.
GPU acceleration: Add --gpu for faster query embedding after warmup:
GPU: ~12ms warm queries. CPU (default): ~22ms. Server starts instantly with HNSW, upgrades to GPU in background.
Step 2: Add to your project's CLAUDE.md so Claude uses it automatically:
Use `cqs_search` for semantic code search instead of grep/glob when looking for:
- --
Available tools:
- ---
Keep index fresh: run `cqs watch` in a background terminal, or `cqs index` after significant changes.
HTTP Transport
For web integrations, use the HTTP transport:
Endpoints:
- POST /mcp - JSON-RPC requests
- GET /mcp - SSE stream for server-to-client messages
- GET /health - Health check
Authentication: For network-exposed servers, API key authentication is required:
# Via flag
# Via environment variable
Clients must include an Authorization: Bearer SECRET header.
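As a rough client-side illustration, the Python sketch below builds a POST /mcp request that carries the Bearer token. The address `http://127.0.0.1:3000`, the helper name `build_mcp_request`, and the example payload are assumptions made for illustration; use your server's actual address and key. A real MCP session also begins with an initialize exchange, which is omitted here.

```python
import json
import urllib.request

def build_mcp_request(base_url, api_key, payload):
    """Build a POST /mcp request carrying the Bearer token (hypothetical helper)."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )

# Hypothetical address and key; substitute your server's values.
request = build_mcp_request(
    "http://127.0.0.1:3000",
    "SECRET",
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"},
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
# To actually send it: urllib.request.urlopen(request)
```

The request is only constructed, not sent, so the sketch runs without a server.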
Network binding: By default, cqs binds to localhost only. To expose on a network:
# Requires both flags for safety
Implements MCP Streamable HTTP spec 2025-11-25 with Origin validation and protocol version headers.
Supported Languages
- Rust
- Python
- TypeScript
- JavaScript (JSDoc @param/@returns tags improve search quality)
- Go
Indexing
By default, cqs index respects .gitignore rules:
How It Works
- Parses code with tree-sitter to extract:
- Functions and methods
- Classes and structs
- Enums, traits, interfaces
- Constants
- Generates embeddings with E5-base-v2 (runs locally)
- Includes doc comments for better semantic matching
- Stores in SQLite with vector search + FTS5 keyword index
- Hybrid search (RRF): Combines semantic similarity with keyword matching
- Semantic search finds conceptually related code
- Keyword search catches exact identifier matches (e.g., parseConfig)
- Reciprocal Rank Fusion merges both rankings for best results
- Uses GPU if available, falls back to CPU
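The Reciprocal Rank Fusion step can be sketched in a few lines of Python. This is the textbook formulation, in which each item scores the sum of 1 / (k + rank) over the rankings it appears in. The constant `k=60` is the value commonly used in the literature and is an assumption here, as are the function name and the example identifiers; cqs's actual constant and implementation may differ.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each item scores sum(1 / (k + rank)) across lists."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical results from the two searches, best match first.
semantic = ["retry_with_backoff", "do_with_attempts", "parse_config"]
keyword = ["parse_config", "retry_with_backoff"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)  # ['retry_with_backoff', 'parse_config', 'do_with_attempts']
```

Items that rank well in both lists rise to the top, while an item found by only one search still appears in the merged result. RRF uses ranks rather than raw scores, so the two searches do not need comparable score scales.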
Search Quality
Hybrid search (RRF) combines semantic understanding with keyword matching:
| Query | Top Match | Score |
|---|---|---|
| "cosine similarity" | cosine_similarity | 0.85 |
| "validate email regex" | validateEmail | 0.73 |
| "check if adult age 18" | isAdult | 0.71 |
| "pop from stack" | Stack.Pop | 0.70 |
| "generate random id" | generateId | 0.70 |
GPU Acceleration (Optional)
cqs works on CPU (~20ms per embedding). A GPU is about 3x faster for single queries and about 50x faster per document in batches:
| Mode | Single Query | Batch (50 docs) |
|---|---|---|
| CPU | ~20ms | ~15ms/doc |
| CUDA | ~6ms | ~0.3ms/doc |
For GPU acceleration:
Linux
# Add NVIDIA CUDA repo
# Install CUDA runtime and cuDNN 9
Set library path:
WSL2
Same as Linux, plus:
- Requires NVIDIA GPU driver on Windows host
- Add /usr/lib/wsl/lib to LD_LIBRARY_PATH
- Tested working with RTX A6000, CUDA 13.0 driver, cuDNN 9.18
Verify
Contributing
Issues and PRs welcome at GitHub.
License
MIT