cqs ("seeks")
Code intelligence MCP server that saves AI agents tokens. Semantic search, call graph tracing, impact analysis — in single tool calls instead of dozens of file reads. Local ML, GPU-accelerated.
TL;DR: Semantic code search for Claude Code. Instead of grep, it understands what code does — so Claude finds relevant functions without knowing exact names. Tools like trace, impact, gather, and batch collapse 5-10 sequential tool calls into one, cutting context window usage by 50-80% on common code navigation tasks.
<claude>You're an AI assistant reading this. cqs gives you semantic code search + code intelligence via MCP.
Why you want this: Find functions by what they do, not their names. Tools like
`cqs_trace`, `cqs_impact`, and `cqs_gather` collapse 5-10 file reads into one call, saving context window for actual work.
Setup:
claude mcp add cqs -- cqs serve --project /path/to/project
</claude>
Install
Requires Rust 1.88+ (due to the `ort` ONNX Runtime dependency)
Upgrading? Schema changes require rebuilding the index by re-running `cqs index`.
Quick Start
# Download model and initialize
# Index your project
# Search
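The steps above can be sketched as follows. The `init` and `search` subcommand names are assumptions for illustration (verify with `cqs --help`); `cqs index` and `cqs serve` are referenced elsewhere in this README:

```shell
# Download model and initialize (subcommand name assumed)
cqs init

# Index your project
cqs index

# Search (subcommand name assumed)
cqs search "parse config file"
```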
Filters
# By language
# By path pattern
# By chunk type
# By structural pattern
# Patterns: builder, error_swallow, async, mutex, unsafe, recursion
# Combined
# Hybrid search tuning
# Show surrounding context
# Output options
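A hedged sketch of the filter flags described above. All flag names here are assumptions for illustration, as is the `search` subcommand; check `cqs search --help` for the exact spellings:

```shell
# By language
cqs search "retry logic" --lang rust

# By path pattern
cqs search "retry logic" --path "src/net/**"

# By chunk type
cqs search "retry logic" --type function

# By structural pattern (builder, error_swallow, async, mutex, unsafe, recursion)
cqs search "retry logic" --pattern async

# Hybrid search tuning, surrounding context, and output options
cqs search "retry logic" --name-boost 0.5 --context 3 --limit 5
```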
Configuration
Set default options via config files. CLI flags override config file values.
Config locations (later overrides earlier):
- `~/.config/cqs/config.toml` - user defaults
- `.cqs.toml` in project root - project overrides
Example .cqs.toml:
# Key names below are reconstructed for illustration; check the cqs docs for the exact names.

# Default result limit
limit = 10

# Minimum similarity threshold (0.0 - 1.0)
threshold = 0.4

# Name boost for hybrid search (0.0 = pure semantic, 1.0 = pure name)
name_boost = 0.2

# Note weight in search results (0.0-1.0, lower = notes rank below code)
note_weight = 1.0

# Output modes (key names illustrative)
compact = false
quiet = false
Watch Mode
Keep your index up to date automatically:
Watch mode respects .gitignore by default. Use --no-ignore to index ignored files.
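A minimal sketch of watch-mode usage; `cqs watch` and `--no-ignore` are both named in this README, while the exact invocation details may vary:

```shell
# Keep the index updated as files change (respects .gitignore by default)
cqs watch

# Also index files that .gitignore would exclude
cqs watch --no-ignore
```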
Call Graph
Find function call relationships:
Use cases:
- Impact analysis: What calls this function I'm about to change?
- Context expansion: Show related functions
- Entry point discovery: Find functions with no callers
Call graph is indexed across all files - callers are found regardless of which file they're in.
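As a hedged sketch of the call-graph queries described above (the subcommand names `callers` and `callees` are assumptions; check `cqs --help`):

```shell
# What calls this function I'm about to change?
cqs callers my_function

# What does this function call?
cqs callees my_function
```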
Discovery Tools
# Find functions similar to a given function (search by example)
# Function card: signature, callers, callees, similar functions
# Semantic diff between indexed snapshots
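A hedged sketch of the discovery tools listed above. All subcommand names here are assumptions for illustration:

```shell
# Find functions similar to a given function (search by example)
cqs similar my_function

# Function card: signature, callers, callees, similar functions
cqs card my_function

# Semantic diff between indexed snapshots
cqs diff
```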
Code Intelligence
# Follow a call chain between two functions (BFS shortest path)
# Impact analysis: what breaks if I change this function?
# Map functions to their tests
# Module overview: chunks, callers, callees, notes for a file
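The tool names `trace`, `impact`, and `context` appear elsewhere in this README; the argument shapes below are assumptions for illustration:

```shell
# Follow a call chain between two functions (BFS shortest path)
cqs trace main handle_request

# Impact analysis: what breaks if I change this function?
cqs impact parse_config

# Map functions to their tests (subcommand name assumed)
cqs tests parse_config

# Module overview: chunks, callers, callees, notes for a file
cqs context src/parser.rs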
Maintenance
# Find dead code (functions never called by indexed code)
# Garbage collection (remove stale index entries)
# Cross-project search
# Smart context assembly (gather related code)
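A hedged sketch of the maintenance commands. `gather` is named elsewhere in this README; the other subcommand names are assumptions:

```shell
# Find dead code (functions never called by indexed code)
cqs dead

# Garbage collection (remove stale index entries)
cqs gc

# Smart context assembly (gather related code)
cqs gather "connection pooling"
```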
Reference Indexes (Multi-Index Search)
Search across your project and external codebases simultaneously:
Once added, all searches automatically include reference results:
Reference results are ranked with a weight multiplier (default 0.8) so project results naturally appear first at equal similarity.
MCP integration: The cqs_search tool gains a sources parameter to filter which indexes to search:
- Omit `sources` to search all indexes
- `sources: ["project"]` — search only the primary project
- `sources: ["tokio"]` — search only the tokio reference
References are configured in .cqs.toml:
# Table and key names reconstructed for illustration; check the cqs docs for the exact names.
[[references]]
name = "tokio"
index = "/home/user/.local/share/cqs/refs/tokio"
source = "/home/user/code/tokio"
weight = 0.8
Claude Code Integration
Why use cqs?
Without cqs, Claude uses grep/glob to find code and reads entire files for context. With cqs:
- Fewer tool calls: `trace`, `impact`, `gather`, `context` each replace 5-10 sequential file reads with a single call
- Less context burn: Focused `cqs_read` returns a function + its type dependencies — not the whole file. `batch` runs 10 queries in one round-trip
- Find code by behavior: "function that retries with backoff" finds retry logic even if it's named `doWithAttempts`
- Navigate unfamiliar codebases: Semantic search finds relevant code without knowing project structure
Setup
Step 1: Add cqs as an MCP server:
Or manually in ~/.claude.json:
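A minimal sketch of the manual entry, following Claude Code's standard `mcpServers` format (adjust the project path):

```json
{
  "mcpServers": {
    "cqs": {
      "command": "cqs",
      "args": ["serve", "--project", "/path/to/project"]
    }
  }
}
```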
Note: The --project argument is required because MCP servers run from an unpredictable working directory.
GPU acceleration: Add --gpu for faster query embedding after warmup:
GPU: ~12ms warm queries. CPU (default): ~22ms. Server starts instantly with HNSW, upgrades to GPU in background.
Step 2: Add to your project's CLAUDE.md so Claude uses it automatically:
Use `cqs_search` for semantic code search instead of grep/glob when looking for code by behavior or concept rather than by exact identifier name.
Available tools include `cqs_search`, `cqs_trace`, `cqs_impact`, `cqs_gather`, and `cqs_read`.
Keep index fresh: run `cqs watch` in a background terminal, or `cqs index` after significant changes.
HTTP Transport
For web integrations, use the HTTP transport:
Endpoints:
- `POST /mcp` - JSON-RPC requests (MCP protocol messages)
- `GET /mcp` - SSE stream for server-to-client notifications
- `GET /health` - Health check (returns 200 OK when server is ready)
Authentication: For network-exposed servers, API key authentication is required:
# Via flag
# Via environment variable
# Via file (recommended - keeps secret out of process list)
Clients must include an `Authorization: Bearer SECRET` header.
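For example, with curl (the port is a placeholder; `tools/list` is a standard MCP method):

```shell
# Health check
curl http://127.0.0.1:8080/health

# JSON-RPC request with bearer auth
curl -X POST http://127.0.0.1:8080/mcp \
  -H "Authorization: Bearer SECRET" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
```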
Network binding: By default, cqs binds to localhost only. To expose on a network:
# Requires both flags for safety
Implements MCP Streamable HTTP spec 2025-11-25 with Origin validation and protocol version headers.
Supported Languages
- Rust
- Python
- TypeScript
- JavaScript (JSDoc `@param`/`@returns` tags improve search quality)
- Go
- C
- Java
Indexing
By default, `cqs index` respects `.gitignore` rules.
How It Works
- Parses code with tree-sitter to extract:
- Functions and methods
- Classes and structs
- Enums, traits, interfaces
- Constants
- Generates embeddings with E5-base-v2 (runs locally)
- Includes doc comments for better semantic matching
- Stores in SQLite with vector search + FTS5 keyword index
- Hybrid search (RRF): Combines semantic similarity with keyword matching
- Semantic search finds conceptually related code
- Keyword search catches exact identifier matches (e.g., `parseConfig`)
- Reciprocal Rank Fusion merges both rankings for best results (each hit's fused score sums 1/(k + rank) over the lists it appears in)
- Uses GPU if available, falls back to CPU
HNSW Index Tuning
The HNSW (Hierarchical Navigable Small World) index provides fast approximate nearest neighbor search. Current parameters:
| Parameter | Value | Description |
|---|---|---|
| M (connections) | 24 | Max edges per node. Higher = better recall, more memory |
| ef_construction | 200 | Search width during build. Higher = better index, slower build |
| max_layers | 16 | Graph layers. ~log(N) is typical |
| ef_search | 100 | Search width at query time. Higher = better recall, slower search |
Trade-offs:
- Recall vs speed: Higher ef_search improves recall but slows queries
- Index size: ~4KB per vector with current settings
- Build time: O(N * M * ef_construction) complexity
For most codebases (<100k chunks), defaults work well. Large repos may benefit from tuning ef_search higher (200+) if recall matters more than latency.
Search Quality
Hybrid search (RRF) combines semantic understanding with keyword matching:
| Query | Top Match | Score |
|---|---|---|
| "cosine similarity" | `cosine_similarity` | 0.85 |
| "validate email regex" | `validateEmail` | 0.73 |
| "check if adult age 18" | `isAdult` | 0.71 |
| "pop from stack" | `Stack.Pop` | 0.70 |
| "generate random id" | `generateId` | 0.70 |
GPU Acceleration (Optional)
cqs works on CPU (~20ms per embedding). GPU provides 3x+ speedup:
| Mode | Single Query | Batch (50 docs) |
|---|---|---|
| CPU | ~20ms | ~15ms/doc |
| CUDA | ~6ms | ~0.3ms/doc |
For GPU acceleration:
Linux
# Add NVIDIA CUDA repo
# Install CUDA runtime and cuDNN 9
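The steps above can be sketched as follows for an Ubuntu-style system. Package and repo names vary by distro and CUDA version, so treat these as illustrative and follow NVIDIA's install guide for your platform:

```shell
# Add NVIDIA CUDA repo (Ubuntu 24.04 shown)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update

# Install CUDA runtime and cuDNN 9 (package names illustrative)
sudo apt-get install -y cuda-runtime-12-6 libcudnn9-cuda-12
```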
Set library path:
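Assuming the default CUDA install location (adjust to wherever your distro puts the libraries):

```shell
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```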
WSL2
Same as Linux, plus:
- Requires NVIDIA GPU driver on Windows host
- Add `/usr/lib/wsl/lib` to `LD_LIBRARY_PATH`
- Tested working with RTX A6000, CUDA 13.0 driver, cuDNN 9.18
Verify
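One hedged way to verify: confirm the driver sees the GPU, then start the server with `--gpu` (the flag documented in the MCP setup above) and check that warm queries are faster:

```shell
# Confirm the GPU is visible to the driver
nvidia-smi

# Start with GPU acceleration and watch query latency after warmup
cqs serve --project /path/to/project --gpu
```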
Contributing
Issues and PRs welcome at GitHub.
License
MIT