Code Search
A fast, intelligent CLI tool for searching and analyzing codebases, built in Rust. Designed as a code-aware supplement to AI agents and LLMs, providing precise structural understanding that semantic search cannot deliver.
🎯 Why Code Search Matters for AI Agents
The Problem: LLMs and RAG systems treat code as plain text, losing the critical structural information that exact search and static analysis can recover.
The Solution: Code Search provides structured, precise code intelligence that agents can trust:
┌─────────────────────────────────────────────────────────────────┐
│                         AI AGENT / LLM                          │
│        "I need to understand the authentication module"         │
└──────────────────────────┬──────────────────────────────────────┘
                           │
         ┌─────────────────┴─────────────────┐
         │                                   │
    ┌────▼────┐                        ┌─────▼─────┐
    │   RAG   │                        │ CodeSearch│
    │ Semantic│                        │ Structural│
    └────┬────┘                        └─────┬─────┘
         │                                   │
  "Files about                        "auth.rs: L15-45
   authentication"                     fn authenticate()
  (fuzzy, chunked)                     fn verify_token()
                                       3 callers, 2 deps"
                                      (precise, complete)
Key Capabilities
1. Precise Pattern Matching (Not Semantic Guessing)
| What You Need | RAG/Embeddings | Code Search |
|---|---|---|
| Find `fn authenticate` | Returns similar functions | Returns exact function + line number |
| Find all TODO comments | Misses non-standard formats | Regex: `TODO\|FIXME\|HACK` catches all |
| Find unused imports | Cannot detect | Analyzes actual usage |
| Rename `oldFunc` → `newFunc` | Suggests similar names | Finds every exact occurrence |
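To make the contrast with approximate retrieval concrete, here is a minimal sketch of an exact, line-level marker scan. It is not Code Search's implementation: `find_markers` and `Match` are hypothetical names, and plain substring checks stand in for the `TODO|FIXME|HACK` alternation so the example needs only the standard library.

```rust
/// One exact hit: 1-based line number, the marker found, and the trimmed line.
#[derive(Debug, PartialEq)]
struct Match {
    line: usize,
    marker: &'static str,
    text: String,
}

/// Scan `source` line by line for any marker (the substring equivalent of
/// the alternation `TODO|FIXME|HACK`). Hypothetical helper for illustration.
fn find_markers(source: &str) -> Vec<Match> {
    const MARKERS: [&str; 3] = ["TODO", "FIXME", "HACK"];
    let mut hits = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        for marker in MARKERS {
            if line.contains(marker) {
                hits.push(Match {
                    line: idx + 1, // report 1-based line numbers
                    marker,
                    text: line.trim().to_string(),
                });
            }
        }
    }
    hits
}
```

Every hit carries an exact line number, so a caller can verify the result against the file instead of trusting a similarity score.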
2. Language-Aware Intelligence (48 Languages)
Each supported language has its own tailored patterns, so Code Search understands:
- Function definitions, class structures, imports
- Comment patterns (single-line, multi-line, doc comments)
- Language-specific syntax (traits, interfaces, decorators)
3. Code Quality Analysis
| Analysis | What It Finds |
|---|---|
| Complexity | Cyclomatic & cognitive complexity scores |
| Dead Code | Unused imports, functions, classes |
| Duplicates | Similar code blocks (DRY violations) |
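As a rough illustration of what a cyclomatic score measures, the sketch below counts decision points and adds one. This is an assumption-laden approximation, not the tool's algorithm: `cyclomatic_estimate` is a hypothetical name, the keyword list is deliberately small, and the token scan does not skip comments, string literals, or zero-argument closures (`||`).

```rust
/// Approximate cyclomatic complexity: 1 + number of decision points.
/// Assumption: a plain token scan that does not skip comments or strings.
fn cyclomatic_estimate(source: &str) -> usize {
    const KEYWORDS: [&str; 3] = ["if", "while", "for"];

    // Branching keywords: split on every character that cannot be part of
    // an identifier, then count whole-word keyword tokens.
    let keyword_hits = source
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|tok| KEYWORDS.contains(tok))
        .count();

    // Each short-circuit operator adds one more path through the code.
    let operator_hits = source.matches("&&").count() + source.matches("||").count();

    1 + keyword_hits + operator_hits
}
```

A function with no branches scores 1; one `if` with an `&&` condition plus a `for` loop scores 4. A production analyzer would work on a parsed syntax tree rather than raw tokens.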
4. MCP Server for Agent Integration
Exposes code intelligence as tools that AI agents can call:
Available Tools:
- search_code - Find patterns with fuzzy/regex support
- list_files - Enumerate codebase with filters
- analyze_codebase - Get metrics and statistics
How It Complements RAG & LLMs
| Aspect | RAG Alone | + Code Search |
|---|---|---|
| Find function | "Similar to auth..." | Exact: auth.rs:L42 |
| Count usages | "Mentioned several times" | Precise: "Called 7 times in 3 files" |
| Find all usages | Suggests changes | Validates all occurrences found |
| Dead code | Cannot detect | Lists unused with line numbers |
| Complexity | No metrics | Cyclomatic score: 15 |
The Hybrid Approach
┌────────────────────────────────────────────────────────────────────┐
│  User: "Help me understand and improve the auth module"            │
└────────────────────────────────────────────────────────────────────┘
                                │
                ┌───────────────┼───────────────┐
                ▼               ▼               ▼
          ┌──────────┐    ┌──────────┐    ┌──────────┐
          │   RAG    │    │CodeSearch│    │   LLM    │
          │ Semantic │    │Structural│    │ Reasoning│
          └────┬─────┘    └─────┬────┘    └────┬─────┘
               │                │              │
        "Auth handles     "auth.rs:        "Based on the
         user login,       fn login() L12   structure, I
         sessions..."      fn verify() L45  recommend..."
                           complexity: 18
                           dead code: 2
Quick Start
# Simple search: codesearch <query> [path]
# Fuzzy search (handles typos)
# Interactive mode
# Analysis commands
✨ Features
- Fast regex search with exact line-level precision
- Fuzzy matching for typo tolerance
- Support for 48 languages with syntax awareness
- Interactive REPL for exploratory analysis
- Code metrics - complexity, duplication, dead code
- Export results to CSV or Markdown
- MCP server for AI agent integration
- Parallel processing for large codebases
Installation
# With MCP server support for AI agents
Usage Examples
Search Patterns
# codesearch <query> [path] [options]
# Regex patterns
# Fuzzy search (handles typos)
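The README does not say which algorithm backs fuzzy mode. As one common way to get typo tolerance, the sketch below uses Levenshtein edit distance; `levenshtein` and `fuzzy_line_match` are hypothetical names, and the word-splitting rule is an assumption made for illustration.

```rust
/// Levenshtein edit distance between two strings (two-row dynamic programming).
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1)      // deletion
                .min(curr[j] + 1);         // insertion
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical helper: true if any identifier-like word in `line` is
/// within `max_edits` edits of `query`.
fn fuzzy_line_match(line: &str, query: &str, max_edits: usize) -> bool {
    line.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|word| !word.is_empty())
        .any(|word| levenshtein(word, query) <= max_edits)
}
```

With a budget of one edit, the misspelled query `authentcate` still matches a line containing `fn authenticate(...)`, while unrelated lines are rejected.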
Code Analysis
# Codebase overview
# Output: Files, lines, languages, function count, class count
# Complexity analysis
# Output: Files ranked by cyclomatic/cognitive complexity
# Dead code detection
# Output: Unused imports, functions, classes
# Duplicate detection
# Output: Similar code blocks that violate DRY
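Duplicate detection can be pictured as sliding a fixed-size window over the normalized lines of a file and remembering where each window was first seen. The sketch below does exactly that for a single string. It only finds identical blocks after trimming whitespace, whereas the tool reports similar blocks, and `find_duplicate_blocks` and its `window` parameter are hypothetical.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Find repeated blocks of `window` consecutive non-blank lines.
/// Returns (first_start_line, repeated_start_line) pairs, 1-based.
fn find_duplicate_blocks(source: &str, window: usize) -> Vec<(usize, usize)> {
    if window == 0 {
        return Vec::new(); // `windows(0)` would panic
    }
    // Normalize: trim whitespace, drop blank lines, keep original line numbers.
    let lines: Vec<(usize, &str)> = source
        .lines()
        .enumerate()
        .map(|(i, l)| (i + 1, l.trim()))
        .filter(|(_, l)| !l.is_empty())
        .collect();

    let mut seen: HashMap<Vec<&str>, usize> = HashMap::new();
    let mut duplicates = Vec::new();
    for chunk in lines.windows(window) {
        let key: Vec<&str> = chunk.iter().map(|(_, text)| *text).collect();
        let start = chunk[0].0;
        match seen.entry(key) {
            // Same block seen earlier: record where it first appeared.
            Entry::Occupied(first) => duplicates.push((*first.get(), start)),
            Entry::Vacant(slot) => {
                slot.insert(start);
            }
        }
    }
    duplicates
}
```

A real detector would likely hash token streams and tolerate renamed identifiers; exact line windows are the simplest instance of the idea.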
Interactive Mode
Commands:
- Type any pattern to search
- /f - Toggle fuzzy mode
- /i - Toggle case insensitivity
- analyze - Codebase metrics
- complexity - Complexity analysis
- deadcode - Dead code detection
- duplicates - Find duplicates
- help - All commands
MCP Server (AI Integration)
# Start MCP server
# Agents can call:
# - search_code(query, path, extensions, fuzzy, regex)
# - list_files(path, extensions, exclude)
# - analyze_codebase(path, extensions)
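MCP runs over JSON-RPC 2.0, and a client invokes a tool with the `tools/call` method, passing the tool `name` and an `arguments` object. The sketch below builds such a request for `search_code` by hand. The argument names come from the list above, but their JSON types (strings and a boolean) are assumptions, and `json_escape` and `search_code_request` are hypothetical helpers rather than part of the project.

```rust
/// Escape a string for use inside a JSON string literal
/// (quotes, backslashes, and control characters).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request for the `search_code` tool.
/// Assumption: `query` and `path` are strings and `fuzzy` is a boolean.
fn search_code_request(id: u64, query: &str, path: &str, fuzzy: bool) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"search_code","arguments":{{"query":"{}","path":"{}","fuzzy":{}}}}}}}"#,
        id,
        json_escape(query),
        json_escape(path),
        fuzzy
    )
}
```

In practice an agent framework or a JSON library would produce this message; writing it out by hand only shows the shape of the call.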
Output Examples
Search Results
Search Results for "fn main"
──────────────────────────────
./src/main.rs (1 match)
358: fn main() -> Result<(), Box<dyn std::error::Error>> {
Statistics:
Files searched: 12
Matches found: 1
Time: 0.003s
Dead Code Detection
Dead Code Detection
──────────────────────────────
⚠️ Found 5 potential dead code items:
examples/deadcode_demo.rs
📥 L 4: import 'HashMap' - Imported but never used
📥 L 6: import 'Write' - Imported but never used
Summary:
• import: 5
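A report like the one above can come from a simple two-pass check: collect the imported names, then look for each name anywhere else in the file. The sketch below is a simplified illustration for Rust sources only, not the project's detector. `unused_imports` and `contains_word` are hypothetical, and it handles only simple `use path::Name;` lines, skipping globs, grouped imports, and `as` renames.

```rust
/// True if `word` occurs in `text` as a whole identifier token.
fn contains_word(text: &str, word: &str) -> bool {
    text.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .any(|tok| tok == word)
}

/// Report simple `use path::Name;` imports whose name never appears on any
/// other non-import line. Returns (1-based line number, imported name).
fn unused_imports(source: &str) -> Vec<(usize, String)> {
    let lines: Vec<&str> = source.lines().collect();
    let mut unused = Vec::new();
    for (idx, line) in lines.iter().enumerate() {
        let path = match line
            .trim()
            .strip_prefix("use ")
            .and_then(|rest| rest.strip_suffix(';'))
        {
            Some(p) => p,
            None => continue, // not a single-line import
        };
        // The imported name is the last path segment.
        let name = path.rsplit("::").next().unwrap_or(path).trim();
        // Skip globs, grouped imports, and renames (anything not a bare identifier).
        if name.is_empty() || !name.chars().all(|c| c.is_alphanumeric() || c == '_') {
            continue;
        }
        let used = lines.iter().enumerate().any(|(other, text)| {
            other != idx && !text.trim_start().starts_with("use ") && contains_word(text, name)
        });
        if !used {
            unused.push((idx + 1, name.to_string()));
        }
    }
    unused
}
```

Because the check is textual, a name that appears only in a comment counts as used; that is one reason such findings are best labeled "potential" dead code.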
Complexity Analysis
Code Complexity Analysis
──────────────────────────────
Files by Complexity (highest first):
src/search.rs
Cyclomatic: 45 Cognitive: 38 Lines: 645
src/analysis.rs
Cyclomatic: 28 Cognitive: 22 Lines: 378
Architecture
┌─────────────────────────────────────────────────────────────┐
│  CLI Layer                                                  │
│  main.rs (358 LOC) - Argument parsing, command routing      │
│  interactive.rs (350 LOC) - REPL interface                  │
└─────────────────────────────────────────────────────────────┘
                               │
┌─────────────────────────────────────────────────────────────┐
│  Core Engine                                                │
│  search.rs (645 LOC) - Pattern matching, fuzzy, ranking     │
│  language.rs (509 LOC) - 48 language definitions            │
└─────────────────────────────────────────────────────────────┘
                               │
┌─────────────────────────────────────────────────────────────┐
│  Analysis Layer                                             │
│  analysis.rs (378 LOC) - Codebase metrics                   │
│  complexity.rs (308 LOC) - Cyclomatic/cognitive complexity  │
│  deadcode.rs (373 LOC) - Unused code detection              │
│  duplicates.rs (196 LOC) - Similar code detection           │
└─────────────────────────────────────────────────────────────┘
                               │
┌─────────────────────────────────────────────────────────────┐
│  Integration Layer                                          │
│  mcp_server.rs (295 LOC) - MCP protocol for AI agents       │
│  export.rs (185 LOC) - CSV/Markdown output                  │
│  cache.rs (127 LOC) - Result caching                        │
└─────────────────────────────────────────────────────────────┘
11 modules, ~3,700 lines of Rust code
🧪 Testing
# Run all tests (84 total)
# Unit tests: 35 (core functionality)
# Integration tests: 26 (CLI commands)
# MCP tests: 23 (server tools)
⚡ Performance
- 10x faster than grep for complex patterns
- Parallel processing with rayon
- Memory efficient streaming for large files
- Compiled regex patterns cached
- Smart defaults exclude build directories
Documentation
- README.md - This guide
- ARCHITECTURE.md - System design
- TODO.md - Roadmap
- examples/ - Code samples with dead code demos
🤝 Contributing
- Fork the repository
- Create a feature branch
- Test: cargo test --features mcp
- Submit a pull request
License
Apache-2.0 License
Built with ❤️ in Rust | Precise | Fast | Agent-Ready
"RAG tells you about code. Code Search shows you the code."