CodeSearch
Fast, intelligent code search and analysis for 48+ programming languages.
Find what you need in seconds: functions, classes, duplicates, dead code, complexity issues.
Why CodeSearch?
Stop Wasting Time Searching Code
Problem: You're working in a large codebase and need to:
- Find where authentication logic is implemented
- Identify all usages of a deprecated function before refactoring
- Track down technical debt (TODOs, FIXMEs) scattered across files
- Understand complex function relationships and dependencies
- Find duplicated code that violates DRY principles
- Spot overly complex functions that need refactoring
Traditional tools fall short:
- `grep` is slow and doesn't understand code structure
- IDE search is limited to single projects/languages
- Manual code review is time-consuming and error-prone
CodeSearch solves these problems:
# Find authentication logic instantly
# Result: All authentication code, ranked by relevance
# Track technical debt before sprint planning
# Result: 15 unused functions, 23 TODOs, 8 unreachable blocks
# Find duplicates before they become maintenance nightmares
# Result: 12 code clones (80%+ similarity)
What Makes CodeSearch Different?
Unlike Joern (a CPG graph database queried in Scala, aimed at security research) or CodeQL (the QL logic language, GitHub extractors, path queries), CodeSearch needs no indexing or configuration, so the first result arrives sooner:
- Unified graph (`codesearch graph unified`): AST+CFG+DFG in one, like Joern's CPG but with no database
- Data-flow trace (`codesearch flow <var>`): path-style tracing without extractors
- Security scan (`codesearch security`): eval/exec/SQL patterns instantly
- First result: sub-second vs. minutes of import/build
See docs/CAPABILITY_REDESIGN.md for full comparison.
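
The three subcommands named in the list above can be chained in a small wrapper. This is a minimal sketch under stated assumptions: `run_graph_checks` and the `CODESEARCH_BIN` variable are illustrative names, not part of CodeSearch, and each subcommand is invoked with exactly the arguments shown in the list because no further flags are documented here.

```shell
#!/usr/bin/env bash

# Run the unified graph, a data-flow trace for one variable, and the
# security scan in sequence, stopping at the first failure.
# CODESEARCH_BIN is an assumed override hook (hypothetical, not a CodeSearch
# feature) so the wrapper can point at ./target/release/codesearch.
run_graph_checks() {
  local bin="${CODESEARCH_BIN:-codesearch}"
  local var="${1:-}"

  if [ -z "$var" ]; then
    echo "usage: run_graph_checks <var>" >&2
    return 2
  fi
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "error: '$bin' not found; build it or add it to PATH" >&2
    return 127
  fi

  "$bin" graph unified || return 1   # AST + CFG + DFG
  "$bin" flow "$var" || return 1     # trace the named variable
  "$bin" security                    # eval/exec/SQL pattern scan
}
```

Setting `CODESEARCH_BIN=echo` gives a dry run that prints the arguments each step would receive instead of executing them.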
| Feature | Benefit | Example |
|---|---|---|
| Language-Aware | Understands functions, classes, and imports in 48+ languages | Find `fn main` in Rust, `def main` in Python |
| Lightning Fast | Parallel processing in Rust; typical searches take 3-50ms | Search 1000 files in < 50ms |
| Intelligent | Fuzzy matching handles typos; semantic search understands context | `codesearch "authetication"` finds "authentication" |
| Code Quality | Detects dead code, duplicates, and complexity issues automatically | `codesearch complexity` flags functions needing refactoring |
| Graph Analysis | 6 types of graphs for deep code understanding | Call graphs show function relationships |
| Developer-Friendly | Interactive mode, multiple export formats, MCP for AI agents | `codesearch interactive` for REPL-style search |
Real-World Impact
- Save Hours per Week: Replace manual code hunting with instant searches
- Ship Better Code: Catch dead code and complexity issues before review
- Understand Faster: Visualize code relationships with graph analysis
- Reduce Technical Debt: Track and eliminate code quality issues systematically
Quick Start
Installation
# Clone and build
# The binary will be at: ./target/release/codesearch
# Optional: Add to PATH
Basic Usage
# Simple search - find anything in your codebase
# Search with file type filter
# Fuzzy search (handles typos!)
# Interactive mode
Usage Examples
1. Everyday Search Tasks
Find Function Definitions
# Find all functions named "process"
# Find class definitions
# Find async functions
Track Technical Debt
# Find all TODOs and FIXMEs
# Export to CSV for tracking
Refactor Safely
# Find all usages before refactoring
# Case-sensitive search for exact matches
2. Code Quality Analysis
Health Score
# Get overall code health score
# Output:
# 🏥 Code Health Report
# Overall Health Score: 85/100 ✅
#
# Components:
# • Dead Code: 90/100 (3 issues)
# • Duplicates: 95/100 (2 duplicates)
# • Complexity: 70/100 (5 high-complexity functions)
# CI/CD integration with fail threshold
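
A CI gate on the health score can also be sketched by parsing the report text. This is a minimal sketch, not CodeSearch's own threshold mechanism: it assumes the report arrives on stdin and contains the `Overall Health Score: N/100` line from the sample output above. `health_score` and `check_health` are hypothetical helper names.

```shell
#!/usr/bin/env bash

# Print N from the first line shaped like "Overall Health Score: N/100".
health_score() {
  sed -n 's/.*Overall Health Score: \([0-9][0-9]*\)\/100.*/\1/p' | head -n 1
}

# Read a report on stdin and compare its score with a caller-chosen threshold.
# Returns 0 when the score meets the threshold, 1 when it is below it,
# and 2 when no score line was found.
check_health() {
  local threshold="$1" score
  score="$(health_score)"

  if [ -z "$score" ]; then
    echo "no health score found in input" >&2
    return 2
  fi
  if [ "$score" -lt "$threshold" ]; then
    echo "health score $score is below threshold $threshold" >&2
    return 1
  fi
  echo "health score $score meets threshold $threshold"
}
```

A saved report could then be checked with `check_health 80 < report.txt`, where `report.txt` is a hypothetical file holding the report text.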
Detect Dead Code
# Find unused code
# Output:
# ⚠️ Found 12 potential dead code items:
# [var] L 10: variable 'unused_var'
# [∅] L 42: empty_helper()
# [?] L 58: TODO marker
# [!] L 72: unreachable code
# Export for code review
Analyze Complexity
# Find complex functions that need refactoring
# Output:
# 📊 Files by Complexity:
# src/auth.rs: Cyclomatic 45, Cognitive 38 (HIGH)
# src/parser.rs: Cyclomatic 28, Cognitive 22 (MEDIUM)
# Comprehensive code metrics
Find Code Duplicates
# Identify copy-pasted code
# Output:
# 🔍 Found 8 duplicate code blocks:
# auth.rs:120-145 vs user.rs:89-114 (85% similar)
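
To focus on the closest clones first, the duplicate report can be filtered by similarity. The sketch below assumes each pair is printed on one line in the `A vs B (N% similar)` shape shown in the sample output above. `duplicates_at_least` is a hypothetical helper, not a CodeSearch command.

```shell
#!/usr/bin/env bash

# Read a duplicate report on stdin and print only the pairs whose
# similarity is at or above the given percentage.
duplicates_at_least() {
  local min="$1"
  # Step 1: rewrite "A vs B (N% similar)" as "N A vs B" so N becomes field 1.
  # Step 2: keep rows where N >= min, then drop the leading number again.
  sed -n 's/^[# ]*\(.* vs .*\) (\([0-9][0-9]*\)% similar).*/\2 \1/p' |
    awk -v min="$min" '$1 + 0 >= min + 0 { sub(/^[0-9]+ /, ""); print }'
}
```

Lines that do not match the expected shape, such as the summary header, are ignored.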
3. Understanding Codebases
Codebase Overview
# Get high-level metrics
# Output:
# Overview
# Total files: 156
# Total lines: 45,230
# Languages: Rust (60%), Python (25%), TypeScript (15%)
# Functions: 892
# Classes: 124
Explore Function Relationships
# Generate call graph
# Output:
# Call Graph Analysis:
# Functions: 28
# Function calls: 156
# Recursive: authenticate()
# Dead (never called): legacy_auth()
Control Flow Analysis
# Understand execution paths
# Shows: Basic blocks, branches, loops, unreachable code
Unified Graph (CPG-style, no DB)
# AST + CFG + DFG in one—like Joern's CPG, instant
# Output: Syntax edges, execution edges, data edges
Data-Flow Trace (path-query style)
# Trace variable flow—no extractors, no indexing
Security Pattern Scan
# Instant security checks—eval, exec, SQL concat, etc.
4. Advanced Workflows
Interactive Mode
# Commands in interactive mode:
# authenticate - Search for "authenticate"
# /f - Toggle fuzzy matching
# /i - Toggle case sensitivity
# analyze - Show codebase metrics
# complexity - Show complexity analysis
# deadcode - Find dead code
# help - Show all commands
Search with Ranking
# Get results ranked by relevance
# Best match first:
# src/auth/mod.rs:10 - pub fn authenticate() { ... }
# src/user.rs:45 - fn check_auth() { ... }
Export Results
# CSV for spreadsheets
# Markdown for documentation
# JSON for automation
5. Special Features
Search Git History
# Search through commit history
Search Remote Repositories
# Search GitHub/GitLab without cloning
Build Index for Large Codebases
# Incremental indexing for faster searches
# Watch for changes and auto-update
Common Commands Reference
Search Commands
# Options:
Analysis Commands
Graph Commands
Utility Commands
Real-World Examples
Example 1: Pre-Code Review Checklist
#!/bin/bash
# review.sh - Automated code review checklist
Example 2: Learning a New Codebase
# Step 1: Understand the structure
# Step 2: Find entry points
# Step 3: Explore key modules
# Step 4: Understand function relationships
# Step 5: Find complex code to review
Example 3: Refactoring Workflow
# Before refactoring, find all usages
# Check for complexity issues
# Find similar code that could be consolidated
# After refactoring, verify no old code remains
# Should return: "No matches found"
Example 4: Continuous Quality Monitoring
# Add to CI/CD pipeline
# Fail if high complexity functions detected
complexity=
max_cc=
if [; then
fi
# Fail if new dead code introduced
deadcode_count=
if [; then
fi
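
The two gates above can be sketched as functions that parse the report text shown in the Analyze Complexity and Detect Dead Code samples. This is a sketch under stated assumptions: the reports are read from stdin in that sample format, the limits are chosen by the caller, and `max_cyclomatic`, `dead_code_count`, and `gate` are hypothetical helper names rather than CodeSearch features.

```shell
#!/usr/bin/env bash

# Largest "Cyclomatic N" value found on stdin; prints 0 when none is present.
max_cyclomatic() {
  grep -o 'Cyclomatic [0-9][0-9]*' |
    awk '$2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}

# N from a line like "Found N potential dead code items"; prints 0 when absent.
dead_code_count() {
  sed -n 's/.*Found \([0-9][0-9]*\) potential dead code items.*/\1/p' |
    head -n 1 |
    awk '{ n = $1 } END { print n + 0 }'
}

# Return 1 when a measured value exceeds its limit, 0 otherwise.
gate() {
  local label="$1" value="$2" limit="$3"
  if [ "$value" -gt "$limit" ]; then
    echo "FAIL: $label is $value (limit $limit)" >&2
    return 1
  fi
  echo "OK: $label is $value (limit $limit)"
}
```

With a report saved to a hypothetical `complexity.txt`, a pipeline step could run `gate "max cyclomatic complexity" "$(max_cyclomatic < complexity.txt)" 30 || exit 1`. The limit of 30 is only an illustrative value.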
Demo Project
A comprehensive example project demonstrating all CodeSearch capabilities is available in the examples/demo-project/ directory.
Run the demo:
Demo includes:
- Multi-language codebase (Rust, Python, TypeScript)
- Intentional code quality issues for detection
- All analysis types demonstrated
- Real-world usage examples
Architecture & Quality
Code Quality Standards
- ✅ 100% test pass rate (173 unit + 36 integration + 23 MCP = 232 tests)
- ✅ Zero clippy warnings (clean code)
- ✅ Modular architecture (40+ focused modules)
- ✅ Thread-safe parallel processing with rayon
- ✅ Comprehensive error handling with custom error types
- ✅ Trait abstractions for extensibility and testability
Performance
- Fast: 3-50ms for typical searches (< 1000 files)
- Parallel: Auto-scales to available CPU cores
- Smart caching: 70-90% cache hit rate for repeated searches
- Memory efficient: Streaming file reading, < 100MB for 10K files
Supported Languages
Native Parsers (High Performance):
- Rust - Full AST parsing with zero-allocation tokenizer
- Python - Complete syntax support including async/await
- JavaScript/TypeScript - ES6+, JSX, TSX support
- Go - Structs, interfaces, methods with receivers
- Java - Classes, interfaces, enums, annotations
48+ Additional Languages via regex patterns including: C/C++, Ruby, PHP, Swift, Kotlin, C#, Haskell, Elixir, Erlang, Scala, Lua, Perl, Shell, SQL, YAML, TOML, JSON, and more.
Run `codesearch languages` for the complete list.
Additional Resources
- Demo Project - Hands-on examples
- DEMO_GUIDE.md - Comprehensive usage guide
- ARCHITECTURE.md - Technical details and design
- SPEC.md - Technical specification
- CLAUDE.md - Contributor guide
License
Apache-2.0 License
Built with Rust • Fast • Precise • 48+ Languages
Version: 0.1.8