Greppy
██████╗ ██████╗ ███████╗██████╗ ██████╗ ██╗ ██╗
██╔════╝ ██╔══██╗██╔════╝██╔══██╗██╔══██╗╚██╗ ██╔╝
██║ ███╗██████╔╝█████╗ ██████╔╝██████╔╝ ╚████╔╝
██║ ██║██╔══██╗██╔══╝ ██╔═══╝ ██╔═══╝ ╚██╔╝
╚██████╔╝██║ ██║███████╗██║ ██║ ██║
╚═════╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝ ╚═╝
Sub-millisecond semantic code search and invocation tracing with AI-powered reranking.
No cloud indexing. Works with Ollama (local), Claude, or Gemini. Just greppy search "query" or greppy trace symbol.
What is Greppy?
Greppy is a local code search tool that combines:
- BM25 full-text search via Tantivy for sub-millisecond queries
- AI reranking via Ollama (local), Claude, or Gemini to surface the most relevant results
- Background daemon with file watching for instant, always-up-to-date searches
Why Greppy?
AI coding tools (Claude Code, Cursor, Aider, OpenCode) need fast code search. Existing solutions are either:
- Too slow - grep/ripgrep scan files on every query
- Cloud-dependent - Sourcegraph and GitHub search require network access
- Not semantic - keyword matching misses context
Greppy gives you sub-millisecond indexed search, with optional AI reranking for semantic relevance, and keeps the index entirely on your machine.
Installation
macOS / Linux
|
Windows (PowerShell)
irm https://raw.githubusercontent.com/KBLCode/greppy/main/install.ps1 | iex
Cargo
From Source
Quick Start
# 1. Index your project (one-time setup)
# 2. (Optional) Authenticate for AI-powered reranking
# 3. Search!
That's it! Greppy works immediately after indexing. Authentication is optional but recommended for better results.
Search Modes
Semantic Search (Default)
When configured with an AI provider, Greppy:
- Runs a fast BM25 search to find candidate results
- Sends candidates to AI (Ollama local, Claude, or Gemini) for reranking
- Returns results ordered by semantic relevance
Without an AI provider configured, Greppy automatically falls back to direct BM25 mode.
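The two-stage flow above (cheap lexical retrieval, optional reranking, automatic fallback) can be sketched in a few lines. This is an illustration only, not Greppy's implementation: `keyword_score` is a stand-in for the BM25 stage, and the `reranker` callable is a hypothetical placeholder for whichever AI provider is configured.

```python
from typing import Callable, Optional

Reranker = Callable[[str, list[str]], list[str]]


def keyword_score(query: str, text: str) -> int:
    """Stand-in for the BM25 stage: count occurrences of each query term."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def search(
    query: str,
    docs: list[tuple[str, str]],  # (chunk id, chunk text)
    reranker: Optional[Reranker] = None,
    limit: int = 20,
) -> list[str]:
    # Stage 1: lexical scoring selects and orders candidate chunks.
    scored = [(keyword_score(query, text), doc_id) for doc_id, text in docs]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    candidates = [doc_id for score, doc_id in ranked if score > 0][:limit]

    # Stage 2: rerank if a provider is configured, else keep lexical order.
    if reranker is None:
        return candidates
    return reranker(query, candidates)
```

Passing `reranker=None` models the fallback to direct mode: the caller gets the lexical ordering unchanged.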
Direct Search (BM25 Only)
Pure BM25 search without AI. Faster, but results are ranked by keyword frequency rather than semantic relevance.
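For readers unfamiliar with BM25, here is a minimal scoring sketch. It is not Tantivy's code; the whitespace tokenizer and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen because they are commonly used defaults.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Return one BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    if n == 0:
        return []
    avg_len = sum(len(d) for d in tokenized) / n or 1.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

The score depends only on term statistics, which is why direct mode ranks by keyword frequency rather than meaning.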
Search Options
Usage: greppy search [OPTIONS] <QUERY>
Options:
-d, --direct Direct mode (BM25 only, no AI)
-n, --limit <N> Maximum results (default: 20)
--json JSON output for scripting
-p, --project <PATH> Project path (default: current directory)
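When calling Greppy from a script, it can help to build the argument list programmatically. The helper below is a hypothetical convenience wrapper (the function name and its parameters are not part of Greppy); it only uses the subcommand and flags listed in the usage text above.

```python
from typing import Optional


def build_search_command(
    query: str,
    *,
    direct: bool = False,
    limit: Optional[int] = None,
    json_output: bool = False,
    project: Optional[str] = None,
) -> list[str]:
    """Build an argv list for `greppy search` from the documented options."""
    cmd = ["greppy", "search"]
    if direct:
        cmd.append("--direct")          # BM25 only, no AI
    if limit is not None:
        cmd += ["--limit", str(limit)]  # maximum results
    if json_output:
        cmd.append("--json")            # JSON output for scripting
    if project is not None:
        cmd += ["--project", project]   # project path
    cmd.append(query)
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` on a machine where Greppy is installed.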
Examples
# Find authentication code
# Find all TODOs (direct mode, faster)
# JSON output for scripting
|
# Search a specific project
Trace (Invocation Mapping)
Greppy Trace provides complete codebase invocation mapping - like Sentry's stack traces, but for your entire codebase without running code.
Basic Trace
# Find all invocation paths for a symbol
Output:
╔══════════════════════════════════════════════════════════════════════════════╗
║ TRACE: validateUser ║
║ Defined: utils/validation.ts:8 ║
║ Found: 47 invocation paths from 12 entry points ║
╚══════════════════════════════════════════════════════════════════════════════╝
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Path 1/47 POST /api/auth/login
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
routes.ts:15 → POST /api/auth/login
│
auth.controller.ts:8 → loginController.handle(req, res)
│
auth.service.ts:42 → authService.login(credentials)
│
validation.ts:8 → validateUser(user) ← TARGET
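Conceptually, each path in the output above is a walk through a call graph from an entry point down to the target symbol. The sketch below shows that idea with a depth-first search over a plain dictionary; the graph representation and the function name are assumptions for illustration, not Greppy's internal data structures.

```python
def invocation_paths(
    calls: dict[str, list[str]],
    entry_points: list[str],
    target: str,
) -> list[list[str]]:
    """Enumerate every call path from each entry point to the target."""
    paths: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        if node == target:
            paths.append(path)
            return
        for callee in calls.get(node, []):
            if callee not in path:  # skip nodes already on this path (cycles)
                walk(callee, path + [callee])

    for entry in entry_points:
        walk(entry, [entry])
    return paths


# Call graph mirroring the sample trace, plus two hypothetical entry points.
calls = {
    "POST /api/auth/login": ["loginController.handle"],
    "loginController.handle": ["authService.login"],
    "authService.login": ["validateUser"],
    "POST /api/users": ["validateUser"],
    "GET /health": [],
}
paths = invocation_paths(
    calls, ["POST /api/auth/login", "POST /api/users", "GET /health"], "validateUser"
)
```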
Trace Commands
# Call graph trace (who calls this function)
# Direct mode (no AI, sub-millisecond)
# Reference tracing with code context
# Call graph analysis
# Type tracing (where does this type flow)
# Module tracing (import/export relationships)
# Pattern tracing (find any pattern with regex)
# Data flow analysis
# Impact analysis (what breaks if I change this)
# Dead code detection
# Codebase statistics
# Scope analysis
# Output formats
Composable Operations
Run multiple analyses in a single command:
# Run dead code + stats + cycles together
# Filter all operations to a path
# Summary mode: one-line output per operation
# Combined JSON output for tooling
Summary mode output:
DEAD CODE ANALYSIS
Dead symbols: 61 (unknown=4, function=16, struct=41)
CODEBASE STATISTICS
Files: 5 Symbols: 84 Refs: 1711 Edges: 1688
CIRCULAR DEPENDENCIES
Circular deps: 2
Cross-Reference Dead Code
The --xref flag shows potential callers for dead symbols:
Output:
MessageRequest src/ai/claude.rs:17 No references or calls found
Potential callers:
→ new src/ai/claude.rs:66 Same file - could call this
→ get_access_token src/ai/claude.rs:75 Same file - could call this
→ MessageRequest src/ai/claude.rs:143 Token match - name appears here
This helps you understand why code is dead - is it truly unused, or is there a missing call?
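The idea behind dead code detection with cross-referencing can be sketched as follows. This is a simplified model under stated assumptions: a symbol counts as dead when no call edge points at it, and "potential callers" are limited to the same-file heuristic shown in the output above. The `Symbol` type, the function names, and the `send` symbol are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    file: str
    line: int


def dead_symbols(symbols: list[Symbol], edges: list[tuple[str, str]]) -> list[Symbol]:
    """Symbols that never appear as the callee of any (caller, callee) edge."""
    called = {callee for _, callee in edges}
    return [s for s in symbols if s.name not in called]


def potential_callers(dead: Symbol, symbols: list[Symbol]) -> list[Symbol]:
    """Same-file heuristic: other symbols defined in the dead symbol's file."""
    return [s for s in symbols if s.file == dead.file and s != dead]


symbols = [
    Symbol("MessageRequest", "src/ai/claude.rs", 17),
    Symbol("new", "src/ai/claude.rs", 66),
    Symbol("get_access_token", "src/ai/claude.rs", 75),
    Symbol("send", "src/ai/client.rs", 10),  # hypothetical caller in another file
]
edges = [("send", "new"), ("send", "get_access_token"), ("main", "send")]
dead = dead_symbols(symbols, edges)
```

A real analysis also has to treat entry points and exported symbols specially, since they legitimately have no callers inside the codebase.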
What grep/ripgrep CAN'T do (but greppy can)
| Feature | grep/ripgrep | greppy |
|---|---|---|
| Impact analysis | No | --impact shows callers & affected entry points |
| Dead code detection | No | --dead finds unused symbols |
| Dead code cross-reference | No | --dead --xref shows potential callers |
| Call chain visualization | No | Shows full invocation paths |
| Semantic reference filtering | No | --reads vs --writes vs --kind call |
| Codebase statistics | No | --stats shows symbols, call depth, etc. |
| Circular dependency detection | No | --cycles finds import loops |
| Composable operations | No | --dead --stats --cycles runs all at once |
| Summary mode | No | --summary for condensed output |
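As an example of what graph-based analysis adds over text matching, here is a minimal sketch of import-loop detection using depth-first search. The module graph format and function name are assumptions; this reports one cycle per back edge found, which is not necessarily every elementary cycle in the graph.

```python
def find_cycles(imports: dict[str, list[str]]) -> list[list[str]]:
    """Return import loops found by DFS, each as [a, b, ..., a]."""
    ON_STACK, DONE = 1, 2
    state: dict[str, int] = {}
    stack: list[str] = []
    cycles: list[list[str]] = []

    def visit(node: str) -> None:
        state[node] = ON_STACK
        stack.append(node)
        for dep in imports.get(node, []):
            if state.get(dep) == ON_STACK:
                # Back edge: dep is an ancestor, so the stack slice is a loop.
                cycles.append(stack[stack.index(dep):] + [dep])
            elif dep not in state:
                visit(dep)
        stack.pop()
        state[node] = DONE

    for node in imports:
        if node not in state:
            visit(node)
    return cycles
```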
Authentication
Greppy uses OAuth to authenticate with AI providers. No API keys needed!
Login
- Select your provider using arrow keys:
  - Claude (Anthropic) - Uses your Claude.ai account
  - Gemini (Google) - Uses your Google account
- Complete the OAuth flow in your browser
- You're ready to use semantic search!
Logout
Removes all stored credentials from your system keychain.
How It Works
- Tokens are stored securely in your system keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
- Uses OAuth free tier - no API billing
- Without authentication, searches fall back to direct BM25 mode automatically
Daemon
The background daemon provides sub-millisecond queries and automatic index updates.
Commands
Features
- In-memory indexes - Queries return in <1ms
- File watching - Automatically updates indexes when files change
- Query caching - Repeated queries are instant
Platform Support
| Platform | IPC Method |
|---|---|
| macOS | Unix socket (~/.greppy/daemon.sock) |
| Linux | Unix socket (~/.greppy/daemon.sock) |
| Windows | TCP localhost (port in ~/.greppy/daemon.port) |
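A client that wants to talk to the daemon needs to pick the right transport per platform. The sketch below shows one way to resolve the endpoint from the table above. The function names are hypothetical, and the assumption that `daemon.port` contains a plain decimal port number is a guess about the file format, not something the table confirms.

```python
from pathlib import Path


def parse_port(text: str) -> int:
    """Parse a port file's contents (assumed to be a plain decimal number)."""
    port = int(text.strip())
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def daemon_endpoint(system: str, home: Path) -> tuple:
    """Resolve the daemon endpoint for a platform name such as 'Linux'."""
    if system == "Windows":
        # TCP on localhost; the port is read from the daemon.port file.
        port = parse_port((home / "daemon.port").read_text())
        return ("tcp", "127.0.0.1", port)
    # macOS and Linux use a Unix domain socket.
    return ("unix", str(home / "daemon.sock"))
```

In practice `system` could come from `platform.system()` and `home` from `Path.home() / ".greppy"`.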
Indexing
Basic Usage
# Index current directory
# Index specific project
# Force full re-index
What Gets Indexed
Greppy automatically:
- Respects .gitignore patterns
- Chunks code into semantic units (functions, classes, methods)
- Extracts symbol names for boosted matching
- Skips binary files and common non-code directories
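To make "semantic units" concrete, the sketch below chunks Python source into functions, classes, and methods, and records each symbol name. It uses Python's standard `ast` module as a stand-in parser; how Greppy parses each supported language is not described here, so treat this purely as an illustration of the chunking idea.

```python
import ast


def chunk_python(source: str) -> list[dict]:
    """Split Python source into one chunk per function, class, or method."""
    lines = source.splitlines()
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "symbol": node.name,          # extracted for boosted matching
                "kind": type(node).__name__,
                "start": node.lineno,
                "end": node.end_lineno,
                "text": "\n".join(lines[node.lineno - 1 : node.end_lineno]),
            })
    return sorted(chunks, key=lambda c: c["start"])


source = (
    "def login(user):\n"
    "    return check(user)\n"
    "\n"
    "class Session:\n"
    "    def close(self):\n"
    "        pass\n"
)
chunks = chunk_python(source)
```

Indexing per-symbol chunks rather than whole files is what lets a search return a function body instead of an entire file.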
Supported Languages
TypeScript, JavaScript, Python, Rust, Go, Java, Kotlin, Ruby, PHP, C, C++, C#, Swift, Elixir, Haskell, Lua, Shell, SQL, Vue, Svelte, HTML, CSS, JSON, YAML, Markdown, and more.
Performance
Search Performance
| Mode | Latency | Notes |
|---|---|---|
| Daemon (warm) | <1ms | Index in memory |
| Direct (warm) | 1-10ms | Index on disk |
| Direct (cold) | 50-100ms | First query loads index |
| Semantic (AI) | 500-2000ms | Includes AI reranking |
Benchmark: greppy vs grep vs ripgrep
Tested on a 75k file, 13.7M line TypeScript codebase:
| Query: "userId" | Results | Time | Notes |
|---|---|---|---|
| grep | 2,648 | ~2.5s | Text matching (scans all files) |
| ripgrep | 1,296 | ~0.04s | Text matching (parallel, faster) |
| greppy | 990 | ~0.07s | Semantic refs (knows symbol context) |
| Query: "useState" | Results | Time | Notes |
|---|---|---|---|
| grep | 1,449 | ~2.6s | Includes comments, strings |
| ripgrep | 1,292 | ~0.04s | Includes comments, strings |
| greppy | 1,258 | ~0.08s | Only actual symbol references |
Key difference: grep/ripgrep find text matches. Greppy finds semantic symbol references - it knows when userId is a variable vs a string vs a comment.
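The difference between a text match and a symbol reference is easy to demonstrate. The sketch below uses Python's standard `tokenize` module on Python source as a stand-in for a language-aware index: only identifier tokens count, so occurrences inside strings and comments are ignored. It illustrates the concept and is not how Greppy resolves references.

```python
import io
import tokenize


def symbol_references(source: str, name: str) -> list[tuple[int, int]]:
    """Return (line, column) of every identifier token equal to `name`."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == name:
            hits.append(tok.start)
    return hits


source = (
    "userId = 1\n"
    "# userId in a comment\n"
    'label = "userId"\n'
    "print(userId)\n"
)
text_matches = source.count("userId")          # what a plain text search sees
refs = symbol_references(source, "userId")     # identifier uses only
```

Here the text search reports four matches, while the token-aware search reports the two places where `userId` is actually used as a variable.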
Trace Performance
| Query Type | Time | Notes |
|---|---|---|
| Symbol references | ~70ms | All usages of a symbol |
| Impact analysis | ~75ms | What breaks if you change this |
| Dead code detection | ~78ms | Find unused symbols |
| Codebase statistics | ~600ms | Full analysis |
| Call chain trace | <1ms | Pre-computed call graph |
Token Usage: greppy vs AI Reading Files
When AI tools search code, they typically read entire files. Greppy returns only semantic references with targeted context, dramatically reducing token usage.
Real test on 75k file codebase:
| Query: "userId" (262 files contain it) | Tokens | Savings |
|---|---|---|
| AI reads 20 matching files | 43,493 | baseline |
| greppy --refs -c 2 (50 refs + context) | 3,100 | 93% less |
| Query: "validateFounderAccess" | Tokens | Savings |
|---|---|---|
| AI reads 4 matching files | 7,659 | baseline |
| greppy --refs -c 2 | 532 | 93% less |
| greppy --impact | 170 | 98% less |
Cost savings at $3/1M tokens (Claude):
- Reading 20 files: $0.13 per query
- Using greppy: $0.009 per query
- 14x cost reduction
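The figures above follow directly from the token counts in the tables. A short check, using the $3 per 1M tokens price stated above:

```python
PRICE_PER_MILLION_TOKENS = 3.00  # USD, as stated above


def cost(tokens: int) -> float:
    """Cost in USD of sending `tokens` tokens at the stated price."""
    return tokens * PRICE_PER_MILLION_TOKENS / 1_000_000


def savings_pct(baseline: int, reduced: int) -> float:
    """Percentage of tokens saved relative to the baseline."""
    return 100 * (1 - reduced / baseline)


baseline_cost = cost(43_493)        # AI reads 20 matching files
greppy_cost = cost(3_100)           # greppy --refs -c 2
reduction = baseline_cost / greppy_cost

print(f"${baseline_cost:.2f} vs ${greppy_cost:.4f} per query, {reduction:.0f}x cheaper")
print(f"userId: {savings_pct(43_493, 3_100):.0f}% fewer tokens")
print(f"validateFounderAccess --refs: {savings_pct(7_659, 532):.0f}% fewer tokens")
print(f"validateFounderAccess --impact: {savings_pct(7_659, 170):.0f}% fewer tokens")
```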
System Performance
Indexing speed: ~17,000 chunks/second
Memory usage: ~55MB during indexing
Configuration
Optional config at ~/.greppy/config.toml:
[]
= 20
[]
= ["node_modules", ".git", "dist", "build", "__pycache__"]
[]
= 1048576 # 1MB
= 100000
[]
= 60
= 1000
Environment Variables
| Variable | Description |
|---|---|
| GREPPY_HOME | Override config/data directory (default: ~/.greppy) |
| GREPPY_LOG | Log level: debug, info, warn, error |
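Scripts that need to locate Greppy's files (for example the config file or the daemon socket) can resolve the directory the same way the table describes. The helper name below is hypothetical; only the `GREPPY_HOME` variable and the `~/.greppy` default come from the table.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def greppy_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return GREPPY_HOME if set and non-empty, else ~/.greppy."""
    env = os.environ if env is None else env
    override = env.get("GREPPY_HOME")
    return Path(override) if override else Path.home() / ".greppy"


config_path = greppy_home() / "config.toml"
```

Accepting `env` as a parameter keeps the function easy to test without touching the real environment.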
How It Works
- Indexing - Greppy walks your project, respecting .gitignore, and chunks code into semantic units (functions, classes, methods)
- Storage - Chunks are stored in a Tantivy index with BM25 ranking
- Search - Queries are parsed and matched against the index with symbol name boosting
- AI Reranking - When an AI provider is configured, top BM25 results are sent to it (Ollama, Claude, or Gemini) for semantic reranking
- Watching - The daemon monitors file changes and incrementally updates indexes
Integration with AI Tools
Greppy works great with AI coding assistants:
- Claude Code - Use as a code search tool
- OpenCode - Integrate via CLI
- Cursor - Call from terminal
- Aider - Use for codebase exploration
- Custom MCP servers - JSON output for easy parsing
JSON Output
Troubleshooting
"Not logged in" message
This is informational, not an error. Without authentication, Greppy uses direct BM25 search, which still works well for most queries.
To enable AI reranking:
Daemon won't start
Check if another instance is running:
Index seems outdated
Force a full re-index:
Or start the daemon for automatic updates:
OAuth login fails
- Make sure you have a browser available
- Check your internet connection
- Try logging out and back in:
Web UI
Greppy includes a visual web dashboard for codebase exploration.
Launch
Features
- Multiple Views - Stats, Graph, List, Tree, Tables, Cycles, Timeline
- Interactive Charts - Matrix heatmap, Sankey flow, Force-directed graph
- Live Updates - Real-time sync when files change (via daemon)
- Symbol Details - Click any symbol to see callers, callees, refs
- Dead Code Highlighting - Instantly spot unused code
- Cycle Detection - Visualize circular dependencies
Streamer Mode
For livestreamers and screen sharing, Greppy includes a Streamer Mode that hides sensitive paths:
- Open Settings (gear icon)
- Enable "Streamer Mode"
- Configure hidden patterns (defaults: .env*, *secret*, *credential*, etc.)
When enabled:
- Sensitive file paths are replaced with [HIDDEN]
- Redaction happens server-side (not visible in network requests)
- Visual banner indicates streamer mode is active
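Pattern-based redaction of this kind can be sketched with glob matching. The example below uses the default patterns listed above and replaces a matching path with `[HIDDEN]`. Matching each path component case-insensitively is an assumption made for the sketch; the exact matching rules Greppy applies are not documented here.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

DEFAULT_PATTERNS = [".env*", "*secret*", "*credential*"]


def redact(path: str, patterns: list[str] = DEFAULT_PATTERNS) -> str:
    """Replace the path with [HIDDEN] if any component matches a pattern."""
    parts = PurePosixPath(path).parts
    for part in parts:
        # Lowercase the component so matching does not depend on case.
        if any(fnmatchcase(part.lower(), pattern) for pattern in patterns):
            return "[HIDDEN]"
    return path
```

Doing this before the response leaves the server is what keeps the original path out of network requests as well as out of the rendered page.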
Views
| View | Description |
|---|---|
| Stats | Overview dashboard with charts |
| Graph | Force-directed dependency graph |
| List | Sortable/filterable symbol table |
| Tree | File tree with symbol counts |
| Tables | Matrix heatmap of file dependencies |
| Cycles | Circular dependency visualization |
| Timeline | Index history and snapshots |
License
MIT
Links
- Repository: https://github.com/KBLCode/greppy
- Issues: https://github.com/KBLCode/greppy/issues
- Releases: https://github.com/KBLCode/greppy/releases