Reflex
Local-first, full-text code search engine for AI coding workflows
Reflex is a blazingly fast, trigram-based code search engine designed for developers and AI coding assistants. Unlike symbol-only tools, Reflex finds every occurrence of a pattern (function calls, variable usage, comments, and more), with full-text query times under 100 ms on large codebases.
Features
- Complete Coverage: Find every occurrence, not just symbol definitions
- Blazing Fast: Sub-100ms queries on 10k+ files via trigram indexing
- Symbol-Aware: Runtime tree-sitter parsing for precise symbol filtering
- AST Pattern Matching: Structure-aware search with Tree-sitter queries
- Incremental: Only reindexes changed files (blake3 hashing)
- Multi-Language: Rust, TypeScript/JavaScript, Vue, Svelte, PHP, Python, Go, Java, C, C++
- AI-Ready: Clean JSON output built for LLM tools and automation
- HTTP API: REST API for editor plugins and external tools
- Local-First: Fully offline, all data stays on your machine
- Regex Support: Trigram-optimized regex search with pattern matching
- Deterministic: Same query → same results (no probabilistic ranking)
Quick Start
Installation
# Clone and build from source
# Binary will be at target/release/rfx
Basic Usage
# Index your codebase
# Full-text search (finds all occurrences)
# → Finds: function definitions + all call sites
# Symbol-only search (definitions only)
# → Finds: only the function definition
# Regex search
# Filter by language and symbol kind
# Export as JSON for AI agents
# Get index statistics
# Clear cache
Command Reference
rfx index
Build or update the local search index.
Examples:
# Incremental index (only changed files)
# Force full reindex
# Index only Rust and TypeScript files
rfx query
Search the codebase.
Examples:
# Find all occurrences of "hello" (full-text)
# Find function definitions named "parse"
# Regex: find test functions
# Language filter: Rust files only
# File path filter: only src/ directory
# JSON output for AI tools
# Count matches
# Set custom timeout (10 seconds)
# AST pattern matching (structure-aware search)
AST Pattern Matching
Reflex supports structure-aware code search using Tree-sitter AST queries. This allows you to search for specific code structures (like functions, classes, traits) rather than just text patterns.
Important:
- AST queries require --lang to be specified
- AST queries must have trigram pre-filtering (pattern text) for performance
- Query patterns must include captures using @name syntax
Supported Languages for AST Queries
- Rust (rust)
- TypeScript (typescript)
- JavaScript (javascript)
- PHP (php)
S-Expression Query Syntax
AST patterns use Lisp-like S-expressions with captures to match Tree-sitter AST nodes:
Basic pattern structure:
(node_type) @capture_name Match and capture any node of this type
(node_type (child_type)) @parent Match node with specific child
(node_type field: (child)) @node Match node with named field
IMPORTANT: You must use capture syntax @name to extract matched nodes. Without captures, matches will be found but not returned.
Common AST Patterns by Language
Rust:
# Find all functions
# Find all struct definitions
# Find all enum definitions
# Find all trait definitions
# Find all impl blocks
TypeScript/JavaScript:
# Find all function declarations
# Find all class declarations
# Find all interface declarations
# Find all arrow functions
# Find all method definitions
PHP:
# Find all function definitions
# Find all class declarations
# Find all trait declarations
# Find all enum declarations (PHP 8.1+)
Advanced AST Pattern Examples
Multiple captures:
# Find functions and extract the name
# Find classes with specific body
Combining with other filters:
# AST query + file filter
# AST query + limit results
# AST query + JSON output for AI agents
How AST Queries Work
- Phase 1 - Trigram Filtering: Text pattern narrows 10,000+ files → ~10-100 candidates
- Phase 2 - AST Matching: Parse candidate files with Tree-sitter and match AST pattern
- Phase 3 - Results: Return matching code structures with symbol names and spans
Performance: AST queries add 2-224ms overhead (parsing only candidate files, not entire codebase)
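The three phases above can be sketched as follows. This is a minimal illustration, not Reflex's implementation: Python's built-in ast module stands in for Tree-sitter, a plain substring test stands in for the trigram index, and find_functions is a hypothetical helper name.

```python
import ast

def find_functions(files, text_pattern):
    """Prefilter files by text, then parse only the surviving candidates."""
    # Phase 1: cheap text filter (a substring test stands in for trigram lookup).
    candidates = {path: src for path, src in files.items() if text_pattern in src}

    matches = []
    for path, src in candidates.items():
        # Phase 2: parse the candidate and walk its syntax tree.
        tree = ast.parse(src)
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef) and text_pattern in node.name:
                # Phase 3: report the symbol name and its line span.
                matches.append((path, node.name, node.lineno, node.end_lineno))
    return sorted(candidates), sorted(matches)

files = {
    "parser.py": "def parse_tree(src):\n    return src\n\ndef helper():\n    return parse_tree('x')\n",
    "notes.py": "# nothing relevant here\nx = 1\n",
}
parsed, matches = find_functions(files, "parse_tree")
print(parsed)   # only files that passed the text filter were parsed
print(matches)  # function definitions whose name contains the pattern
```

Only parser.py is parsed here, which is the point of the prefilter: parsing cost scales with the number of candidates rather than with the size of the codebase.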
Finding Available Node Types
To discover available AST node types for your language:
- Visit Tree-sitter playground: https://tree-sitter.github.io/tree-sitter/playground
- Select your language grammar
- Paste sample code to see AST structure
- Use node type names in parentheses: (node_type)
Example node types by language:
- Rust: function_item, struct_item, enum_item, trait_item, impl_item, mod_item, const_item, static_item
- TypeScript/JavaScript: function_declaration, class_declaration, interface_declaration, arrow_function, method_definition, variable_declarator
- PHP: function_definition, class_declaration, trait_declaration, interface_declaration, enum_declaration, method_declaration
Difference from Symbol Search
| Feature | Symbol Search (--symbols) | AST Query (--ast) |
|---|---|---|
| Purpose | Find symbol definitions | Match specific code structures |
| Filter by | Symbol kind (function, class, etc.) | AST node patterns |
| Flexibility | Predefined kinds only | Any Tree-sitter node pattern |
| Speed | Fast (simple symbol extraction) | Slightly slower (full AST matching) |
| Use case | "Find all functions" | "Find all async functions with pub modifier" |
rfx stats
Display index statistics.
Example output:
Reflex Index Statistics
-----------------------
Total Files: 1,247
Total Size: 12.4 MB
Cache Size: 2.1 MB
Last Updated: 2025-11-03 14:32:45
Languages: Rust (842), TypeScript (305), Python (100)
rfx clear
Clear the search index.
rfx list-files
List all indexed files.
rfx serve
Start an HTTP API server for programmatic access.
API Endpoints:
- GET /query - Search the codebase
  - Query params: q, lang, kind, limit, symbols, regex, exact, expand, file, timeout
  - Returns: QueryResponse JSON with results and index status
- GET /stats - Get index statistics
  - Returns: IndexStats JSON with file counts, sizes, language breakdowns
- POST /index - Trigger reindexing
  - Body: {"force": boolean, "languages": [string]}
  - Returns: IndexStats JSON after indexing completes
- GET /health - Health check
  - Returns: "Reflex is running"
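As a rough client-side sketch, the GET /query endpoint could be called from Python as shown below. The host and port in BASE_URL are placeholders (the server's actual address is not stated here), the lowercase encoding of boolean parameters is an assumption, and build_query_url and run_query are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:7878"  # assumption: placeholder host and port

def build_query_url(pattern, base_url=BASE_URL, **params):
    """Build a GET /query URL from a pattern and optional query parameters."""
    query = {"q": pattern}
    for key, value in params.items():
        if value is None:
            continue
        # Booleans are sent as lowercase strings; this encoding is an assumption.
        query[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{base_url}/query?{urlencode(query)}"

def run_query(pattern, **params):
    """Send the request and decode the JSON body (needs a running server)."""
    with urlopen(build_query_url(pattern, **params)) as response:
        return json.load(response)

url = build_query_url("parse_tree", lang="rust", symbols=True, limit=10)
print(url)
```

Building the URL separately from sending it keeps the parameter encoding easy to inspect without a server running.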
Example Usage:
# Start the server
# Query from another terminal (or use in AI tools/editor plugins)
# Get stats
# Trigger indexing
Features:
- CORS enabled for browser clients
- Supports all CLI query options via query parameters
- JSON responses compatible with AI agents and automation tools
- Synchronous indexing (returns after completion)
rfx mcp
Start as an MCP (Model Context Protocol) server for AI coding assistants like Claude Code.
What is MCP?
MCP is an open standard for connecting AI assistants to external tools and data sources. Reflex implements MCP over stdio, allowing AI coding assistants to search your codebase directly.
Configuration for Claude Code:
Add to ~/.claude/claude_code_config.json:
Available MCP Tools:
- search_code - Full-text or symbol search
  - Parameters: pattern (required), lang, kind, symbols, exact, file, limit, expand
  - Returns: Search results with file paths, line numbers, and context
- search_regex - Regex pattern matching with trigram optimization
  - Parameters: pattern (required), lang, file, limit
  - Returns: Regex search results
- search_ast - Structure-aware AST pattern matching
  - Parameters: pattern, ast_pattern, lang (all required), file, limit
  - Returns: AST query results
- index_project - Trigger reindexing
  - Parameters: force (optional), languages (optional array)
  - Returns: Index statistics after completion
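For readers curious what a client sends over stdio, here is a sketch of a single tool invocation. MCP messages are JSON-RPC 2.0, and tools are invoked with the tools/call method; the argument value types (for example, symbols as a boolean) are assumptions, and make_tool_call is a hypothetical helper. A real session also performs an initialize handshake first, which is omitted here.

```python
import json

def make_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

request = make_tool_call(
    1,
    "search_code",
    # Argument types are assumptions based on the parameter names listed above.
    {"pattern": "parse_tree", "lang": "rust", "symbols": True, "limit": 5},
)

# The stdio transport sends one JSON message per line, so serialize compactly.
line = json.dumps(request)
print(line)
```

In practice the MCP client (such as Claude Code) builds these messages itself; the sketch is only meant to show the shape of the request.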
Usage in Claude Code:
Once configured, Claude Code will automatically:
- Spawn rfx mcp when the session starts
- Expose Reflex tools for natural language queries
- Handle process lifecycle (start/stop/restart)
Example prompts:
- "Search for all async functions in this project"
- "Find usages of the parse_tree function"
- "Show me all struct definitions in Rust files"
Why stdio MCP?
- Zero port conflicts: No network configuration needed
- Automatic lifecycle: Claude Code manages the process
- Per-session isolation: Each session gets its own subprocess
- Crash recovery: Client automatically respawns on failure
- Secure: OS-sandboxed, no network exposure
Supported Languages
| Language | Extensions | Symbol Extraction |
|---|---|---|
| Rust | .rs | Functions, structs, enums, traits, impls, modules, methods |
| TypeScript | .ts, .tsx, .mts, .cts | Functions, classes, interfaces, types, enums, React components |
| JavaScript | .js, .jsx, .mjs, .cjs | Functions, classes, constants, methods, React components |
| Vue | .vue | Functions, constants, methods from <script> blocks |
| Svelte | .svelte | Functions, variables, reactive declarations |
| PHP | .php | Functions, classes, interfaces, traits, methods, namespaces, enums |
| Python | .py | Functions, classes, methods, decorators, lambdas |
| Go | .go | Functions, types, interfaces, methods, constants |
| Java | .java | Classes, interfaces, enums, methods, fields, constructors |
| C | .c, .h | Functions, structs, enums, unions, typedefs |
| C++ | .cpp, .hpp, .cxx | Functions, classes, namespaces, templates, methods |
Note: Full-text search works on all file types regardless of parser support. Symbol filtering requires a language parser.
Architecture
Reflex uses a trigram-based inverted index combined with runtime symbol detection:
Indexing Phase
- Extract trigrams (3-character substrings) from all files
- Build inverted index: trigram → [file_id, line_no]
- Store full file contents in memory-mapped content.bin
- No tree-sitter parsing (fast indexing)
Query Phase
- Full-text queries: Intersect trigram posting lists → verify matches
- Symbol queries: Trigrams narrow to ~10-100 candidates → parse with tree-sitter → filter symbols
- Memory-mapped I/O for instant cache access
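The indexing and full-text query steps above can be illustrated with a small in-memory sketch. It follows the trigram → [file_id, line_no] layout described here but is not Reflex's implementation: there is no on-disk format or memory mapping, and the function names are hypothetical.

```python
from collections import defaultdict

def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(files):
    """Map each trigram to the set of (file_id, line_no) pairs containing it."""
    index = defaultdict(set)
    for file_id, content in files.items():
        for line_no, line in enumerate(content.splitlines(), start=1):
            for gram in trigrams(line):
                index[gram].add((file_id, line_no))
    return index

def search(index, files, pattern):
    """Intersect posting lists for the pattern's trigrams, then verify each hit."""
    grams = trigrams(pattern)
    if not grams:
        raise ValueError("pattern must be at least 3 characters long")
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    hits = []
    for file_id, line_no in sorted(candidates):
        line = files[file_id].splitlines()[line_no - 1]
        if pattern in line:  # verification removes trigram false positives
            hits.append((file_id, line_no, line.strip()))
    return hits

files = {
    "a.rs": "fn parse_tree() {}\nfn main() { parse_tree(); }",
    "b.rs": "// tree parsing is done elsewhere",
}
index = build_index(files)
results = search(index, files, "parse_tree")
print(results)
```

The verification step matters because a line can contain every trigram of a pattern without containing the pattern itself; the intersection only guarantees candidates, not matches.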
Cache Structure (.reflex/)
.reflex/
meta.db # SQLite: file metadata, stats, config
trigrams.bin # Inverted index (memory-mapped)
content.bin # Full file contents (memory-mapped)
hashes.json # File hashes for incremental indexing
config.toml # Index settings
Performance
Reflex is built for fast, structure-aware local code search. The benchmarks below show measured latencies:
Query Performance (Real-World Benchmarks)
| Codebase | Files | Full-Text Query | Symbol Query | Regex Query |
|---|---|---|---|---|
| Reflex (small) | 96 | 5-6 ms | 581 ms | 6 ms |
| Test corpus (medium) | 100-500 | 2 ms | 944 ms | 2 ms |
| Large project | 1,000+ | 2-3 ms | 1-2 sec | 2-3 ms |
Key Insights:
- Full-text & Regex: Blazing fast (2-6ms) regardless of codebase size
- Symbol queries: Slower (500ms-2s) due to runtime tree-sitter parsing of candidate files
- Cached queries: 1ms average for repeated queries (memory-mapped index)
Indexing Performance (Release Build)
| Operation | Files | Time | Notes |
|---|---|---|---|
| Initial index | 100 | 95ms | Full trigram extraction + content store |
| Initial index | 500 | 106ms | Parallel processing with 80% CPU cores |
| Initial index | 1,000 | 104ms | Batch-flush mode for large codebases |
| Incremental | 10/100 changed | 32ms | Only rehashes changed files |
| Large file | 1000 lines | 98ms | Memory-efficient line-by-line processing |
Indexing Characteristics:
- Parallel: Uses 80% of available CPU cores by default
- Incremental: Only reindexes files with changed blake3 hashes
- Memory-efficient: Batch processing for 10k+ file codebases
- gitignore-aware: Automatically skips ignored files
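Hash-based incremental indexing, as described above, can be sketched like this. BLAKE2b from Python's standard library is used as a stand-in because blake3 is not in the standard library, and plan_reindex is a hypothetical helper, not part of Reflex.

```python
import hashlib

def content_hash(data):
    """Hash file bytes. BLAKE2b is a stand-in here; Reflex itself uses blake3."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()

def plan_reindex(previous_hashes, current_files):
    """Compare stored hashes with current contents and report what changed."""
    current_hashes = {path: content_hash(data) for path, data in current_files.items()}
    # New files have no stored hash, so they count as changed.
    changed = sorted(
        path for path, digest in current_hashes.items()
        if previous_hashes.get(path) != digest
    )
    removed = sorted(set(previous_hashes) - set(current_hashes))
    return changed, removed, current_hashes

# First run: everything is new, so every file would be indexed.
old_files = {"a.rs": b"fn a() {}", "b.rs": b"fn b() {}", "c.rs": b"fn c() {}"}
_, _, stored = plan_reindex({}, old_files)

# Second run: b.rs was edited, c.rs was deleted, d.rs was added.
new_files = {"a.rs": b"fn a() {}", "b.rs": b"fn b() { todo!() }", "d.rs": b"fn d() {}"}
changed, removed, stored = plan_reindex(stored, new_files)
print(changed, removed)
```

Only the files in changed would need their trigrams re-extracted; unchanged files keep their existing index entries.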
Configuration
Reflex respects .gitignore files automatically. Additional configuration via .reflex/config.toml:
# Example configuration (auto-generated on first index)
[]
= 10
= false
[]
= ["rust", "typescript", "python", "go", "java", "c", "cpp", "php"]
[]
= false
AI Integration
Reflex outputs clean JSON for AI coding assistants:
Example JSON output:
Use Cases
For Developers
- Code Navigation: Find all usages of a function/class
- Refactoring: Identify all call sites before renaming
- Code Review: Search for patterns across files
- Debugging: Locate where variables are used
For AI Coding Assistants
- Context Gathering: Retrieve relevant code snippets
- Symbol Lookup: Find function definitions and signatures
- Pattern Analysis: Search for architectural patterns
- Test Coverage: Find test files and assertions
For Teams
- Code Search: Local alternative to Sourcegraph
- Documentation: Find examples of API usage
- Onboarding: Explore unfamiliar codebases
- Security: Search for potential vulnerabilities
Testing
Reflex has 330 comprehensive tests covering all functionality:
Test Breakdown
- 261 unit tests: Core modules (cache, indexer, query, parsers, trigrams, AST)
- 42 corpus tests: Real-world code samples across all supported languages
- 17 integration tests: End-to-end workflows, multi-language support, error handling
- 10 performance tests: Indexing speed, query latency, scalability benchmarks
Test Categories
- Language parsers: 18 languages × 5-15 tests each = ~150 tests
- Trigram indexing: Extraction, searching, persistence, memory-mapping
- Query engine: Full-text, symbol, regex, AST pattern matching
- Cache management: SQLite persistence, incremental indexing, branch tracking
- Error handling: Corrupted cache detection, disk space validation, timeout handling
Running Tests
# Run all tests (fast, debug build)
# Run all tests with output
# Run specific test module
# Run integration tests only
# Run performance tests (release build for accurate benchmarks)
# Run corpus tests (real-world code samples)
Test Coverage
All tests pass on:
- Linux (Ubuntu, Debian, Arch)
- macOS (Intel & ARM)
- Windows 10/11
- CI/CD pipelines
Quality metrics:
- Zero failing tests
- Zero flaky tests
- Deterministic results (same input → same output)
- Fast execution (<5s for all tests in debug mode)
Contributing
Contributions welcome! Reflex is built to be:
- Fast: Sub-100ms queries on large codebases
- Accurate: Complete coverage with deterministic results
- Extensible: Easy to add new language parsers
See ARCHITECTURE.md for implementation details.
Documentation
- ARCHITECTURE.md: System design, data formats, extension guide
- CLAUDE.md: Project overview and development workflow
- .context/TODO.md: Implementation roadmap and task tracking
Roadmap
v1.0.0 Production Ready
- Core Features
  - Trigram-based full-text search
  - Runtime symbol detection (tree-sitter)
  - AST pattern matching
  - Regex support with trigram optimization
  - 18 language parsers (Rust, TS/JS, Vue, Svelte, PHP, Python, Go, Java, C, C++, C#, Ruby, Kotlin, Zig)
- Production Readiness
  - Comprehensive testing (330 tests: 261 unit + 42 corpus + 17 integration + 10 performance)
  - Disk space validation before indexing
  - Corrupted cache detection and recovery
  - Enhanced error messages with actionable guidance
  - Cross-platform compatibility (Linux, macOS, Windows)
  - Performance benchmarks in documentation
- API & Integrations
  - HTTP REST API (rfx serve)
  - MCP server for AI agents (rfx mcp)
  - File watcher with auto-reindex (rfx watch)
  - JSON output for automation
- Documentation
  - Comprehensive README.md with examples
  - ARCHITECTURE.md with system design
  - CLAUDE.md for AI development workflow
  - Rustdoc comments for all public APIs
Next Phase: Advanced Features
- Interactive mode (rfx interactive)
- Semantic query building (natural language to Reflex query translation)
- Graph queries (imports/exports, call graph)
- Pre-built binaries for all platforms (cargo-dist)
- crates.io publication
License
MIT License - see LICENSE for details.
Acknowledgments
Built with:
- tree-sitter - Incremental parsing
- rkyv - Zero-copy deserialization
- memmap2 - Memory-mapped I/O
- rusqlite - SQLite bindings
- blake3 - Fast hashing
- ignore - gitignore support
Inspired by:
- Zoekt - Trigram-based code search
- Sourcegraph - Code search for teams
- ripgrep - Fast text search
Made with ❤️ for developers and AI coding assistants