§Loregrep: Fast Repository Indexing for Coding Assistants
Loregrep is a high-performance repository indexing library that parses codebases into fast, searchable in-memory indexes. It’s designed to provide coding assistants and AI tools with structured access to code functions, structures, dependencies, and call graphs.
§What It Does
- Parses code files using tree-sitter for accurate syntax analysis
- Indexes functions, structs, imports, exports, and relationships in memory
- Provides 6 standardized tools that coding assistants can call to query the codebase
- Enables AI systems to understand code structure without re-parsing
§What It’s NOT
- ❌ Not an AI tool itself (provides data TO AI systems)
- ❌ Not a traditional code analysis tool (no linting, metrics, complexity analysis)
§Core Architecture
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Code Files    │───▶│   Tree-sitter    │───▶│   In-Memory     │
│  (.rs, .py,     │    │    Parsing       │    │    RepoMap      │
│   .ts, etc.)    │    │                  │    │    Indexes      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                        │
                                                        ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ Coding Assistant│◀───│  6 Query Tools   │◀───│   Fast Lookups  │
│  (Claude, GPT,  │    │ (search, analyze,│    │  (functions,    │
│   Cursor, etc.) │    │  dependencies)   │    │   structs, etc.)│
└─────────────────┘    └──────────────────┘    └─────────────────┘
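The "In-Memory RepoMap Indexes" and "Fast Lookups" stages of the diagram can be pictured with a minimal, std-only sketch. `FunctionEntry` and `FunctionIndex` are hypothetical names invented for illustration; they are not loregrep's internal types, and the real index likely tracks more than function names.

```rust
use std::collections::HashMap;

/// One indexed function. Field names are illustrative, not loregrep's types.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionEntry {
    pub name: String,
    pub file: String,
    pub line: usize,
}

/// Hypothetical stand-in for an in-memory index: built once, queried many times.
#[derive(Default)]
pub struct FunctionIndex {
    by_name: HashMap<String, Vec<FunctionEntry>>,
}

impl FunctionIndex {
    /// Record a function found by the parsing stage.
    pub fn insert(&mut self, entry: FunctionEntry) {
        self.by_name.entry(entry.name.clone()).or_default().push(entry);
    }

    /// Exact-name lookup: a single hash probe, no re-parsing.
    pub fn get(&self, name: &str) -> &[FunctionEntry] {
        self.by_name.get(name).map(|v| v.as_slice()).unwrap_or(&[])
    }

    /// Substring search over indexed names, capped at `limit` results.
    pub fn search(&self, pattern: &str, limit: usize) -> Vec<&FunctionEntry> {
        let mut hits: Vec<&FunctionEntry> = self
            .by_name
            .iter()
            .filter(|(name, _)| name.contains(pattern))
            .flat_map(|(_, entries)| entries.iter())
            .collect();
        // Sort so results are deterministic despite HashMap iteration order.
        hits.sort_by(|a, b| (&a.name, &a.file, a.line).cmp(&(&b.name, &b.file, b.line)));
        hits.truncate(limit);
        hits
    }
}
```

The point of the sketch is that parsing happens once at insert time, after which each query only touches the map.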
§Quick Start
§Zero-Configuration Auto-Discovery (Recommended)
use loregrep::LoreGrep;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One-line setup with automatic project detection
    let mut loregrep = LoreGrep::auto_discover(".")?;
    // 🔍 Detected project languages: rust, python
    // ✅ Rust analyzer registered successfully
    // ✅ Python analyzer registered successfully
    // 📁 Configuring file patterns for detected languages
    // 🎆 LoreGrep configured with 2 language(s): rust, python

    // Scan with comprehensive feedback
    let scan_result = loregrep.scan(".").await?;
    // 🔍 Starting repository scan... 📁 Found X files... 📊 Summary

    println!("Indexed {} files with {} functions",
             scan_result.files_scanned,
             scan_result.functions_found);
    Ok(())
}
§Manual Configuration with Enhanced Builder
use loregrep::LoreGrep;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Full control with enhanced builder pattern
    let mut loregrep = LoreGrep::builder()
        .with_rust_analyzer()           // ✅ Real-time feedback
        .with_python_analyzer()         // ✅ Registration confirmation
        .optimize_for_performance()     // 🚀 Speed-optimized preset
        .exclude_test_dirs()            // 🚫 Skip test directories
        .max_file_size(1024 * 1024)     // 1MB limit
        .max_depth(10)                  // Directory depth limit
        .build()?;                      // 🎆 Configuration summary

    let scan_result = loregrep.scan("/path/to/your/repo").await?;
    Ok(())
}
§Integration with Coding Assistants
The library provides 6 standardized tools that AI coding assistants can call:
use loregrep::LoreGrep;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Option 1: Zero-configuration setup
    // Auto-detects languages and configures appropriate analyzers
    let mut loregrep = LoreGrep::auto_discover(".")?;

    // Option 2: Manual setup with presets (use one of these instead of Option 1)
    // let mut loregrep = LoreGrep::rust_project(".")?;     // Rust-optimized
    // let mut loregrep = LoreGrep::python_project(".")?;   // Python-optimized
    // let mut loregrep = LoreGrep::polyglot_project(".")?; // Multi-language

    // Scan with enhanced feedback
    loregrep.scan(".").await?;

    // Tool 1: Search for functions (with file path information)
    let result = loregrep.execute_tool("search_functions", json!({
        "pattern": "parse",
        "limit": 20
    })).await?;

    // Tool 2: Find function callers with cross-file analysis
    let callers = loregrep.execute_tool("find_callers", json!({
        "function_name": "parse_config"
    })).await?;

    // Tool 3: Analyze specific file
    let analysis = loregrep.execute_tool("analyze_file", json!({
        "file_path": "src/main.rs"
    })).await?;

    Ok(())
}
§Available Tools for AI Integration
// Get tool definitions for your AI system
let tools = LoreGrep::get_tool_definitions();
// 6 tools available:
// 1. search_functions - Find functions by name/pattern
// 2. search_structs - Find structures by name/pattern
// 3. analyze_file - Get detailed file analysis
// 4. get_dependencies - Find imports/exports for a file
// 5. find_callers - Get function call sites
// 6. get_repository_tree - Get repository structure and overview
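To show how a host application might route an assistant's tool call to one of the six tools listed above, here is a std-only dispatch sketch. `dispatch` and `ToolOutcome` are hypothetical names, and a string map stands in for the JSON arguments. The argument names `pattern`, `function_name`, and `file_path` appear in the examples on this page; reusing them for `search_structs` and `get_dependencies` is an assumption. In real use, `LoreGrep::execute_tool` performs this routing.

```rust
use std::collections::HashMap;

/// Outcome of a tool call in this sketch (hypothetical, not loregrep's `ToolResult`).
#[derive(Debug, PartialEq)]
pub enum ToolOutcome {
    Success(String),
    Error(String),
}

/// Route a tool call by name. `args` stands in for the JSON arguments an
/// assistant would send; a string map keeps the sketch std-only.
pub fn dispatch(tool_name: &str, args: &HashMap<String, String>) -> ToolOutcome {
    // Helper: fetch a required argument or produce an error outcome.
    let required = |key: &str| {
        args.get(key)
            .cloned()
            .ok_or_else(|| ToolOutcome::Error(format!("missing argument `{key}`")))
    };

    let result = match tool_name {
        "search_functions" | "search_structs" => {
            required("pattern").map(|p| format!("{tool_name}: pattern={p}"))
        }
        "analyze_file" | "get_dependencies" => {
            required("file_path").map(|f| format!("{tool_name}: file_path={f}"))
        }
        "find_callers" => {
            required("function_name").map(|n| format!("{tool_name}: function_name={n}"))
        }
        "get_repository_tree" => Ok(tool_name.to_string()),
        other => Err(ToolOutcome::Error(format!("unknown tool `{other}`"))),
    };

    match result {
        Ok(summary) => ToolOutcome::Success(summary),
        Err(err) => err,
    }
}
```

Returning an error value for unknown tools or missing arguments, rather than panicking, lets the assistant see the failure and retry with corrected input.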
§Architecture Overview
§Core Components
- LoreGrep: Main API facade with builder pattern configuration
- RepoMap: Fast in-memory indexes with lookup optimization
- RepositoryScanner: File discovery with gitignore support
- Language Analyzers: Tree-sitter based parsing (Rust and Python complete, others on roadmap)
- Tool System: 6 standardized tools for AI integration
§Design Characteristics
- Architecture: Fast in-memory indexing with tree-sitter parsing
- Concurrency: Thread-safe with Arc<Mutex<>> design
- Scalability: Memory usage scales linearly with codebase size
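The `Arc<Mutex<>>` pattern named in the list above can be sketched with the standard library alone. `SharedIndex` and `concurrent_lookups` are hypothetical names, and the map of function name to file is a deliberately simplified stand-in for the real index.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared index: function name -> defining file.
pub type SharedIndex = Arc<Mutex<HashMap<String, String>>>;

/// Run one lookup per name on its own thread against the same index.
pub fn concurrent_lookups(index: &SharedIndex, names: &[&str]) -> Vec<Option<String>> {
    let handles: Vec<_> = names
        .iter()
        .map(|name| {
            let index = Arc::clone(index);
            let name = name.to_string();
            thread::spawn(move || {
                // The lock is held only for the duration of the lookup.
                let guard = index.lock().expect("index mutex poisoned");
                guard.get(&name).cloned()
            })
        })
        .collect();

    // Join in spawn order so results line up with `names`.
    handles
        .into_iter()
        .map(|h| h.join().expect("lookup thread panicked"))
        .collect()
}
```

A `Mutex` serializes access, so readers briefly wait on each other. A read-heavy workload might prefer `RwLock`, though that is a design alternative and not something this page states about loregrep.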
§Language Support
Language | Status | Functions | Structs | Imports | Calls |
---|---|---|---|---|---|
Rust | ✅ Full | ✅ | ✅ | ✅ | ✅ |
Python | ✅ Full | ✅ | ✅ | ✅ | ✅ |
TypeScript | 📋 Roadmap | - | - | - | - |
JavaScript | 📋 Roadmap | - | - | - | - |
Go | 📋 Roadmap | - | - | - | - |
Note: Languages marked “📋 Roadmap” are future planned additions.
§Integration Examples
§CLI Interactive Mode
# Start interactive AI-powered query session
loregrep query --interactive
# Or run a single query
loregrep query "What functions handle authentication?"
§With Claude/OpenAI
// Provide tools to your AI client
let tools = LoreGrep::get_tool_definitions();
// Send to Claude/OpenAI as available tools
// When AI calls a tool, execute it:
let result = loregrep.execute_tool(&tool_name, tool_args).await?;
§With MCP (Model Context Protocol)
// MCP server integration is planned for future releases
// Will provide standard MCP interface for tool calling
§File Watching Integration
use notify::{Watcher, RecursiveMode, watcher};
use std::sync::mpsc::channel;
use std::time::Duration;

// Watch for file changes and re-index.
// This uses the notify 4.x debounced API, whose channel yields events directly.
let (tx, rx) = channel();
let mut watcher = watcher(tx, Duration::from_secs(2))?;
watcher.watch("/path/to/repo", RecursiveMode::Recursive)?;

// Re-scan whenever a debounced event arrives
for _event in rx {
    loregrep.scan("/path/to/repo").await?;
}
§Configuration Options
§Enhanced Builder with Convenience Methods
use loregrep::LoreGrep;

// Performance-optimized configuration
let fast_loregrep = LoreGrep::builder()
    .with_rust_analyzer()           // ✅ Analyzer registration feedback
    .optimize_for_performance()     // 🚀 512KB limit, depth 8, skip binaries
    .exclude_test_dirs()            // 🚫 Skip test directories
    .exclude_vendor_dirs()          // 🚫 Skip vendor/dependencies
    .build()?;                      // 🎆 Configuration summary

// Comprehensive analysis configuration
let thorough_loregrep = LoreGrep::builder()
    .with_all_analyzers()           // ✅ All available language analyzers
    .comprehensive_analysis()       // 🔍 5MB limit, depth 20, more file types
    .include_config_files()         // ✅ Include TOML, JSON, YAML configs
    .build()?;

// Traditional manual configuration (still supported)
let manual_loregrep = LoreGrep::builder()
    .max_file_size(2 * 1024 * 1024)         // 2MB file size limit
    .max_depth(15)                          // Max directory depth
    .file_patterns(vec!["*.rs", "*.py"])    // File extensions to scan
    .exclude_patterns(vec!["target/"])      // Directories to skip
    .respect_gitignore(true)                // Honor .gitignore files
    .build()?;
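To make the effect of the size, depth, pattern, and exclusion options concrete, the following std-only sketch shows one way such limits could be applied to a candidate file. `ScanLimits` and `should_index` are hypothetical names. The sketch matches plain extensions and directory names instead of glob patterns and ignores `.gitignore`, so it is a simplification and not loregrep's scanner logic.

```rust
use std::path::Path;

/// Hypothetical scan limits mirroring the builder options above.
pub struct ScanLimits {
    pub max_file_size: u64,
    pub max_depth: usize,
    pub extensions: Vec<&'static str>,
    pub excluded_dirs: Vec<&'static str>,
}

/// Decide whether a file should be indexed. `rel_path` is relative to the
/// repository root; `size` is the file size in bytes.
pub fn should_index(limits: &ScanLimits, rel_path: &Path, size: u64) -> bool {
    // Depth = number of directory components above the file.
    let depth = rel_path.components().count().saturating_sub(1);
    if depth > limits.max_depth || size > limits.max_file_size {
        return false;
    }
    // Skip anything under an excluded directory such as `target`.
    let in_excluded_dir = rel_path
        .parent()
        .into_iter()
        .flat_map(|p| p.components())
        .any(|c| {
            limits
                .excluded_dirs
                .iter()
                .any(|d| c.as_os_str().to_str() == Some(*d))
        });
    if in_excluded_dir {
        return false;
    }
    // Keep only files whose extension is on the allow-list.
    rel_path
        .extension()
        .and_then(|e| e.to_str())
        .map_or(false, |e| limits.extensions.iter().any(|x| *x == e))
}
```

Checking the cheap numeric limits first avoids walking path components for files that would be rejected anyway.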
§Thread Safety
All operations are thread-safe. Multiple threads can query the same LoreGrep instance concurrently. Scanning operations are synchronized to prevent data races.
use std::sync::Arc;
use serde_json::json;
use tokio::task;

let loregrep = Arc::new(loregrep);

// Multiple concurrent queries
let handles: Vec<_> = (0..10).map(|_| {
    let lg = loregrep.clone();
    task::spawn(async move {
        lg.execute_tool("search_functions", json!({"pattern": "test"})).await
    })
}).collect();

// Wait for every query to finish
for handle in handles {
    let _result = handle.await?;
}
§Error Handling
The library uses comprehensive error types for different failure modes:
use loregrep::{LoreGrep, LoreGrepError};

match loregrep.scan("/invalid/path").await {
    Ok(result) => println!("Success: {:?}", result),
    Err(LoreGrepError::Io(e)) => println!("IO error: {}", e),
    Err(LoreGrepError::Parse(e)) => println!("Parse error: {}", e),
    Err(LoreGrepError::Config(e)) => println!("Config error: {}", e),
    Err(e) => println!("Other error: {}", e),
}
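For readers unfamiliar with how an error enum like the one matched above is usually built, here is a std-only sketch. `IndexError`, `read_source`, and `category` are hypothetical. The variant names mirror `Io`, `Parse`, and `Config` from the example, but the payload types and the actual definition of `LoreGrepError` are assumptions and may differ.

```rust
use std::fmt;
use std::io;

/// Hypothetical error type with the three variants matched above.
#[derive(Debug)]
pub enum IndexError {
    Io(io::Error),
    Parse(String),
    Config(String),
}

impl fmt::Display for IndexError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IndexError::Io(e) => write!(f, "io error: {e}"),
            IndexError::Parse(msg) => write!(f, "parse error: {msg}"),
            IndexError::Config(msg) => write!(f, "config error: {msg}"),
        }
    }
}

impl std::error::Error for IndexError {}

// Lets `?` turn an `io::Error` into `IndexError::Io` automatically.
impl From<io::Error> for IndexError {
    fn from(e: io::Error) -> Self {
        IndexError::Io(e)
    }
}

/// Read a source file, reporting an empty file as a parse failure.
pub fn read_source(path: &str) -> Result<String, IndexError> {
    let text = std::fs::read_to_string(path)?;
    if text.trim().is_empty() {
        return Err(IndexError::Parse(format!("{path}: file is empty")));
    }
    Ok(text)
}

/// Map an error to a short category label, as a caller might for logging.
pub fn category(err: &IndexError) -> &'static str {
    match err {
        IndexError::Io(_) => "io",
        IndexError::Parse(_) => "parse",
        IndexError::Config(_) => "config",
    }
}
```

Separate variants let callers react differently to each failure mode, for example by retrying after an I/O error but surfacing a configuration error immediately.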
§Use Cases
- AI Code Assistants: Provide structured code context to LLMs
- Code Search Tools: Fast symbol and pattern searching
- Refactoring Tools: Impact analysis and dependency tracking
- Documentation Generators: Extract API surfaces automatically
- Code Quality Tools: Analyze code patterns and relationships
§Performance Notes
- Indexes are built in memory for fast access
- Scanning is parallelized across CPU cores
- Query results are cached for repeated access
- Memory usage scales linearly with codebase size
- No external dependencies required at runtime
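The note that query results are cached can be illustrated with a small memoization sketch. `QueryCache` is a hypothetical type. The key shape (tool name plus pattern) and the whole-cache invalidation strategy are assumptions, since this page does not describe how loregrep's cache actually works.

```rust
use std::collections::HashMap;

/// Hypothetical query cache keyed by (tool name, pattern).
pub struct QueryCache {
    entries: HashMap<(String, String), Vec<String>>,
    /// Number of times a result had to be computed.
    pub misses: usize,
}

impl QueryCache {
    pub fn new() -> Self {
        QueryCache { entries: HashMap::new(), misses: 0 }
    }

    /// Return the cached result, computing and storing it on first use.
    pub fn get_or_compute<F>(&mut self, tool: &str, pattern: &str, compute: F) -> &[String]
    where
        F: FnOnce() -> Vec<String>,
    {
        let key = (tool.to_string(), pattern.to_string());
        if !self.entries.contains_key(&key) {
            self.misses += 1;
            self.entries.insert(key.clone(), compute());
        }
        &self.entries[&key]
    }

    /// Drop everything, e.g. after a rescan changes the index.
    pub fn invalidate(&mut self) {
        self.entries.clear();
    }
}
```

Any cache of this kind has to be cleared or updated when the index changes, otherwise queries after a rescan would return stale results.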
§Future Roadmap
§Language Support
- TypeScript/JavaScript Analyzers: Support for modern JS/TS features including interfaces, types, and ES6+ syntax
- Go Analyzer: Package declarations, interfaces, and Go-specific function signatures
§Advanced Analysis Features
- Call Graph Analysis: Function call extraction and visualization across files
- Dependency Tracking: Advanced import/export analysis and impact assessment
- Incremental Updates: Smart re-indexing when files change to avoid full rescans
§Performance & Optimization
- Memory Optimization: Improved handling of large repositories with better memory management
- Query Performance: Enhanced caching and lookup optimization for faster results
- Database Persistence: Optional disk-based storage for very large codebases
§Integration & Architecture
- MCP Server Integration: Standard Model Context Protocol interface for tool calling
- Editor Integrations: VS Code, IntelliJ, and other popular editor plugins
- API Enhancements: Additional tools and query capabilities for LLM integration
Re-exports§
pub use crate::core::types::ToolSchema;
pub use crate::core::types::ToolResult;
pub use crate::core::types::ScanResult;
pub use crate::core::errors::LoreGrepError;
pub use crate::core::errors::Result;
pub use python_bindings::*;
Modules§
Structs§
- LoreGrep
- Main LoreGrep API - the primary interface for code analysis
- LoreGrepBuilder
- Main LoreGrep API - the primary interface for code analysis
Constants§
- VERSION
- Current library version