# Loregrep Python Package
**Fast code analysis tools for AI coding assistants with complete file path tracking**
Loregrep is a high-performance repository indexing library that uses tree-sitter parsing to analyze codebases. It provides 6 standardized tools that supply structured code data to AI systems like Claude, GPT, and other coding assistants.
## ✨ **What's New in v0.4.2: Enhanced User Experience**
**From complex setup to a friendlier developer experience:**
- 🎯 **Real-time feedback**: See analyzer registration and validation immediately
- 📊 **Comprehensive summaries**: Detailed scan results with performance metrics
- 🔧 **Actionable errors**: Clear guidance when something goes wrong
- 🌟 **Professional polish**: Emoji indicators and user-friendly messages
- 📁 **Complete file path tracking**: Every code element includes its originating file
### Before vs After
```python
# Before: Silent, unclear feedback
lg = loregrep.LoreGrep.builder().build()

# After: Rich, helpful feedback
lg = (loregrep.LoreGrep.builder()
      .with_rust_analyzer()      # ✅ Rust analyzer registered successfully
      .with_python_analyzer()    # ✅ Python analyzer registered successfully
      .build())                  # 🎆 LoreGrep configured with 2 languages

# Enhanced scan with comprehensive feedback
result = await lg.scan("./my-project")
# 🔍 Starting scan... 📁 Found X files... 📊 Summary with metrics
```
## 🎯 **Enhanced with File Path Storage for Better AI Integration**
**Every code element now includes its originating file path** - functions, structs, imports, exports, and function calls all include complete file location information for superior cross-file reference tracking.
```python
# Example: Functions now include file paths
functions = await lg.execute_tool("search_functions", {"pattern": "config"})
# Returns: {
#   "functions": [
#     {
#       "name": "parse_config",
#       "file_path": "src/config.py",  # ← Always included
#       "start_line": 45,
#       "signature": "def parse_config(path: str) -> Config"
#     }
#   ]
# }
```
**Why This Matters for AI Assistants:**
- **Cross-file Navigation**: AI can track where functions are defined vs. called
- **Impact Analysis**: Understand which files are affected by changes
- **Module Awareness**: AI knows the context and organization of your code
- **Refactoring Safety**: Track all references across the entire codebase
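As a rough illustration of the cross-file navigation point, the sketch below groups a `search_functions` response by `file_path`. It assumes the response has already been parsed into a dict with the shape shown in the example above. `group_by_file` is a hypothetical helper, not part of Loregrep, and the sample data (including `merge_config`) is made up for the demonstration.

```python
from collections import defaultdict


def group_by_file(response: dict) -> dict:
    """Map each file path to the names of the functions found in it."""
    grouped = defaultdict(list)
    for func in response.get("functions", []):
        grouped[func["file_path"]].append(func["name"])
    return dict(grouped)


# Illustrative data mirroring the documented response shape (not real output).
sample = {
    "functions": [
        {"name": "parse_config", "file_path": "src/config.py", "start_line": 45},
        {"name": "load_config", "file_path": "src/utils/loader.py", "start_line": 12},
        {"name": "merge_config", "file_path": "src/config.py", "start_line": 80},
    ]
}

grouped = group_by_file(sample)
print(grouped)
```

Because every record carries its own path, this grouping needs no second lookup to find where a function lives.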
## Quick Start
### Installation
```bash
pip install loregrep
```
### Basic Usage
```python
import asyncio
import loregrep

async def analyze_repository():
    # Easiest way: Zero-configuration auto-discovery (Recommended)
    lg = loregrep.LoreGrep.auto_discover(".")
    # 🔍 Detected project languages: rust, python
    # ✅ Rust analyzer registered successfully
    # ✅ Python analyzer registered successfully

    # Alternative: Enhanced builder for fine control
    # lg = (loregrep.LoreGrep.builder()
    #       .with_rust_analyzer()           # ✅ Real-time feedback
    #       .with_python_analyzer()         # ✅ Registration confirmation
    #       .optimize_for_performance()     # 🚀 Speed-optimized preset
    #       .exclude_test_dirs()            # 🚫 Skip test directories
    #       .max_file_size(1024 * 1024)     # 1MB limit
    #       .build())                       # 🎆 Configuration summary

    # Alternative: Project-specific presets
    # lg = loregrep.LoreGrep.rust_project(".")      # Rust-optimized
    # lg = loregrep.LoreGrep.python_project(".")    # Python-optimized
    # lg = loregrep.LoreGrep.polyglot_project(".")  # Multi-language

    # Scan your repository with enhanced feedback
    result = await lg.scan("/path/to/your/project")
    # 🔍 Starting repository scan... 📁 Found X files... 📊 Scan Summary
    print(f"📁 Scanned {result.files_scanned} files")
    print(f"🔧 Found {result.functions_found} functions")
    print(f"📦 Found {result.structs_found} structures")

    # Search for functions
    functions = await lg.execute_tool("search_functions", {
        "pattern": "auth",
        "limit": 10
    })
    print("🔍 Authentication functions:")
    print(functions.content)

    # Get repository overview
    overview = await lg.execute_tool("get_repository_tree", {
        "include_file_details": True,
        "max_depth": 2
    })
    print("🌳 Repository structure:")
    print(overview.content)

# Run the analysis
asyncio.run(analyze_repository())
```
## AI Integration
Loregrep provides 6 standardized tools that supply structured code data to AI coding assistants:
### Available Tools
```python
# Get all available tools
tools = loregrep.LoreGrep.get_tool_definitions()
for tool in tools:
    print(f"🛠️ {tool.name}: {tool.description}")
```
#### 1. **search_functions** - Find functions by pattern (with file paths)
```python
result = await lg.execute_tool("search_functions", {
    "pattern": "config",
    "limit": 20
})
# Example response with file path information:
# {
#   "functions": [
#     {
#       "name": "parse_config",
#       "file_path": "src/config.py",
#       "start_line": 45,
#       "signature": "def parse_config(path: str) -> Config",
#       "is_async": False
#     },
#     {
#       "name": "load_config",
#       "file_path": "src/utils/loader.py",  # Different file!
#       "start_line": 12,
#       "signature": "async def load_config() -> Config",
#       "is_async": True
#     }
#   ]
# }
```
#### 2. **search_structs** - Find classes/structures by pattern
```python
result = await lg.execute_tool("search_structs", {
    "pattern": "User",
    "limit": 10
})
```
#### 3. **analyze_file** - Detailed analysis of specific files
```python
result = await lg.execute_tool("analyze_file", {
    "file_path": "src/main.py",
    "include_source": False
})
```
#### 4. **get_dependencies** - Find imports and exports
```python
result = await lg.execute_tool("get_dependencies", {
    "file_path": "src/utils.py"
})
```
#### 5. **find_callers** - Locate function call sites (cross-file tracking)
```python
result = await lg.execute_tool("find_callers", {
    "function_name": "authenticate_user"
})
# Example response showing calls across multiple files:
# {
#   "callers": [
#     {
#       "function_name": "authenticate_user",
#       "file_path": "src/api/auth.py",         # Called from API module
#       "line_number": 23,
#       "caller_function": "login_endpoint"
#     },
#     {
#       "function_name": "authenticate_user",
#       "file_path": "src/middleware/auth.py",  # Also called from middleware
#       "line_number": 45,
#       "caller_function": "auth_middleware"
#     },
#     {
#       "function_name": "authenticate_user",
#       "file_path": "tests/test_auth.py",      # And from tests
#       "line_number": 67,
#       "caller_function": "test_valid_user"
#     }
#   ]
# }
```
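One way to use the caller list above is to separate production call sites from test call sites before judging how risky a change is. The sketch below assumes the response has been parsed into a dict with the shape shown above and that test files live under a `tests/` prefix. `split_callers` and its `test_prefix` parameter are hypothetical, not part of Loregrep.

```python
def split_callers(response: dict, test_prefix: str = "tests/") -> tuple:
    """Return (production_files, test_files) containing calls to the function."""
    production, tests = set(), set()
    for caller in response.get("callers", []):
        path = caller["file_path"]
        # Assumption: anything under `test_prefix` counts as test code.
        (tests if path.startswith(test_prefix) else production).add(path)
    return sorted(production), sorted(tests)


# Sample data copied from the example response above.
sample_callers = {
    "callers": [
        {"function_name": "authenticate_user", "file_path": "src/api/auth.py",
         "line_number": 23, "caller_function": "login_endpoint"},
        {"function_name": "authenticate_user", "file_path": "src/middleware/auth.py",
         "line_number": 45, "caller_function": "auth_middleware"},
        {"function_name": "authenticate_user", "file_path": "tests/test_auth.py",
         "line_number": 67, "caller_function": "test_valid_user"},
    ]
}

prod_files, test_files = split_callers(sample_callers)
print("production:", prod_files)
print("tests:", test_files)
```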
#### 6. **get_repository_tree** - Repository structure overview
```python
result = await lg.execute_tool("get_repository_tree", {
    "include_file_details": True,
    "max_depth": 3
})
```
## Configuration Options
### Enhanced Builder Pattern with Convenience Methods
```python
# Performance-optimized configuration
fast_loregrep = (loregrep.LoreGrep.builder()
                 .with_rust_analyzer()         # ✅ Analyzer registration feedback
                 .optimize_for_performance()   # 🚀 512KB limit, depth 8, skip binaries
                 .exclude_test_dirs()          # 🚫 Skip test directories
                 .exclude_vendor_dirs()        # 🚫 Skip vendor/dependencies
                 .build())                     # 🎆 Configuration summary

# Comprehensive analysis configuration
thorough_loregrep = (loregrep.LoreGrep.builder()
                     .with_all_analyzers()       # ✅ All available language analyzers
                     .comprehensive_analysis()   # 🔍 5MB limit, depth 20, more file types
                     .include_config_files()     # ✅ Include TOML, JSON, YAML configs
                     .build())

# Traditional manual configuration (still supported)
manual_loregrep = (loregrep.LoreGrep.builder()
                   .max_file_size(2 * 1024 * 1024)   # 2MB file size limit
                   .max_depth(15)                    # Max directory depth
                   .file_patterns(["*.py", "*.js", "*.ts", "*.rs"])
                   .exclude_patterns([
                       "node_modules/", "__pycache__/", "target/",
                       ".git/", "venv/", ".env/"
                   ])
                   .respect_gitignore(True)          # Honor .gitignore files
                   .build())
```
### Scan Results
```python
result = await lg.scan("/path/to/repo")
# Access scan statistics
print(f"Files scanned: {result.files_scanned}")
print(f"Functions found: {result.functions_found}")
print(f"Structs found: {result.structs_found}")
print(f"Duration: {result.duration_ms}ms")
print(f"Errors: {result.errors}") # List of any scan errors
```
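The statistics above can be folded into a single log line. The sketch below is a hypothetical `summarize_scan` helper that reads only the attributes listed above. It assumes `duration_ms` is numeric and `errors` is a list, and it uses a `SimpleNamespace` with invented numbers in place of a real scan result so that it runs without a repository.

```python
from types import SimpleNamespace


def summarize_scan(result) -> str:
    """Build a one-line summary from the documented scan result attributes."""
    seconds = result.duration_ms / 1000
    # Guard against a zero duration on very small repositories.
    rate = result.files_scanned / seconds if seconds > 0 else 0.0
    status = "ok" if not result.errors else f"{len(result.errors)} error(s)"
    return (
        f"{result.files_scanned} files, {result.functions_found} functions, "
        f"{result.structs_found} structs in {seconds:.2f}s "
        f"({rate:.0f} files/s, {status})"
    )


# Stand-in for a real scan result; the numbers are invented.
fake_result = SimpleNamespace(
    files_scanned=120, functions_found=840, structs_found=95,
    duration_ms=500, errors=[],
)

summary = summarize_scan(fake_result)
print(summary)
```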
## How It Works
### Core Technology
- **Tree-sitter parsing**: Fast, accurate syntax analysis (not AI)
- **In-memory indexing**: Quick lookups and search (not AI)
- **Structured data extraction**: Functions, classes, imports, etc. (not AI)
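To make the parse, extract, and index pipeline concrete, here is a toy version. It is not Loregrep's implementation: it uses Python's standard `ast` module as a stand-in for tree-sitter, handles Python source only, and treats a pattern as a plain substring match, which is an assumption. `extract_functions` and `FunctionIndex` are hypothetical names.

```python
import ast
from collections import defaultdict


def extract_functions(source: str, file_path: str) -> list:
    """Parse Python source and return one record per function definition."""
    records = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            records.append({
                "name": node.name,
                "file_path": file_path,
                "start_line": node.lineno,
                "is_async": isinstance(node, ast.AsyncFunctionDef),
            })
    return records


class FunctionIndex:
    """In-memory index from function name to the records that define it."""

    def __init__(self):
        self._by_name = defaultdict(list)

    def add_file(self, source: str, file_path: str) -> None:
        for record in extract_functions(source, file_path):
            self._by_name[record["name"]].append(record)

    def search(self, pattern: str) -> list:
        """Return records whose function name contains `pattern`."""
        return [
            record
            for name, records in self._by_name.items()
            if pattern in name
            for record in records
        ]


index = FunctionIndex()
index.add_file("def parse_config(path):\n    return path\n", "src/config.py")
index.add_file("async def load_config():\n    return None\n", "src/utils/loader.py")

matches = index.search("config")
for match in matches:
    print(match["file_path"], match["name"], match["start_line"])
```

Parsing happens once per file. Searches then only touch the in-memory dictionary, which is why lookups stay fast after the initial scan.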
### AI Integration
- **Tool interface**: 6 standardized tools that provide data **to** AI systems
- **AI assistants**: Use the structured data to answer questions about your code
- **LLM compatibility**: Works with Claude (other integrations planned)
*Loregrep does the fast parsing and indexing - AI systems use that data to understand your code.*
## Language Support
| Language | Status | | | |
|----------|--------|---|---|---|
| Rust | ✅ Full | ✅ | ✅ | ✅ |
| Python | ✅ Full | ✅ | ✅ | ✅ |
| TypeScript | 🚧 Planned | - | - | - |
| JavaScript | 🚧 Planned | - | - | - |
*Additional language support coming soon*
## Error Handling
```python
try:
    result = await lg.scan("/invalid/path")
except OSError as e:
    print(f"Path error: {e}")
except RuntimeError as e:
    print(f"Analysis error: {e}")
except ValueError as e:
    print(f"Configuration error: {e}")
```
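If several call sites need the same handling, the three `except` branches above can be wrapped once. The sketch below is a hypothetical `safe_scan` helper that relies only on the `scan` coroutine and the exception types shown above. `_FailingScanner` is a stand-in that imitates a bad path so the example runs without Loregrep installed.

```python
import asyncio


async def safe_scan(lg, path: str):
    """Return (result, error_message); exactly one of the two is None."""
    try:
        return await lg.scan(path), None
    except OSError as e:
        return None, f"Path error: {e}"
    except RuntimeError as e:
        return None, f"Analysis error: {e}"
    except ValueError as e:
        return None, f"Configuration error: {e}"


class _FailingScanner:
    """Stand-in for a LoreGrep instance whose scan always hits a bad path."""

    async def scan(self, path):
        raise OSError(f"no such directory: {path}")


result, error = asyncio.run(safe_scan(_FailingScanner(), "/invalid/path"))
print(error)
```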
## Async and Threading
Loregrep is fully async and thread-safe:
```python
import asyncio
import loregrep

# Multiple concurrent operations
async def parallel_analysis():
    lg1 = loregrep.LoreGrep.auto_discover("/project1")
    lg2 = loregrep.LoreGrep.auto_discover("/project2")

    # Concurrent scanning
    results = await asyncio.gather(
        lg1.scan("/project1"),
        lg2.scan("/project2")
    )

    # Concurrent tool execution
    analyses = await asyncio.gather(
        lg1.execute_tool("search_functions", {"pattern": "api"}),
        lg2.execute_tool("get_repository_tree", {"max_depth": 2})
    )
```
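When an assistant issues many tool calls at once, it can help to cap how many run at the same time. The sketch below does that with `asyncio.Semaphore`. It assumes a single instance tolerates concurrent `execute_tool` calls, which the thread-safety statement above suggests but this sketch does not verify. `run_tools_bounded` and `_EchoTools` are hypothetical, and `_EchoTools` stands in for a real instance.

```python
import asyncio


async def run_tools_bounded(lg, calls, max_concurrent: int = 4):
    """Run (tool_name, args) pairs, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(name, args):
        async with semaphore:
            return await lg.execute_tool(name, args)

    # gather() returns results in the same order as `calls`.
    return await asyncio.gather(*(run_one(name, args) for name, args in calls))


class _EchoTools:
    """Stand-in for a LoreGrep instance; echoes the request it received."""

    async def execute_tool(self, name, args):
        await asyncio.sleep(0)  # yield control, as a real async call would
        return f"{name}:{sorted(args)}"


calls = [
    ("search_functions", {"pattern": "api"}),
    ("get_repository_tree", {"max_depth": 2}),
]
outputs = asyncio.run(run_tools_bounded(_EchoTools(), calls, max_concurrent=1))
print(outputs)
```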
## Integration with AI Assistants
### Claude/OpenAI Integration
```python
# Get tool schemas for AI systems
tools = loregrep.LoreGrep.get_tool_definitions()
# Send to Claude/OpenAI as available tools
# When AI calls a tool, execute it:
result = await lg.execute_tool(tool_name, tool_args)
# Send result back to AI (send_to_ai is a placeholder for your own client code)
ai_response = send_to_ai(result.content)
```
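The execute step in the loop above can be wrapped so that an unexpected tool name or a failed call turns into a message for the model instead of an exception. The sketch below is one possible shape. `handle_tool_call`, the `ok`/`content` reply format, and `_StubLoreGrep` are assumptions made for illustration. Only the six tool names, `execute_tool`, and `result.content` come from this README.

```python
import asyncio
from types import SimpleNamespace

# The six tool names documented in this README.
KNOWN_TOOLS = {
    "search_functions", "search_structs", "analyze_file",
    "get_dependencies", "find_callers", "get_repository_tree",
}


async def handle_tool_call(lg, tool_name: str, tool_args: dict) -> dict:
    """Execute one AI-requested tool call and wrap the outcome for the reply."""
    if tool_name not in KNOWN_TOOLS:
        return {"ok": False, "content": f"Unknown tool: {tool_name}"}
    try:
        result = await lg.execute_tool(tool_name, tool_args)
    except (OSError, RuntimeError, ValueError) as e:
        return {"ok": False, "content": f"Tool failed: {e}"}
    return {"ok": True, "content": result.content}


class _StubLoreGrep:
    """Stand-in for a scanned LoreGrep instance; returns an empty result."""

    async def execute_tool(self, name, args):
        return SimpleNamespace(content='{"functions": []}')


ok_reply = asyncio.run(
    handle_tool_call(_StubLoreGrep(), "search_functions", {"pattern": "auth"})
)
bad_reply = asyncio.run(handle_tool_call(_StubLoreGrep(), "delete_repo", {}))
print(ok_reply)
print(bad_reply)
```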
### Example: Enhanced Code Analysis Bot with File Path Awareness
```python
import json
import loregrep

async def code_analysis_bot(user_question: str, repo_path: str):
    lg = loregrep.LoreGrep.auto_discover(repo_path)
    await lg.scan(repo_path)

    if "functions" in user_question.lower():
        result = await lg.execute_tool("search_functions", {
            "pattern": extract_pattern(user_question),
            "limit": 10
        })
        # AI can now understand file organization
        response_data = json.loads(result.content)
        functions_by_file = {}
        for func in response_data.get("functions", []):
            file_path = func["file_path"]
            if file_path not in functions_by_file:
                functions_by_file[file_path] = []
            functions_by_file[file_path].append(func["name"])
        return f"Found functions across {len(functions_by_file)} files: {functions_by_file}"

    elif "impact" in user_question.lower():
        # Find potential impact of changing a function
        function_name = extract_function_name(user_question)

        # Find where it's defined
        definitions = await lg.execute_tool("search_functions", {
            "pattern": function_name,
            "limit": 1
        })
        # Find where it's called
        callers = await lg.execute_tool("find_callers", {
            "function_name": function_name
        })

        # Analyze impact across files
        def_data = json.loads(definitions.content)
        call_data = json.loads(callers.content)
        affected_files = set()
        if def_data.get("functions"):
            affected_files.add(def_data["functions"][0]["file_path"])
        for caller in call_data.get("callers", []):
            affected_files.add(caller["file_path"])
        return f"Changing '{function_name}' would affect {len(affected_files)} files: {sorted(affected_files)}"

    elif "structure" in user_question.lower():
        result = await lg.execute_tool("get_repository_tree", {
            "include_file_details": True,
            "max_depth": 2
        })
        return f"Repository structure: {result.content}"

    return "I can help analyze functions, impact, or structure. Please be more specific!"

# Helper functions (placeholders: supply your own implementations)
def extract_pattern(question: str) -> str:
    # Extract search pattern from user question
    # Implementation depends on your NLP approach
    raise NotImplementedError("extract_pattern is left to the integrator")

def extract_function_name(question: str) -> str:
    # Extract function name from user question
    # Implementation depends on your NLP approach
    raise NotImplementedError("extract_function_name is left to the integrator")
```
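The two helpers above are left unimplemented. For quick experiments, a regex-based stand-in may be enough. The sketch below is deliberately naive and only one of many possible approaches: it looks for an identifier in backticks or one followed by `()`, and otherwise falls back to the last word that is not in a small, hand-picked stopword list. The stopword list and the fallback rule are assumptions.

```python
import re

# Small, hand-picked stopword list (an assumption; extend as needed).
_STOPWORDS = {
    "what", "which", "where", "find", "show", "list", "me", "all", "the",
    "a", "an", "of", "for", "in", "to", "is", "are", "like",
    "function", "functions", "impact", "changing",
}


def extract_function_name(question: str) -> str:
    """Return an identifier written in backticks or followed by '()', else ''."""
    match = re.search(r"`([A-Za-z_]\w*)`|\b([A-Za-z_]\w*)\(\)", question)
    if not match:
        return ""
    return match.group(1) or match.group(2)


def extract_pattern(question: str) -> str:
    """Prefer an explicit identifier, else the last non-stopword word."""
    name = extract_function_name(question)
    if name:
        return name
    words = re.findall(r"[A-Za-z_]\w*", question.lower())
    for word in reversed(words):
        if word not in _STOPWORDS:
            return word
    return ""


print(extract_function_name("What is the impact of changing `authenticate_user`?"))
print(extract_pattern("Which functions handle auth?"))
```

Both helpers return an empty string when nothing matches, so the caller can ask the user for a more specific question instead of searching with an empty pattern.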
## Requirements
- Python 3.7+
- No external dependencies (uses native Rust extensions)
## Examples
See the [examples directory](https://github.com/your-repo/loregrep/tree/main/python/examples) for complete working examples:
- [`basic_usage.py`](https://github.com/your-repo/loregrep/blob/main/python/examples/basic_usage.py) - Complete workflow demonstration
- [`test_bindings.py`](https://github.com/your-repo/loregrep/blob/main/python/examples/test_bindings.py) - Comprehensive test suite
## License
Licensed under either of Apache License, Version 2.0 or MIT license at your option.
## Contributing
Contributions are welcome! Please see our [contribution guidelines](https://github.com/your-repo/loregrep/blob/main/CONTRIBUTING.md).