Grep.app Haystack
Grep.app integration for Terraphim AI, enabling code search across millions of GitHub repositories.
Overview
This crate provides a haystack provider for grep.app, a code search engine by Vercel that indexes millions of public GitHub repositories. It allows you to search for code patterns, functions, and implementations across a massive codebase directly from Terraphim AI.
Features
- Fast Code Search: Search across 500,000+ GitHub repositories
- Language Filtering: Filter results by programming language (Rust, Python, JavaScript, etc.)
- Repository Filtering: Narrow searches to specific repositories (e.g., "tokio-rs/tokio")
- Path Filtering: Search within specific directories
- Rate Limiting: Detects API rate limiting (HTTP 429) and reports it as an error
- Error Handling: Graceful degradation on failures
Installation
Add to your Cargo.toml:
[dependencies]
haystack_grepapp = { path = "../haystack_grepapp" }
haystack_core = { path = "../haystack_core" }
terraphim_types = { path = "../terraphim_types" }
Usage
Basic Search
use haystack_grepapp::GrepAppHaystack;
use haystack_core::HaystackProvider;
use terraphim_types::SearchQuery;
async
Search with Filters
use haystack_grepapp::GrepAppHaystack;
async
Using the Low-Level Client
use ;
async
Configuration
Terraphim Role Configuration
Add grep.app as a haystack in your role configuration:
API Parameters
- query (required): Search query string (max 1000 characters)
- language (optional): Programming language filter (e.g., "Rust", "Python", "JavaScript")
- repo (optional): Repository filter in "owner/repo" format (e.g., "tokio-rs/tokio")
- path (optional): Path filter for directory-specific searches (e.g., "src/")
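The constraints on these parameters can be checked before any request is sent. The following is a minimal sketch, not the crate's actual code: `SearchParams`, its builder methods, and the decision to trim whitespace before checking emptiness are all assumptions made for illustration.

```rust
/// Hypothetical container for the parameters listed above
/// (not a type exported by this crate).
#[derive(Debug, Clone, PartialEq)]
pub struct SearchParams {
    pub query: String,
    pub language: Option<String>,
    pub repo: Option<String>,
    pub path: Option<String>,
}

/// Maximum query length stated in the parameter list.
const MAX_QUERY_CHARS: usize = 1000;

impl SearchParams {
    /// Rejects empty queries and queries longer than 1000 characters.
    /// Trimming whitespace first is an assumption of this sketch.
    pub fn new(query: &str) -> Result<Self, String> {
        let trimmed = query.trim();
        if trimmed.is_empty() {
            return Err("query must not be empty".to_string());
        }
        if trimmed.chars().count() > MAX_QUERY_CHARS {
            return Err(format!("query exceeds {MAX_QUERY_CHARS} characters"));
        }
        Ok(Self {
            query: trimmed.to_string(),
            language: None,
            repo: None,
            path: None,
        })
    }

    /// Sets the optional language filter, e.g. "Rust".
    pub fn with_language(mut self, language: &str) -> Self {
        self.language = Some(language.to_string());
        self
    }

    /// Sets the optional repository filter; expects "owner/repo".
    pub fn with_repo(mut self, repo: &str) -> Result<Self, String> {
        match repo.split_once('/') {
            Some((owner, name))
                if !owner.is_empty() && !name.is_empty() && !name.contains('/') =>
            {
                self.repo = Some(repo.to_string());
                Ok(self)
            }
            _ => Err(format!("repo must be in owner/repo format, got {repo:?}")),
        }
    }

    /// Sets the optional path filter, e.g. "src/".
    pub fn with_path(mut self, path: &str) -> Self {
        self.path = Some(path.to_string());
        self
    }
}
```

With this shape, `SearchParams::new("async fn")?.with_language("Rust").with_repo("tokio-rs/tokio")?` yields a validated set of filters, while an empty query or a repo without a slash is rejected early.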
Response Format
Each search result is converted to a Document with:
- id: Unique identifier (format: repo:branch:path)
- url: GitHub blob URL to the file
- title: Formatted as "repo - filename"
- body: Code snippet with matches (HTML tags stripped)
- description: Human-readable description
- tags: Repository name and filename
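To make the field mapping concrete, here is a rough sketch of how one search hit could be turned into such a document. The `Document` struct below is a stand-in defined only for this example (the real type lives in terraphim_types and may differ), and `hit_to_document`, `strip_html_tags`, and the wording of the description are assumptions. The tag stripper is deliberately naive and does not decode HTML entities.

```rust
/// Stand-in for the real Document type; fields follow the list above.
#[derive(Debug, PartialEq)]
pub struct Document {
    pub id: String,
    pub url: String,
    pub title: String,
    pub body: String,
    pub description: String,
    pub tags: Vec<String>,
}

/// Naive tag stripper: drops everything between '<' and the next '>'.
fn strip_html_tags(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_tag = false;
    for c in input.chars() {
        match c {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out
}

/// Hypothetical conversion of one hit into a Document.
pub fn hit_to_document(repo: &str, branch: &str, path: &str, snippet_html: &str) -> Document {
    // The filename is the last path segment.
    let filename = path.rsplit('/').next().unwrap_or(path);
    Document {
        id: format!("{repo}:{branch}:{path}"),
        url: format!("https://github.com/{repo}/blob/{branch}/{path}"),
        title: format!("{repo} - {filename}"),
        body: strip_html_tags(snippet_html),
        // The exact description text is an assumption of this sketch.
        description: format!("Code from {repo} ({path})"),
        tags: vec![repo.to_string(), filename.to_string()],
    }
}
```

For a hit in "tokio-rs/tokio" on branch "master" at "tokio/src/lib.rs", this produces the id "tokio-rs/tokio:master:tokio/src/lib.rs" and the title "tokio-rs/tokio - lib.rs".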
Error Handling
The client handles various error conditions:
- Rate Limiting (429): Returns error with message "Rate limit exceeded"
- No Results (404): Returns empty vector instead of error
- Network Errors: Propagates with context
- Invalid Queries: Validates query length and emptiness
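The status-code behaviour described above can be summarised in a few lines. This is only a sketch: `SearchError` and `interpret_status` are hypothetical names, and the hits are modelled as plain strings rather than the crate's real result type.

```rust
use std::fmt;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum SearchError {
    /// HTTP 429 from the API.
    RateLimited,
    /// Any other non-success status.
    Http(u16),
}

impl fmt::Display for SearchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SearchError::RateLimited => write!(f, "Rate limit exceeded"),
            SearchError::Http(code) => write!(f, "unexpected HTTP status {code}"),
        }
    }
}

impl std::error::Error for SearchError {}

/// Maps an HTTP status plus already-parsed hits to the outcome
/// described in the list above.
pub fn interpret_status(status: u16, hits: Vec<String>) -> Result<Vec<String>, SearchError> {
    match status {
        200..=299 => Ok(hits),
        // "No results" is treated as an empty vector, not an error.
        404 => Ok(Vec::new()),
        429 => Err(SearchError::RateLimited),
        other => Err(SearchError::Http(other)),
    }
}
```

Treating 404 as an empty result keeps callers from having to distinguish "nothing found" from a genuine failure, while 429 stays visible so they can back off and retry.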
Testing
Run the test suite:
# Run all tests
cargo test
# Run with output
cargo test -- --nocapture
# Run specific test
cargo test <test_name>
Examples
Search for Error Handling Patterns
let haystack = GrepAppHaystack::with_filters(...)?;
let query = SearchQuery { ... };
let documents = haystack.search(...).await?;
Find Specific Function Implementations
let haystack = GrepAppHaystack::with_filters(...)?;
let query = SearchQuery { ... };
let documents = haystack.search(...).await?;
Limitations
- Rate Limits: grep.app enforces rate limits on API requests
- No Authentication: grep.app API currently doesn't require authentication
- Public Repositories Only: Only searches public GitHub repositories
- No Regex Support: Search is text-based, not regex-based (though grep.app may support some patterns)
API Reference
grep.app uses the following API endpoint:
- Endpoint: https://grep.app/api/search
- Method: GET
- Parameters: q, f.lang, f.repo, f.path
- Response: JSON with facets and hits
For more details, see the models.rs file for the complete response structure.
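As an illustration of how the endpoint and its query parameters fit together, the sketch below assembles a GET URL using only the standard library. `build_search_url` and `percent_encode` are hypothetical helpers written for this example; the crate itself may build requests differently, for instance through an HTTP client's query builder.

```rust
/// Endpoint named in the API reference above.
const ENDPOINT: &str = "https://grep.app/api/search";

/// Percent-encodes everything except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Builds the search URL; filters that are None are left out.
pub fn build_search_url(
    query: &str,
    lang: Option<&str>,
    repo: Option<&str>,
    path: Option<&str>,
) -> String {
    let mut url = format!("{ENDPOINT}?q={}", percent_encode(query));
    for (key, value) in [("f.lang", lang), ("f.repo", repo), ("f.path", path)] {
        if let Some(v) = value {
            url.push_str(&format!("&{key}={}", percent_encode(v)));
        }
    }
    url
}
```

Calling `build_search_url("async fn", Some("Rust"), Some("tokio-rs/tokio"), None)` gives a URL with `q`, `f.lang`, and `f.repo` set and no `f.path` parameter.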
Contributing
When extending this crate:
- Add tests for new functionality
- Update this README with new features
- Follow Rust naming conventions (snake_case)
- Use tracing for logging, not println!
License
MIT
Links
- grep.app - Official website
- Terraphim AI - Main project