# Aurora Semantic

A local, embedded semantic search engine for source code, designed to be bundled directly inside desktop IDEs. Pure Rust implementation using ONNX Runtime for embedding inference; no Python required.
## Features
- **Hybrid Search** - Combines lexical (keyword) and semantic (AI) search
- **Fast Indexing** - Parallel file processing with progress reporting
- **Persistent Indexes** - Save/reload indexes efficiently
- **Smart Chunking** - Extracts functions, classes, structs by language
- **Ignore Rules** - Respects .gitignore and custom patterns
- **Extensible** - Trait-based design for custom embedders
- **Pure Rust** - No Python, uses ONNX Runtime
- **GPU Support** - Optional CUDA, TensorRT, DirectML, CoreML acceleration
## Quick Start

### Installation

```sh
# Clone and build
cargo build --release

# Install CLI
cargo install --path .
```

### CLI Usage

```sh
# Index a codebase (uses hash embeddings by default)
aurora index /path/to/project

# Index with an ONNX model for semantic search
aurora index /path/to/project --model models/jina-code-1.5b

# Search
aurora search "parse config file"

# Search with options
aurora search "parse config file" --mode hybrid --limit 20

# List indexed workspaces
aurora list

# Show statistics
aurora stats

# Delete a workspace
aurora delete /path/to/project
```
### Library Usage

```rust
use aurora_semantic::{Engine, EngineConfig, SearchQuery};
use std::path::PathBuf;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let engine = Engine::new(EngineConfig::default())?;
    engine.index_workspace(PathBuf::from("/path/to/project"), None).await?;

    let results = engine.search(SearchQuery::new("parse config file")).await?;
    for result in results {
        println!("{:?} ({:.2})", result.path, result.score);
    }
    Ok(())
}
```
## Using ONNX Models
Aurora uses ONNX Runtime for local model inference. You need to download models yourself.
### Recommended Models

| Model | Dimension | Max Length | Use Case |
|---|---|---|---|
| `jina-code-embeddings-1.5b` | 1536 | 32768 | Best for code (15+ languages, task prefixes) |
| `jina-embeddings-v2-base-code` | 768 | 8192 | Code search (30 languages) |
| `all-MiniLM-L6-v2` | 384 | 512 | General text |
### Jina Code Embeddings 1.5B (Recommended)

The `jina-code-embeddings-1.5b` model offers:

- **Task-specific prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, QA
- **Matryoshka dimensions** - truncate to 128, 256, 512, 1024, or 1536
- **32K context window** - handles large code files
- **Last-token pooling** - optimized for the Qwen2.5-Coder backbone
```rust
use aurora_semantic::{JinaCodeEmbedder, JinaTask};

// Load the model
let embedder = JinaCodeEmbedder::from_directory("models/jina-code-1.5b")?
    .with_task(JinaTask::NL2Code)
    .with_dimension(512); // 512-dim for faster search

// Index code with PASSAGE prefix
let code_embedding = embedder.embed_passage("fn parse_config() { /* ... */ }")?;

// Search with QUERY prefix (asymmetric retrieval)
let query_embedding = embedder.embed_query("how do I parse a config file?")?;
```
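Matryoshka truncation can also be done client-side on an already-computed embedding. A minimal sketch (the helper name is hypothetical, not part of Aurora's API): keep the first `k` dimensions, then re-normalize so cosine similarity remains meaningful.

```rust
/// Sketch: truncate a Matryoshka embedding to its first `k` dimensions
/// and re-normalize to unit length. Assumes `k <= full.len()` and a
/// non-zero prefix.
fn truncate_embedding(full: &[f32], k: usize) -> Vec<f32> {
    let head = &full[..k];
    let norm = head.iter().map(|x| x * x).sum::<f32>().sqrt();
    head.iter().map(|x| x / norm).collect()
}
```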
### Task Types

| Task | Query Use Case | Passage Use Case |
|---|---|---|
| `NL2Code` | Natural language query | Code snippets |
| `Code2Code` | Code snippet | Similar code |
| `Code2NL` | Code snippet | Comments/docs |
| `Code2Completion` | Partial code | Completions |
| `QA` | Tech question | Answers |
### Model Directory Structure

```
models/jina-code-1.5b/
├── model.onnx       # ONNX model file
├── tokenizer.json   # HuggingFace tokenizer
└── config.json      # Optional: model config
```
### Downloading Models

Download Jina Code 1.5B:

```sh
# Create directory
mkdir -p models/jina-code-1.5b

# Download from HuggingFace (export with optimum first)
optimum-cli export onnx --model jinaai/jina-code-embeddings-1.5b models/jina-code-1.5b/
```
### Loading Models in Code

```rust
use aurora_semantic::{JinaCodeEmbedder, JinaTask, OnnxEmbedder};

// Generic ONNX model
let embedder = OnnxEmbedder::from_directory("models/all-MiniLM-L6-v2")?;

// Jina Code 1.5B with full configuration
let embedder = JinaCodeEmbedder::from_directory("models/jina-code-1.5b")
    .with_task(JinaTask::NL2Code)
    .with_dimension(512)
    .with_max_length(8192) // Limit context if needed
    .load()?;
```
## Project Structure

```
aurora-semantic/
├── src/
│   ├── lib.rs              # Public API exports
│   ├── types.rs            # Core types (Document, Chunk, SearchResult)
│   ├── config.rs           # Configuration types
│   ├── error.rs            # Error types
│   ├── bin/
│   │   └── aurora.rs       # CLI binary
│   ├── chunker/
│   │   ├── mod.rs          # Code chunking trait + default
│   │   └── strategies.rs   # Chunking strategies
│   ├── embeddings/
│   │   ├── mod.rs          # Embedder trait
│   │   ├── providers.rs    # ONNX + Hash embedders
│   │   └── pooling.rs      # Pooling strategies
│   ├── search/
│   │   ├── mod.rs          # Search coordination
│   │   ├── lexical.rs      # Tantivy-based search
│   │   ├── semantic.rs     # Vector similarity search
│   │   └── query.rs        # Query types + filters
│   ├── storage/
│   │   ├── mod.rs          # Storage trait
│   │   ├── disk.rs         # Disk persistence
│   │   └── metadata.rs     # Workspace metadata
│   ├── ignore/
│   │   └── mod.rs          # File filtering
│   └── engine/
│       └── mod.rs          # Main Engine API
├── Cargo.toml
└── README.md
```
## Configuration

### `EngineConfig`

```rust
use aurora_semantic::{ChunkingConfig, EngineConfig, IgnoreConfig, SearchConfig};

let config = EngineConfig::new()
    .with_chunking(ChunkingConfig::default())
    .with_search(SearchConfig::default())
    .with_ignore(IgnoreConfig::default());
```
**Note:** The `.auroraindex` directory is automatically excluded by default to prevent self-indexing.
## Search Modes
- **Lexical** - Keyword-based search using Tantivy (fast, exact matches)
- **Semantic** - Embedding similarity search (understands meaning)
- **Hybrid** - Combines both with configurable weights (default)
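Conceptually, hybrid search scores each chunk with both methods and blends the normalized scores. A minimal sketch (the weighting scheme is illustrative; Aurora's actual fusion formula may differ):

```rust
/// Cosine similarity between two equal-length, non-zero vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Weighted fusion of normalized lexical and semantic scores.
/// `alpha` is the semantic weight (0.0 = lexical only, 1.0 = semantic only).
fn hybrid_score(lexical: f32, semantic: f32, alpha: f32) -> f32 {
    alpha * semantic + (1.0 - alpha) * lexical
}
```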
```rust
use aurora_semantic::{SearchFilter, SearchMode, SearchQuery};

let query = SearchQuery::new("parse config file")
    .mode(SearchMode::Hybrid)
    .limit(20)
    .min_score(0.5)
    .filter(SearchFilter::default());
```
## Supported Languages
Aurora extracts semantic chunks (functions, classes, etc.) from:
- Rust
- Python
- JavaScript / TypeScript
- Go
- Java
- C / C++
Other languages use generic line-based chunking.
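The generic fallback can be pictured as a sliding window over lines. A sketch with illustrative parameters (the function and its defaults are hypothetical, not Aurora's actual chunker API):

```rust
/// Sketch: split source into overlapping line-window chunks.
/// `window` is the chunk size in lines, `overlap` how many lines
/// consecutive chunks share.
fn chunk_by_lines(source: &str, window: usize, overlap: usize) -> Vec<String> {
    let lines: Vec<&str> = source.lines().collect();
    let step = window.saturating_sub(overlap).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < lines.len() {
        let end = (start + window).min(lines.len());
        chunks.push(lines[start..end].join("\n"));
        if end == lines.len() {
            break;
        }
        start += step;
    }
    chunks
}
```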
## Environment Variables

| Variable | Description | Default |
|---|---|---|
| `AURORA_MODEL_PATH` | Path to ONNX model directory | None |
| `RUST_LOG` | Logging level | `info` |
## Developer Integration

### Implementing a Custom Embedder

```rust
use aurora_semantic::{Embedder, Engine, EngineConfig};

struct MyEmbedder;

impl Embedder for MyEmbedder {
    // implement the trait's embedding methods here
}

// Use it
let engine = Engine::with_embedder(EngineConfig::default(), MyEmbedder)?;
```
### Progress Reporting

```rust
let progress_callback = Box::new(|progress: IndexProgress| {
    println!("indexed {}/{} files", progress.completed, progress.total);
});

engine.index_workspace(&path, Some(progress_callback)).await?;
```
## GPU Acceleration

Aurora supports GPU acceleration via ONNX Runtime execution providers:

| Feature | Platform | Requirements |
|---|---|---|
| `cuda` | NVIDIA GPUs | CUDA 11.x/12.x toolkit |
| `tensorrt` | NVIDIA GPUs | TensorRT 8.x |
| `directml` | Windows (AMD/Intel/NVIDIA) | DirectX 12 |
| `coreml` | macOS (Apple Silicon) | macOS 11+ |

Build with GPU support:

```sh
# NVIDIA CUDA
cargo build --release --features cuda

# Windows DirectML (works with any GPU)
cargo build --release --features directml

# macOS Apple Silicon
cargo build --release --features coreml
```
### Detecting GPU at Runtime

```rust
use aurora_semantic::OnnxEmbedder;

let embedder = OnnxEmbedder::from_directory("models/jina-code-1.5b")?;
let provider_info = embedder.execution_provider_info();
println!("Using: {}", provider_info);
// Output: "Using: CUDA (gpu)" or "Using: CPU (cpu)"
```
The embedder automatically uses GPU when available, falling back to CPU otherwise.
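The fallback behaves like a preference-ordered scan over the providers compiled in (provider names are taken from the table above; this sketch is not the actual ONNX Runtime API):

```rust
/// Sketch: pick the most-preferred available execution provider,
/// falling back to CPU when no accelerator is present.
fn pick_provider(available: &[&str]) -> &'static str {
    for preferred in ["CUDA", "TensorRT", "DirectML", "CoreML"] {
        if available.contains(&preferred) {
            return preferred;
        }
    }
    "CPU"
}
```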
## Performance

- **Indexing**: ~1000 files/second (no embeddings), ~100 files/second (ONNX on CPU), ~500+ files/second (ONNX on GPU)
- **Search**: <10 ms lexical, <50 ms semantic
- **Memory**: ~100 MB base + ~1 KB per indexed chunk
- **Disk**: ~2x source size for a full index
## License
MIT
## Contributing

Contributions welcome! Please read `CONTRIBUTING.md` first.