# Aurora Semantic
A **local, embedded semantic search engine** for source code, designed to be bundled directly inside desktop IDEs.
A pure-Rust implementation using ONNX Runtime for embedding inference; no Python is required.
## Features
- **Hybrid Search** - Combines lexical (keyword) and semantic (AI) search
- **Fast Indexing** - Parallel file processing with progress reporting
- **Persistent Indexes** - Save/reload indexes efficiently
- **Smart Chunking** - Extracts functions, classes, structs by language
- **Ignore Rules** - Respects .gitignore and custom patterns
- **Extensible** - Trait-based design for custom embedders
- **Pure Rust** - No Python, uses ONNX Runtime
- **GPU Support** - Optional CUDA, TensorRT, DirectML, CoreML acceleration
## Quick Start
### Installation
```bash
# Clone and build
git clone https://github.com/aurora-editor/aurora-semantic
cd aurora-semantic
cargo build --release

# Install the CLI
cargo install --path .
```
### CLI Usage
```bash
# Index a codebase (uses hash embeddings by default)
aurora index ./my-project

# Index with an ONNX model for semantic search
aurora --model ./models/jina-code index ./my-project

# Search
aurora search "authentication middleware"

# Search with options
aurora search "error handling" --limit 20 --mode semantic

# List indexed workspaces
aurora list

# Show statistics
aurora stats

# Delete a workspace
aurora delete <workspace-id>
```
### Library Usage
```rust
use aurora_semantic::{Engine, EngineConfig, WorkspaceConfig, SearchQuery, ModelConfig};
use std::path::PathBuf;

#[tokio::main]
async fn main() -> aurora_semantic::Result<()> {
    let config = EngineConfig::new(PathBuf::from(".aurora"));

    // Option 1: Hash embeddings (fast, no model needed)
    // let engine = Engine::new(config)?;

    // Option 2: An ONNX model for semantic search
    let embedder = ModelConfig::from_directory("./models/jina-code")
        .with_max_length(8192)
        .load()?;
    let engine = Engine::with_embedder(config, embedder)?;

    // Index a workspace
    let ws_config = WorkspaceConfig::new(PathBuf::from("./my-project"));
    let workspace_id = engine.index_workspace(ws_config, None).await?;

    // Search
    let results = engine.search_text(&workspace_id, "database connection")?;
    for result in results {
        println!(
            "{}: {} (score: {:.2})",
            result.document.relative_path.display(),
            result.chunk.symbol_name.as_deref().unwrap_or("unknown"),
            result.score
        );
    }

    Ok(())
}
```
## Using ONNX Models
Aurora uses ONNX Runtime for local model inference. You need to download models yourself.
### Recommended Models
| Model | Dims | Max Tokens | Notes |
|-------|------|------------|-------|
| `jina-code-embeddings-1.5b` | 1536 | 32768 | **Best for code** (15+ languages, task prefixes) |
| `jina-embeddings-v2-base-code` | 768 | 8192 | Code search (30 languages) |
| `all-MiniLM-L6-v2` | 384 | 512 | General text |
### Jina Code Embeddings 1.5B (Recommended)
The jina-code-embeddings-1.5b model offers:
- **Task-specific prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, QA
- **Matryoshka dimensions** - truncate to 128, 256, 512, 1024, or 1536
- **32K context window** - handles large code files
- **Last-token pooling** - optimized for the Qwen2.5-Coder backbone
```rust
use aurora_semantic::{JinaCodeEmbedder, EmbeddingTask, MatryoshkaDimension};

// Load the model
let embedder = JinaCodeEmbedder::from_directory("./models/jina-code-1.5b")?
    .with_task(EmbeddingTask::NL2Code)
    .with_dimension(MatryoshkaDimension::D512); // 512-dim for faster search

// Index code with PASSAGE prefix
let code_embedding = embedder.embed_passage("fn parse_json(s: &str) -> Value { serde_json::from_str(s).unwrap() }")?;

// Search with QUERY prefix (asymmetric retrieval)
let query_embedding = embedder.embed_query("function to parse JSON string")?;
```
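Once query and passage vectors exist, ranking is plain vector similarity. A minimal cosine-similarity sketch (the `cosine` helper below is illustrative, not part of the `aurora_semantic` API; if the model already returns unit-normalized vectors, a dot product suffices):
```rust
// Illustrative helper, not part of the aurora_semantic API.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

// Rank the indexed snippet against the natural-language query above.
let score = cosine(&query_embedding, &code_embedding);
println!("similarity: {score:.3}");
```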
#### Task Types
| Task | Query | Retrieves |
|------|-------|-----------|
| `NL2Code` | Natural language query | Code snippets |
| `Code2Code` | Code snippet | Similar code |
| `Code2NL` | Code snippet | Comments/docs |
| `Code2Completion` | Partial code | Completions |
| `QA` | Tech question | Answers |
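Switching tasks only changes the instruction prefix applied before encoding. For example, a `Code2Code` setup for finding similar snippets could look like this (a sketch reusing the builder shown above; the snippets are arbitrary):
```rust
// Sketch: the same embedder API with the Code2Code task,
// e.g. for near-duplicate detection across a codebase.
let embedder = JinaCodeEmbedder::from_directory("./models/jina-code-1.5b")?
    .with_task(EmbeddingTask::Code2Code);

// Both sides are code; query vs. passage still selects the prefix.
let needle = embedder.embed_query("fn add(a: i32, b: i32) -> i32 { a + b }")?;
let haystack = embedder.embed_passage("pub fn sum(x: i32, y: i32) -> i32 { x + y }")?;
```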
### Model Directory Structure
```
models/jina-code-1.5b/
├── model.onnx # ONNX model file
├── tokenizer.json # HuggingFace tokenizer
└── config.json # Optional: model config
```
### Downloading Models
**Download Jina Code 1.5B:**
```bash
# Create the target directory
mkdir -p models/jina-code-1.5b

# Export the HuggingFace model to ONNX with optimum
pip install "optimum[exporters]"
optimum-cli export onnx --model jinaai/jina-code-embeddings-1.5b ./models/jina-code-1.5b
```
### Loading Models in Code
```rust
use aurora_semantic::{OnnxEmbedder, ModelConfig, JinaCodeConfig, EmbeddingTask, MatryoshkaDimension};

// Generic ONNX model
let embedder = OnnxEmbedder::from_directory("./models/jina-v2")?;

// Jina Code 1.5B with full configuration
let embedder = JinaCodeConfig::from_directory("./models/jina-code-1.5b")
    .with_task(EmbeddingTask::NL2Code)
    .with_dimension(MatryoshkaDimension::D1024)
    .with_max_length(8192) // Limit context if needed
    .load()?;
```
## Project Structure
```
aurora-semantic/
├── src/
│ ├── lib.rs # Public API exports
│ ├── types.rs # Core types (Document, Chunk, SearchResult)
│ ├── config.rs # Configuration types
│ ├── error.rs # Error types
│ ├── bin/
│ │ └── aurora.rs # CLI binary
│ ├── chunker/
│ │ ├── mod.rs # Code chunking trait + default
│ │ └── strategies.rs # Chunking strategies
│ ├── embeddings/
│ │ ├── mod.rs # Embedder trait
│ │ ├── providers.rs # ONNX + Hash embedders
│ │ └── pooling.rs # Pooling strategies
│ ├── search/
│ │ ├── mod.rs # Search coordination
│ │ ├── lexical.rs # Tantivy-based search
│ │ ├── semantic.rs # Vector similarity search
│ │ └── query.rs # Query types + filters
│ ├── storage/
│ │ ├── mod.rs # Storage trait
│ │ ├── disk.rs # Disk persistence
│ │ └── metadata.rs # Workspace metadata
│ ├── ignore/
│ │ └── mod.rs # File filtering
│ └── engine/
│ └── mod.rs # Main Engine API
├── Cargo.toml
└── README.md
```
## Configuration
### EngineConfig
```rust
let config = EngineConfig::new(PathBuf::from(".aurora"))
    .with_chunking(ChunkingConfig {
        max_chunk_size: 2000,
        min_chunk_size: 50,
        ..Default::default()
    })
    .with_search(SearchConfig {
        default_mode: SearchMode::Hybrid,
        lexical_weight: 0.4,
        semantic_weight: 0.6,
        min_score: 0.1,
        ..Default::default()
    })
    .with_ignore(
        IgnoreConfig::default()
            // Exclude specific files by path (relative to workspace root)
            .with_excluded_file("src/generated/types.rs")
            .with_excluded_files(vec![
                "src/proto/generated.rs".into(),
                "src/bindings/ffi.rs".into(),
            ])
            // Exclude specific directories by path
            .with_excluded_directory("vendor/third-party")
            .with_excluded_directories(vec![
                "generated".into(),
                "node_modules/@types".into(),
            ])
            // Add extra ignored directory names (matched anywhere in path)
            .with_ignored_directory("my-custom-vendor"),
    );
```
> **Note:** The `.aurora` index directory is automatically excluded by default to prevent self-indexing.
### Search Modes
- **Lexical** - Keyword-based search using Tantivy (fast, exact matches)
- **Semantic** - Embedding similarity search (understands meaning)
- **Hybrid** - Combines both with configurable weights (default)
```rust
use aurora_semantic::{SearchQuery, SearchMode, SearchFilter, Language, ChunkType};

let query = SearchQuery::new("authentication")
    .mode(SearchMode::Hybrid)
    .limit(20)
    .min_score(0.3)
    .filter(
        SearchFilter::new()
            .languages(vec![Language::Rust, Language::TypeScript])
            .chunk_types(vec![ChunkType::Function]),
    );
```
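Conceptually, hybrid ranking mixes the two scores with the configured weights. A simplified sketch of the idea (not Aurora's internal fusion code, which lives in `src/search/`; it assumes both scores are normalized to [0, 1]):
```rust
// Simplified illustration of weighted hybrid ranking; assumes both
// input scores are already normalized to [0, 1].
fn hybrid_score(lexical: f32, semantic: f32, w_lex: f32, w_sem: f32) -> f32 {
    w_lex * lexical + w_sem * semantic
}

// With the SearchConfig weights from above (0.4 lexical / 0.6 semantic):
let score = hybrid_score(0.8, 0.5, 0.4, 0.6); // ≈ 0.62
```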
## Supported Languages
Aurora extracts semantic chunks (functions, classes, etc.) from:
- Rust
- Python
- JavaScript / TypeScript
- Go
- Java
- C / C++
Other languages use generic line-based chunking.
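The fallback is essentially a sliding window over lines. A minimal sketch of the idea (the `chunk_by_lines` helper and its window/overlap parameters are illustrative, not Aurora's actual defaults):
```rust
// Illustrative fixed-window line chunking with overlap; Aurora's real
// fallback lives in src/chunker/ and may differ in detail.
fn chunk_by_lines(source: &str, window: usize, overlap: usize) -> Vec<String> {
    let lines: Vec<&str> = source.lines().collect();
    let step = window.saturating_sub(overlap).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < lines.len() {
        let end = (start + window).min(lines.len());
        chunks.push(lines[start..end].join("\n"));
        if end == lines.len() {
            break;
        }
        start += step;
    }
    chunks
}
```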
## Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `AURORA_MODEL_PATH` | Path to ONNX model directory | None |
| `RUST_LOG` | Logging level | `info` |
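A library caller can mirror the CLI's model resolution. A sketch (the fallback-to-hash-embeddings logic here is an assumption, not the CLI's exact code; `config` is an `EngineConfig` as in Library Usage above):
```rust
use std::env;

// Sketch: honor AURORA_MODEL_PATH when set, otherwise fall back to
// hash embeddings. The exact CLI behavior may differ.
let engine = match env::var("AURORA_MODEL_PATH") {
    Ok(dir) => {
        let embedder = ModelConfig::from_directory(&dir).load()?;
        Engine::with_embedder(config, embedder)?
    }
    Err(_) => Engine::new(config)?,
};
```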
## Developer Integration
### Implementing Custom Embedder
```rust
use aurora_semantic::{Embedder, Result};

struct MyEmbedder {
    dimension: usize,
}

impl Embedder for MyEmbedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>> {
        // Your embedding logic
        Ok(vec![0.0; self.dimension])
    }

    fn embed_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>> {
        texts.iter().map(|t| self.embed(t)).collect()
    }

    fn dimension(&self) -> usize {
        self.dimension
    }

    fn name(&self) -> &'static str {
        "my-embedder"
    }
}

// Use it
let engine = Engine::with_embedder(config, MyEmbedder { dimension: 768 })?;
```
### Progress Reporting
```rust
let progress_callback = Box::new(|progress: IndexProgress| {
    match progress.phase {
        IndexPhase::Scanning => println!("Scanning files..."),
        IndexPhase::Parsing => println!("Parsing {}/{}", progress.processed, progress.total),
        IndexPhase::Embedding => println!("Generating embeddings..."),
        IndexPhase::Complete => println!("Done!"),
        _ => {}
    }
});

engine.index_workspace(ws_config, Some(progress_callback)).await?;
```
## GPU Acceleration
Aurora supports GPU acceleration via ONNX Runtime execution providers:

| Feature | Hardware | Requirements |
|---------|----------|--------------|
| `cuda` | NVIDIA GPUs | CUDA 11.x/12.x toolkit |
| `tensorrt` | NVIDIA GPUs | TensorRT 8.x |
| `directml` | Windows (AMD/Intel/NVIDIA) | DirectX 12 |
| `coreml` | macOS (Apple Silicon) | macOS 11+ |
Build with GPU support:
```bash
# NVIDIA CUDA
cargo build --release --features cuda

# Windows DirectML (works with any GPU)
cargo build --release --features directml

# macOS Apple Silicon
cargo build --release --features coreml
```
### Detecting GPU at Runtime
```rust
use aurora_semantic::{OnnxEmbedder, ExecutionProviderInfo};

let embedder = OnnxEmbedder::from_directory("./models/jina-code")?;
let provider_info = embedder.execution_provider_info();

println!("Using: {} ({})", provider_info.name, provider_info.provider_type);
// Output: "Using: CUDA (gpu)" or "Using: CPU (cpu)"
```
The embedder automatically uses GPU when available, falling back to CPU otherwise.
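If behavior should adapt to the device, the reported provider type can drive it; a small sketch (the `"gpu"`/`"cpu"` strings follow the example output above, and the batch sizes are made up):
```rust
// Sketch: pick a larger embedding batch when a GPU provider is active.
let batch_size = if provider_info.provider_type == "gpu" { 64 } else { 8 };
```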
## Performance
- **Indexing**: ~1000 files/second without embeddings; ~100 files/second with ONNX on CPU
- **Indexing (GPU)**: 500+ files/second with ONNX on GPU
- **Search**: <10 ms lexical, <50 ms semantic
- **Memory**: ~100 MB base + ~1 KB per indexed chunk
- **Disk**: ~2x source size for a full index
## License
MIT
## Contributing
Contributions welcome! Please read CONTRIBUTING.md first.