# Aurora Semantic
A local, embedded semantic search engine for source code, designed to be bundled directly inside desktop IDEs.
Pure Rust implementation using ONNX Runtime for embedding inference - no Python required.
## Features
- 🔍 **Hybrid Search** - Combines lexical (keyword) and semantic (AI) search
- 🚀 **Fast Indexing** - Parallel file processing with progress reporting
- 💾 **Persistent Indexes** - Save and reload indexes efficiently
- 🎯 **Smart Chunking** - Extracts functions, classes, and structs by language
- 📁 **Ignore Rules** - Respects `.gitignore` and custom patterns
- 🔌 **Extensible** - Trait-based design for custom embedders
- 🦀 **Pure Rust** - No Python; uses ONNX Runtime
## Quick Start
### Installation

```bash
# Clone and build
git clone <repository-url>
cd aurora-semantic
cargo build --release

# Install CLI
cargo install --path .
```
### CLI Usage

Subcommand names below follow the comments in this section; check the installed binary's `--help` output for the exact flags.

```bash
# Index a codebase (uses hash embeddings by default)
aurora index /path/to/project

# Index with an ONNX model for semantic search
aurora index /path/to/project --model models/jina-code

# Search
aurora search "connection pool"

# Search with options
aurora search "connection pool" --mode hybrid --limit 10

# List indexed workspaces
aurora list

# Show statistics
aurora stats

# Delete a workspace
aurora delete /path/to/project
```
### Library Usage

The API names below are illustrative; the crate's actual exports may differ.

```rust
use aurora_semantic::Engine;
use std::path::PathBuf;

async fn index_and_search() -> anyhow::Result<()> {
    let mut engine = Engine::default();
    engine.index_workspace(PathBuf::from("."), None).await?;
    let _results = engine.search("parse config").await?;
    Ok(())
}
```
## Using ONNX Models
Aurora uses ONNX Runtime for local model inference. You need to download models yourself.
### Recommended Models
| Model | Dimension | Use Case | Download |
|---|---|---|---|
| `jina-embeddings-v2-base-code` | 768 | Code search (30 languages) | HuggingFace |
| `all-MiniLM-L6-v2` | 384 | General text | HuggingFace |
| `bge-small-en-v1.5` | 384 | English text | HuggingFace |
### Model Directory Structure

```text
models/jina-code/
├── model.onnx       # ONNX model file
├── tokenizer.json   # HuggingFace tokenizer
└── config.json      # Optional: model config
```
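Before loading, you can sanity-check that a directory has this layout. The helper below is an illustrative sketch, not part of Aurora's API:

```rust
use std::fs;
use std::path::Path;

// Illustrative helper (not part of Aurora's API): verify a model directory
// contains the files Aurora expects before attempting to load it.
fn looks_like_model_dir(dir: &Path) -> bool {
    // model.onnx and tokenizer.json are required; config.json is optional.
    dir.join("model.onnx").exists() && dir.join("tokenizer.json").exists()
}

fn main() {
    let dir = std::env::temp_dir().join("aurora-model-check");
    fs::create_dir_all(&dir).unwrap();
    fs::write(dir.join("model.onnx"), b"").unwrap();
    fs::write(dir.join("tokenizer.json"), b"").unwrap();
    assert!(looks_like_model_dir(&dir));
    println!("model directory looks valid");
}
```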
### Downloading Models

**Option 1:** Download pre-exported ONNX models directly (the exact filenames depend on the model repository):

```bash
# Jina Code model (recommended for code search)
huggingface-cli download jinaai/jina-embeddings-v2-base-code --local-dir models/jina-code
```

**Option 2:** Export with `optimum`:

```bash
pip install "optimum[exporters]"

# Export Jina Code model
optimum-cli export onnx --model jinaai/jina-embeddings-v2-base-code --trust-remote-code models/jina-code/

# Export sentence-transformers model
optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 models/minilm/
```
### Loading Models in Code

Assuming the ONNX embedder type is named `OnnxEmbedder` (adjust to the crate's actual exports):

```rust
use aurora_semantic::OnnxEmbedder;

// Simple: load from directory
let embedder = OnnxEmbedder::from_directory("models/jina-code")?;

// Advanced: custom settings
let embedder = OnnxEmbedder::from_directory("models/jina-code")
    .with_max_length(8192) // Jina supports up to 8192 tokens
    .with_dimension(768)
    .load()?;
```
## Project Structure

```text
aurora-semantic/
├── src/
│   ├── lib.rs              # Public API exports
│   ├── types.rs            # Core types (Document, Chunk, SearchResult)
│   ├── config.rs           # Configuration types
│   ├── error.rs            # Error types
│   ├── bin/
│   │   └── aurora.rs       # CLI binary
│   ├── chunker/
│   │   ├── mod.rs          # Code chunking trait + default
│   │   └── strategies.rs   # Chunking strategies
│   ├── embeddings/
│   │   ├── mod.rs          # Embedder trait
│   │   ├── providers.rs    # ONNX + Hash embedders
│   │   └── pooling.rs      # Pooling strategies
│   ├── search/
│   │   ├── mod.rs          # Search coordination
│   │   ├── lexical.rs      # Tantivy-based search
│   │   ├── semantic.rs     # Vector similarity search
│   │   └── query.rs        # Query types + filters
│   ├── storage/
│   │   ├── mod.rs          # Storage trait
│   │   ├── disk.rs         # Disk persistence
│   │   └── metadata.rs     # Workspace metadata
│   ├── ignore/
│   │   └── mod.rs          # File filtering
│   └── engine/
│       └── mod.rs          # Main Engine API
├── Cargo.toml
└── README.md
```
## Configuration
### EngineConfig

The builder methods below match this section's original names; the argument types (`ChunkingConfig`, `SearchConfig`, `IgnoreConfig`) are assumptions:

```rust
use aurora_semantic::{ChunkingConfig, EngineConfig, IgnoreConfig, SearchConfig};

let config = EngineConfig::new()
    .with_chunking(ChunkingConfig::default())
    .with_search(SearchConfig::default())
    .with_ignore(IgnoreConfig::default());
```
### Search Modes
- Lexical - Keyword-based search using Tantivy (fast, exact matches)
- Semantic - Embedding similarity search (understands meaning)
- Hybrid - Combines both with configurable weights (default)
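In hybrid mode, the weights trade keyword precision against semantic recall. The sketch below shows how two scores can be blended; the function names and the 50/50 default are assumptions, not Aurora's actual internals:

```rust
// Cosine similarity between two embedding vectors (semantic score).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Blend a lexical and a semantic score with a configurable weight.
fn hybrid_score(lexical: f32, semantic: f32, lexical_weight: f32) -> f32 {
    lexical_weight * lexical + (1.0 - lexical_weight) * semantic
}

fn main() {
    let sem = cosine_similarity(&[1.0, 0.0], &[1.0, 0.0]);
    assert!((sem - 1.0).abs() < 1e-6);
    let score = hybrid_score(0.8, sem, 0.5);
    assert!((score - 0.9).abs() < 1e-6);
    println!("hybrid score: {score}");
}
```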
The type names below (`SearchQuery`, `SearchMode`) are assumptions based on `search/query.rs`:

```rust
use aurora_semantic::{SearchMode, SearchQuery};

let query = SearchQuery::new("parse config file")
    .mode(SearchMode::Hybrid)
    .limit(10)
    .min_score(0.3)
    .filter(my_filter); // e.g. restrict by path or language
```
## Supported Languages
Aurora extracts semantic chunks (functions, classes, etc.) from:
- Rust
- Python
- JavaScript / TypeScript
- Go
- Java
- C / C++
Other languages use generic line-based chunking.
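The fallback can be pictured as fixed-size line windows with a small overlap so context is not cut mid-thought. This is an illustrative sketch, not Aurora's actual chunker:

```rust
// Illustrative fallback chunker (not Aurora's implementation): split text
// into windows of `chunk_lines` lines, overlapping by `overlap` lines.
fn line_chunks(text: &str, chunk_lines: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_lines);
    let lines: Vec<&str> = text.lines().collect();
    let step = chunk_lines - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < lines.len() {
        let end = (start + chunk_lines).min(lines.len());
        chunks.push(lines[start..end].join("\n"));
        if end == lines.len() {
            break;
        }
        start += step;
    }
    chunks
}

fn main() {
    let text = (1..=10).map(|i| format!("line {i}")).collect::<Vec<_>>().join("\n");
    let chunks = line_chunks(&text, 4, 1);
    assert_eq!(chunks.len(), 3); // windows: 1-4, 4-7, 7-10
    println!("{} chunks", chunks.len());
}
```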
## Environment Variables
| Variable | Description | Default |
|---|---|---|
| `AURORA_MODEL_PATH` | Path to ONNX model directory | None |
| `RUST_LOG` | Logging level | `info` |
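Reading these variables is plain `std::env`; nothing Aurora-specific is needed:

```rust
use std::env;

// Read a variable with a fallback default.
fn var_or(name: &str, default: &str) -> String {
    env::var(name).unwrap_or_else(|_| default.to_string())
}

fn main() {
    // AURORA_MODEL_PATH has no default; when unset, hash embeddings are used.
    let model_path = env::var("AURORA_MODEL_PATH").ok();
    // RUST_LOG falls back to "info".
    let log_level = var_or("RUST_LOG", "info");
    println!("model: {model_path:?}, log level: {log_level}");
}
```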
## Developer Integration
### Implementing a Custom Embedder

A sketch of plugging in your own embedder; the `Embedder` trait's method signatures live in `src/embeddings/mod.rs`:

```rust
use aurora_semantic::Embedder;

struct MyEmbedder;

// impl Embedder for MyEmbedder { ... }

// Use it
let engine = Engine::with_embedder(Box::new(MyEmbedder))?;
```
### Progress Reporting

The callback shape below is illustrative; the progress type and its field names are assumptions:

```rust
let progress_callback = Box::new(|progress: IndexProgress| {
    println!("indexed {}/{} files", progress.completed, progress.total);
});

engine.index_workspace(path, Some(progress_callback)).await?;
```
### GPU Acceleration
Aurora supports GPU acceleration via ONNX Runtime execution providers:
| Feature | Platform | Requirements |
|---|---|---|
| `cuda` | NVIDIA GPUs | CUDA 11.x/12.x toolkit |
| `tensorrt` | NVIDIA GPUs | TensorRT 8.x |
| `directml` | Windows (AMD/Intel/NVIDIA) | DirectX 12 |
| `coreml` | macOS (Apple Silicon) | macOS 11+ |
Build with GPU support by enabling the matching Cargo feature:

```bash
# NVIDIA CUDA
cargo build --release --features cuda

# Windows DirectML (works with any GPU)
cargo build --release --features directml

# macOS Apple Silicon
cargo build --release --features coreml
```
The embedder automatically uses GPU when available, falling back to CPU otherwise.
## Performance
- Indexing: ~1,000 files/second without embeddings; ~100 files/second with ONNX on CPU
- Indexing with GPU: ~500+ files/second with ONNX
- Search: <10 ms lexical, <50 ms semantic
- Memory: ~100 MB base + ~1 KB per indexed chunk
- Disk: ~2x source size for a full index
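The memory figures above give a quick capacity estimate; for example, a 50,000-chunk workspace lands around 150 MB. A back-of-envelope sketch (constants taken from the list above, which are approximate):

```rust
// Back-of-envelope memory estimate from the figures above:
// ~100 MB base + ~1 KB per indexed chunk (both approximate).
fn estimated_memory_mb(chunks: u64) -> u64 {
    const BASE_MB: u64 = 100;
    const BYTES_PER_CHUNK: u64 = 1024;
    BASE_MB + (chunks * BYTES_PER_CHUNK) / (1024 * 1024)
}

fn main() {
    // A 50,000-chunk workspace stays well under 200 MB.
    let mb = estimated_memory_mb(50_000);
    assert!(mb < 200);
    println!("50k chunks ≈ {mb} MB");
}
```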
## License
MIT
## Contributing
Contributions welcome! Please read CONTRIBUTING.md first.