Crate oxify_vector


§OxiFY Vector

Ported from OxiRS (https://github.com/cool-japan/oxirs).
Original implementation: Copyright (c) OxiRS Contributors.
Adapted for OxiFY (simplified for LLM workflow focus).
License: MIT OR Apache-2.0 (compatible with OxiRS).

In-memory vector search and similarity operations for RAG and embeddings.

This crate provides:

  • Vector similarity search (cosine, Euclidean, dot product, Manhattan)
  • Brute-force exact search for small to medium datasets
  • Batch search over multiple queries
  • Radius search for finding all neighbors within a given distance
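
The four metrics named in the list above can be illustrated in plain Rust. This is a minimal sketch using only the standard library; the function names are hypothetical and are not taken from this crate, which may compute and normalize its scores differently (for example through its `simd` module). Both slices are assumed to have the same length.

```rust
/// Dot product of two vectors (assumes equal length).
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity in [-1, 1]; returns 0.0 if either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}

/// Euclidean (L2) distance: square root of the summed squared differences.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Manhattan (L1) distance: sum of the absolute differences.
pub fn manhattan_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}
```

Note the difference in direction: cosine similarity and dot product grow as vectors become more alike, while Euclidean and Manhattan distances shrink, so a ranking has to sort in opposite orders depending on the metric.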

§Example

use oxify_vector::{VectorSearchIndex, SearchConfig, DistanceMetric};
use std::collections::HashMap;

// Create embeddings
let mut embeddings = HashMap::new();
embeddings.insert("doc1".to_string(), vec![0.1, 0.2, 0.3]);
embeddings.insert("doc2".to_string(), vec![0.2, 0.3, 0.4]);
embeddings.insert("doc3".to_string(), vec![0.3, 0.4, 0.5]);

// Build search index
let config = SearchConfig::default();
let mut index = VectorSearchIndex::new(config);
index.build(&embeddings)?;

// Search for similar documents
let query = vec![0.15, 0.25, 0.35];
let results = index.search(&query, 2)?;

for result in results {
    println!("{}: score = {:.4}", result.entity_id, result.score);
}
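
To show what brute-force exact search, radius search, and batch search amount to conceptually, here is a self-contained sketch that does not use this crate at all. The `Hit` type and the three functions are hypothetical names chosen for illustration, and Euclidean distance is assumed as the metric; the crate's own `VectorSearchIndex` and `SearchResult` may behave differently.

```rust
/// Hypothetical result type for this sketch (not the crate's `SearchResult`).
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: String,
    pub distance: f32,
}

/// Euclidean distance between two equal-length vectors.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Exact top-k: score every stored vector, sort ascending, keep the first k.
pub fn brute_force_top_k(items: &[(String, Vec<f32>)], query: &[f32], k: usize) -> Vec<Hit> {
    let mut hits: Vec<Hit> = items
        .iter()
        .map(|(id, v)| Hit { id: id.clone(), distance: euclidean(v, query) })
        .collect();
    hits.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    hits.truncate(k);
    hits
}

/// Radius search: keep every vector whose distance is at most `radius`.
pub fn radius_search(items: &[(String, Vec<f32>)], query: &[f32], radius: f32) -> Vec<Hit> {
    let mut hits: Vec<Hit> = items
        .iter()
        .map(|(id, v)| Hit { id: id.clone(), distance: euclidean(v, query) })
        .filter(|h| h.distance <= radius)
        .collect();
    hits.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    hits
}

/// Batch search: run the same top-k search once per query.
pub fn batch_top_k(
    items: &[(String, Vec<f32>)],
    queries: &[Vec<f32>],
    k: usize,
) -> Vec<Vec<Hit>> {
    queries.iter().map(|q| brute_force_top_k(items, q, k)).collect()
}
```

Each query costs one distance computation per stored vector, which is why this exact approach suits small to medium datasets, while the approximate indexes listed under Modules trade some recall for speed on larger ones.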

Re-exports§

pub use adaptive::AdaptiveConfig;
pub use adaptive::AdaptiveIndex;
pub use adaptive::AdaptiveStats;
pub use cache::CacheConfig;
pub use cache::CacheStats;
pub use cache::QueryCache;
pub use colbert::ColbertConfig;
pub use colbert::ColbertIndex;
pub use colbert::ColbertSearchResult;
pub use colbert::ColbertStats;
pub use colbert::MultiVectorDoc;
pub use distributed::ConsistentHash;
pub use distributed::DistributedIndex;
pub use distributed::DistributedStats;
pub use distributed::ShardConfig;
pub use embeddings::CachedEmbeddingProvider;
pub use embeddings::EmbeddingCache;
pub use embeddings::EmbeddingProvider;
pub use embeddings::MockEmbeddingProvider;
pub use embeddings::OpenAIConfig;
pub use embeddings::OpenAIEmbeddingProvider;
pub use filter::Filter;
pub use filter::FilterCondition;
pub use filter::FilterValue;
pub use filter::Metadata;
pub use gpu::GpuBatchProcessor;
pub use gpu::GpuConfig;
pub use gpu::GpuStats;
pub use hnsw::HnswConfig;
pub use hnsw::HnswIndex;
pub use hnsw::HnswStats;
pub use hybrid::HybridConfig;
pub use hybrid::HybridIndex;
pub use hybrid::HybridSearchResult;
pub use hybrid::HybridStats;
pub use ivf::IvfPqConfig;
pub use ivf::IvfPqIndex;
pub use ivf::IvfPqStats;
pub use lsh::LshConfig;
pub use lsh::LshIndex;
pub use lsh::LshStats;
pub use metrics::IndexStats as MetricsIndexStats;
pub use metrics::LatencyTimer;
pub use metrics::Metrics;
pub use metrics::SearchStats;
pub use multi_index::MultiIndexConfig;
pub use multi_index::MultiIndexSearch;
pub use multi_index::ScoreMergeStrategy;
pub use optimizer::OptimizerConfig;
pub use optimizer::QueryOptimizer;
pub use optimizer::QueryPlan;
pub use optimizer::SearchStrategy;
pub use otel::init_tracing;
pub use otel::shutdown_tracing;
pub use otel::TracingConfig;
pub use profiling::Bottleneck;
pub use profiling::ImpactLevel;
pub use profiling::IndexHealthChecker;
pub use profiling::ProfilingConfig;
pub use profiling::QueryProfile;
pub use profiling::QueryProfiler;
pub use profiling::Recommendation;
pub use quantization::BinaryQuantizationConfig;
pub use quantization::BinaryQuantizedIndex;
pub use quantization::BinaryQuantizedIndexStats;
pub use quantization::BinaryQuantizer;
pub use quantization::FourBitQuantizedIndex;
pub use quantization::FourBitQuantizedIndexStats;
pub use quantization::FourBitQuantizer;
pub use quantization::QuantizationConfig;
pub use quantization::QuantizedIndexStats;
pub use quantization::QuantizedVectorIndex;
pub use quantization::ScalarQuantizer;
pub use recall_eval::AggregatedMetrics;
pub use recall_eval::EvaluationConfig;
pub use recall_eval::QueryMetrics;
pub use recall_eval::RecallEvaluator;
pub use search::VectorSearchIndex;
pub use types::DistanceMetric;
pub use types::IndexStats;
pub use types::SearchConfig;
pub use types::SearchResult;

Modules§

adaptive
Adaptive Index - Automatic Performance Optimization
cache
Query result caching for improved search performance.
colbert
ColBERT-style Multi-Vector Search
distributed
Distributed vector search with sharding and replication.
embeddings
Embedding Management
filter
Metadata filtering for vector search
gpu
GPU-accelerated vector operations using CUDA.
hnsw
HNSW (Hierarchical Navigable Small World) Index
hybrid
Hybrid Search: Vector + Keyword (BM25)
ivf
IVF-PQ (Inverted File Index with Product Quantization)
lsh
Locality Sensitive Hashing (LSH) for Approximate Nearest Neighbor Search
metrics
Observability and Metrics
multi_index
Multi-Index Search
optimizer
Query Optimizer
otel
OpenTelemetry Tracing Integration
persistence
Index Persistence
profiling
Query Profiling and Analysis
quantization
Vector Quantization for Memory Optimization
recall_eval
Recall Evaluation for ANN Indexes
search
Vector search implementation
simd
SIMD-accelerated distance calculations
types
Vector search types and configuration