OxiRS Embed - Knowledge Graph Embeddings
Status: Production Release (v0.1.0) - Released January 2026
✨ Production Release: Production-ready with API stability guarantees. Semantic versioning enforced.
Generate vector embeddings for RDF knowledge graphs, enabling semantic similarity search, entity linking, and neural-symbolic AI integration.
Features
Embedding Models
- TransE - Translational distance models for knowledge graphs
- DistMult - Bilinear diagonal models for symmetric relations
- ComplEx - Complex-valued embeddings for asymmetric relations
- RotatE - Rotation-based models in complex space
- HolE - Holographic embeddings using circular correlation (NEW in v0.1.0)
- ConvE - Convolutional 2D neural network embeddings (NEW in v0.1.0)
- TuckER - Tucker decomposition for multi-relational learning
- QuatE - Quaternion embeddings for complex patterns
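To make the difference between the translational and bilinear-diagonal families concrete, here is a minimal standalone sketch of the standard TransE and DistMult scoring functions. It does not use the OxiRS API; the function names `transe_score` and `distmult_score` are hypothetical and chosen only for illustration.

```rust
/// TransE plausibility score (illustrative, not the OxiRS API):
/// the negative L2 distance between (head + relation) and tail.
/// Scores closer to zero mean the triple is more plausible.
fn transe_score(head: &[f32], relation: &[f32], tail: &[f32]) -> f32 {
    assert_eq!(head.len(), relation.len());
    assert_eq!(head.len(), tail.len());
    let squared: f32 = head
        .iter()
        .zip(relation)
        .zip(tail)
        .map(|((h, r), t)| {
            let diff = h + r - t;
            diff * diff
        })
        .sum();
    -squared.sqrt()
}

/// DistMult score (illustrative): the trilinear product sum_i h_i * r_i * t_i.
/// Swapping head and tail leaves the score unchanged, which is why the
/// model suits symmetric relations.
fn distmult_score(head: &[f32], relation: &[f32], tail: &[f32]) -> f32 {
    assert_eq!(head.len(), relation.len());
    assert_eq!(head.len(), tail.len());
    head.iter()
        .zip(relation)
        .zip(tail)
        .map(|((h, r), t)| h * r * t)
        .sum()
}

fn main() {
    let (h, r, t) = ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]);
    println!("TransE:   {}", transe_score(&h, &r, &t));
    println!("DistMult: {}", distmult_score(&h, &r, &t));
    println!("DistMult (swapped): {}", distmult_score(&t, &r, &h));
}
```

Because the DistMult score is identical when head and tail are swapped, asymmetric relations usually call for ComplEx, RotatE, or HolE instead.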
Advanced Features (NEW in v0.1.0)
- Link Prediction - Predict missing triples (head/tail/relation)
  - Filtered ranking to remove known triples
  - Batch prediction for efficiency
  - Evaluation metrics (MRR, Hits@K, Mean Rank)
- Entity Clustering - Group similar entities
  - K-Means with K-Means++ initialization
  - Hierarchical (agglomerative) clustering
  - DBSCAN (density-based) clustering
  - Spectral clustering
  - Quality metrics (silhouette score, inertia)
- Community Detection - Find communities in knowledge graphs
  - Louvain modularity optimization
  - Label propagation
  - Girvan-Newman edge betweenness
  - Embedding-based detection
- Vector Search - High-performance semantic search (NEW in v0.1.0)
  - Exact search with multiple distance metrics
    - Cosine similarity, Euclidean, dot product, Manhattan
  - Batch search for multiple queries
  - Radius-based filtering
  - Parallel processing support
- Visualization - t-SNE, PCA, UMAP, Random Projection
  - 2D and 3D dimensionality reduction
  - Export to CSV/JSON formats
  - Cluster-aware visualizations
- Interpretability - Model understanding tools
  - Similarity analysis and nearest neighbors
  - Feature importance analysis
  - Counterfactual explanations
  - Embedding space diagnostics
- Mixed Precision Training - FP16/FP32 for faster training
- Model Quantization - Int8/Int4/Binary compression (3-4x size reduction)
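As a rough illustration of where the size reduction from Int8 quantization comes from, the sketch below applies symmetric per-vector quantization: each `f32` (4 bytes) becomes one `i8` (1 byte) plus a single shared scale. This is a generic technique shown under stated assumptions, not necessarily how OxiRS implements it; `quantize_int8` and `dequantize_int8` are hypothetical names.

```rust
/// Symmetric per-vector int8 quantization (illustrative, not the OxiRS API).
/// Returns the quantized values and the scale needed to reconstruct them.
fn quantize_int8(values: &[f32]) -> (Vec<i8>, f32) {
    // The largest magnitude maps to 127; an all-zero vector uses scale 1.0.
    let max_abs = values.iter().fold(0.0_f32, |m, v| m.max(v.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let quantized = values
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (quantized, scale)
}

/// Approximate reconstruction of the original values.
fn dequantize_int8(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}

fn main() {
    let embedding = [1.0_f32, -0.5, 0.0, 0.25];
    let (quantized, scale) = quantize_int8(&embedding);
    let restored = dequantize_int8(&quantized, scale);
    println!("quantized = {:?}, scale = {}", quantized, scale);
    println!("restored  = {:?}", restored);
}
```

The reconstruction error per dimension is bounded by half the scale, so vectors with a few very large components lose more precision than evenly distributed ones.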
Knowledge Graph Embedding
- Entity Embeddings - Generate embeddings for RDF entities
- Relation Embeddings - Embed predicates and relationships
- Graph Embeddings - Whole-graph vector representations
- Contextual Embeddings - Use graph context for better embeddings
Applications
- Semantic Search - Find similar entities by meaning
- Entity Linking - Link mentions to knowledge graph entities
- Relation Prediction - Predict missing relationships
- Clustering - Group similar entities
- Knowledge Graph Completion - Fill missing facts in KGs
- Anomaly Detection - Detect unusual patterns in graphs
Installation
Add to your Cargo.toml:
# Experimental feature
[]
= "0.1.0"
# Enable specific providers
= { = "0.1.0", = ["openai", "sentence-transformers"] }
Quick Start
Basic Entity Embedding
use ;
async
Knowledge Graph Embedding
use KnowledgeGraphEmbedder;
use Dataset;
async
Semantic Similarity
use ;
// Find similar entities
let query_embedding = model.encode.await?;
let similar_entities = find_similar?;
for in similar_entities
New Models (v0.1.0)
HolE (Holographic Embeddings)
HolE uses circular correlation to model entity and relation interactions. It is effective for capturing both symmetric and asymmetric patterns.
use ;
async
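For readers unfamiliar with circular correlation, the following standalone sketch shows the operation and the resulting HolE score on plain slices. It is a naive O(d²) illustration, not the OxiRS implementation (production code typically uses an FFT); `circular_correlation` and `hole_score` are hypothetical names.

```rust
/// Circular correlation: out[k] = sum_i a[i] * b[(i + k) mod d].
/// Naive O(d^2) version for illustration only.
fn circular_correlation(a: &[f32], b: &[f32]) -> Vec<f32> {
    let d = a.len();
    assert_eq!(d, b.len());
    (0..d)
        .map(|k| (0..d).map(|i| a[i] * b[(i + k) % d]).sum::<f32>())
        .collect()
}

/// Raw HolE score: dot(relation, head ⋆ tail).
/// A sigmoid is usually applied on top to obtain a probability.
fn hole_score(head: &[f32], relation: &[f32], tail: &[f32]) -> f32 {
    let corr = circular_correlation(head, tail);
    relation.iter().zip(&corr).map(|(r, c)| r * c).sum::<f32>()
}

fn main() {
    let head = [1.0, 0.0, 0.0];
    let tail = [0.0, 1.0, 0.0];
    let relation = [0.0, 1.0, 0.0];
    // Correlation is not commutative, so swapping head and tail changes the score.
    println!("score(h, r, t) = {}", hole_score(&head, &relation, &tail));
    println!("score(t, r, h) = {}", hole_score(&tail, &relation, &head));
}
```

Since the correlation of (head, tail) differs from that of (tail, head), the model can score a relation differently in each direction, which is what lets it represent asymmetric relations.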
ConvE (Convolutional Embeddings)
ConvE uses 2D CNNs for expressive knowledge graph embeddings. It is parameter-efficient because the convolutional filters are shared.
use ;
let config = ConvEConfig ;
let mut model = new;
// Add triples and train as before
model.add_triple?;
model.train.await?;
Link Prediction
Predict missing entities or relations in knowledge graphs.
use ;
// Create predictor
let pred_config = LinkPredictionConfig ;
let predictor = new;
// Predict tail entity (object prediction)
let candidates = vec!;
let predictions = predictor.predict_tail?;
for pred in predictions
// Predict head entity (subject prediction)
let predictions = predictor.predict_head?;
// Predict relation
let relations = vec!;
let predictions = predictor.predict_relation?;
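The evaluation metrics listed under Features (MRR, Hits@K, Mean Rank) and the filtered ranking setting can be sketched independently of the library. The helpers below are hypothetical and operate on plain score arrays, with candidates identified by their index; they are meant to show the definitions, not the OxiRS API.

```rust
use std::collections::HashSet;

/// 1-based rank of `target` among all candidates. In the filtered setting,
/// other candidates already known to be true are skipped so they do not
/// push the target down the ranking.
fn filtered_rank(scores: &[f32], target: usize, known_true: &HashSet<usize>) -> usize {
    let target_score = scores[target];
    1 + scores
        .iter()
        .enumerate()
        .filter(|(i, s)| *i != target && !known_true.contains(i) && **s > target_score)
        .count()
}

/// Mean reciprocal rank over 1-based ranks (assumes `ranks` is non-empty).
fn mean_reciprocal_rank(ranks: &[usize]) -> f64 {
    ranks.iter().map(|&r| 1.0 / r as f64).sum::<f64>() / ranks.len() as f64
}

/// Fraction of test triples whose correct answer is ranked within the top `k`.
fn hits_at_k(ranks: &[usize], k: usize) -> f64 {
    ranks.iter().filter(|&&r| r <= k).count() as f64 / ranks.len() as f64
}

/// Arithmetic mean of the ranks (lower is better).
fn mean_rank(ranks: &[usize]) -> f64 {
    ranks.iter().sum::<usize>() as f64 / ranks.len() as f64
}

fn main() {
    let scores = [0.9, 0.8, 0.7, 0.1];
    let known: HashSet<usize> = [0].into_iter().collect();
    println!("raw rank      = {}", filtered_rank(&scores, 2, &HashSet::new()));
    println!("filtered rank = {}", filtered_rank(&scores, 2, &known));

    let ranks = [1, 2, 4];
    println!("MRR       = {:.4}", mean_reciprocal_rank(&ranks));
    println!("Hits@3    = {:.4}", hits_at_k(&ranks, 3));
    println!("Mean Rank = {:.4}", mean_rank(&ranks));
}
```

MRR rewards placing the correct answer near the top, while Mean Rank is sensitive to a few very poor predictions, so the two are usually reported together.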
Entity Clustering
Group similar entities based on learned embeddings.
use ;
use HashMap;
// Extract embeddings
let mut embeddings = new;
for entity in model.get_entities
// K-Means clustering
let config = ClusteringConfig ;
let mut clustering = new;
let result = clustering.cluster?;
println!;
println!;
for in result.assignments
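To show what K-Means does with embedding vectors and how inertia is computed, here is a small deterministic sketch of Lloyd's algorithm. It is an assumption-laden illustration rather than the OxiRS clustering code: the initial centroids are passed in explicitly (K-Means++ seeding is omitted because it needs a random number generator), and `kmeans`, `nearest_centroid`, and `squared_distance` are hypothetical names.

```rust
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

fn nearest_centroid(point: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (c, centroid) in centroids.iter().enumerate() {
        let d = squared_distance(point, centroid);
        if d < best_dist {
            best = c;
            best_dist = d;
        }
    }
    best
}

/// Lloyd's algorithm with caller-supplied initial centroids.
/// Returns (assignments, final centroids, inertia).
fn kmeans(
    points: &[Vec<f32>],
    mut centroids: Vec<Vec<f32>>,
    max_iters: usize,
) -> (Vec<usize>, Vec<Vec<f32>>, f32) {
    // usize::MAX marks "unassigned" so the first pass always counts as a change.
    let mut assignments = vec![usize::MAX; points.len()];
    for _ in 0..max_iters {
        // Assignment step: attach every point to its nearest centroid.
        let mut changed = false;
        for (i, p) in points.iter().enumerate() {
            let nearest = nearest_centroid(p, &centroids);
            if nearest != assignments[i] {
                assignments[i] = nearest;
                changed = true;
            }
        }
        if !changed {
            break; // Converged: centroids already match these assignments.
        }
        // Update step: move each centroid to the mean of its members.
        for (c, centroid) in centroids.iter_mut().enumerate() {
            let members: Vec<&Vec<f32>> = points
                .iter()
                .zip(&assignments)
                .filter(|(_, a)| **a == c)
                .map(|(p, _)| p)
                .collect();
            if members.is_empty() {
                continue; // Leave an empty cluster's centroid where it is.
            }
            for d in 0..centroid.len() {
                centroid[d] = members.iter().map(|m| m[d]).sum::<f32>() / members.len() as f32;
            }
        }
    }
    // Inertia: sum of squared distances from each point to its centroid.
    let inertia = points
        .iter()
        .zip(&assignments)
        .map(|(p, &a)| squared_distance(p, &centroids[a]))
        .sum();
    (assignments, centroids, inertia)
}

fn main() {
    let points = vec![vec![0.0, 0.0], vec![0.0, 2.0], vec![10.0, 0.0], vec![10.0, 2.0]];
    let init = vec![vec![0.0, 0.0], vec![10.0, 0.0]];
    let (assignments, centroids, inertia) = kmeans(&points, init, 100);
    println!("assignments = {:?}", assignments);
    println!("centroids   = {:?}", centroids);
    println!("inertia     = {}", inertia);
}
```

Lower inertia means tighter clusters, but it always decreases as the number of clusters grows, which is why it is usually read alongside the silhouette score.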
Community Detection
Find communities in knowledge graphs using graph structure and embeddings.
use ;
let config = CommunityConfig ;
let mut detector = new;
let result = detector.detect?;
println!;
println!;
Vector Search
High-performance semantic search for knowledge graph embeddings.
use ;
// Build search index
let config = SearchConfig ;
let mut index = new;
index.build?;
// Search for similar entities
let query_embedding = embeddings.to_vec;
let results = index.search?;
for result in results
// Batch search
let queries = vec!;
let batch_results = index.batch_search?;
// Radius search (find all within distance)
let radius_results = index.radius_search?;
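Conceptually, exact search with a cosine metric is a brute-force scan followed by a sort, and radius search is the same scan with a distance threshold. The sketch below shows both on an in-memory list of `(id, vector)` pairs. It is a simplified illustration with hypothetical function names (`cosine_similarity`, `top_k`, `within_radius`), not the OxiRS index.

```rust
/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Exact top-k search: score every entry, sort by descending similarity.
fn top_k<'a>(query: &[f32], index: &'a [(String, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = index
        .iter()
        .map(|(id, v)| (id.as_str(), cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

/// Radius search: every entry whose cosine distance (1 - similarity)
/// to the query is at most `radius`.
fn within_radius<'a>(query: &[f32], index: &'a [(String, Vec<f32>)], radius: f32) -> Vec<&'a str> {
    index
        .iter()
        .filter(|(_, v)| 1.0 - cosine_similarity(query, v) <= radius)
        .map(|(id, _)| id.as_str())
        .collect()
}

fn main() {
    let index = vec![
        ("a".to_string(), vec![1.0_f32, 0.0]),
        ("b".to_string(), vec![0.0, 1.0]),
        ("c".to_string(), vec![1.0, 1.0]),
    ];
    let query = [1.0, 0.0];
    for (id, score) in top_k(&query, &index, 2) {
        println!("{id}: {score:.4}");
    }
    println!("within 0.5: {:?}", within_radius(&query, &index, 0.5));
}
```

A full scan costs O(n·d) per query, which is fine for small graphs; larger collections are where batch search and parallel processing start to matter.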
Visualization
Visualize embeddings in 2D/3D using dimensionality reduction.
use ;
// PCA visualization
let config = VisualizationConfig ;
let mut visualizer = new;
let result = visualizer.visualize?;
// t-SNE visualization (better for discovering clusters)
let tsne_config = VisualizationConfig ;
let mut tsne_viz = new;
let tsne_result = tsne_viz.visualize?;
// Export to CSV for plotting
for in &tsne_result.coordinates
Interpretability
Understand why models make certain predictions.
use ;
// Similarity analysis
let config = InterpretabilityConfig ;
let analyzer = new;
let analysis = analyzer.similarity_analysis?;
println!;
for in &analysis.similar_entities
// Feature importance
let importance_config = InterpretabilityConfig ;
let imp_analyzer = new;
let importance = imp_analyzer.feature_importance?;
// Counterfactual explanations
let counterfactual = analyzer.counterfactual_explanation?;
println!;
Supported Embedding Providers
Sentence Transformers (Local)
use ;
let model = builder
.provider
.model_name // or other models
.device // Optional GPU support
.build
.await?;
Popular models:
- all-mpnet-base-v2 - General purpose, 768 dimensions
- all-MiniLM-L6-v2 - Faster, 384 dimensions
- multi-qa-mpnet-base-dot-v1 - For Q&A tasks
- paraphrase-multilingual-mpnet-base-v2 - Multilingual
OpenAI Embeddings (API)
use ;
let model = builder
.provider
.api_key
.model_name
.build
.await?;
let embedding = model.encode.await?;
Available models:
- text-embedding-3-small - 1536 dimensions, cost-effective
- text-embedding-3-large - 3072 dimensions, highest quality
- text-embedding-ada-002 - Legacy model
Custom Models
use ;
let config = CustomModelConfig ;
let model = custom.await?;
Advanced Features
Batch Processing
let texts = vec!;
// Process in batches for efficiency
let embeddings = model.encode_batch.await?;
Contextual Embeddings
Use graph context for better embeddings:
use ContextualEmbedder;
let embedder = builder
.model
.context_depth // Include 2-hop neighbors
.weight_by_relation // Different weights for different relations
.build
.await?;
let embedding = embedder.embed_entity_with_context.await?;
Entity Linking
use EntityLinker;
let linker = new;
// Link text mention to knowledge graph entity
let mention = "machine learning expert from Stanford";
let candidates = linker.link.await?;
for in candidates
Relation Prediction
use RelationPredictor;
let predictor = new;
// Predict relation between entities
let predictions = predictor.predict_relation?;
for in predictions
Integration with OxiRS
With oxirs-vec (Vector Search)
use EmbeddingModel;
use VectorStore;
// Generate embeddings
let model = load.await?;
let embeddings = model.encode_dataset.await?;
// Index in vector store
let mut store = new?;
for in embeddings
store.build_index?;
With oxirs-chat (RAG)
use EmbeddingModel;
use RagSystem;
let model = load.await?;
let rag = builder
.embedding_model
.knowledge_graph
.build?;
Performance
Benchmark Results
| Model | Embedding Time | Dimension | Quality (Avg) |
|---|---|---|---|
| all-mpnet-base-v2 | 15ms | 768 | 0.85 |
| all-MiniLM-L6-v2 | 5ms | 384 | 0.78 |
| text-embedding-3-small | 50ms* | 1536 | 0.88 |
*API call latency
Optimization Tips
// Use batch processing
let embeddings = model.encode_batch.await?;
// Cache embeddings
let cache = new?;
let embedding = cache.get_or_compute?;
// GPU acceleration (if available)
let model = builder
.device
.build
.await?;
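The caching tip amounts to memoizing embeddings by input key so that repeated inputs skip the model call. A minimal in-memory sketch of that idea follows. It assumes a plain `HashMap` with no eviction or persistence, and `EmbeddingCache` with its `get_or_compute` method is a hypothetical type written for this example, not the OxiRS cache.

```rust
use std::collections::HashMap;

/// Minimal in-memory embedding cache (illustrative, not the OxiRS API).
struct EmbeddingCache {
    entries: HashMap<String, Vec<f32>>,
    misses: usize,
}

impl EmbeddingCache {
    fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Returns the cached embedding for `key`, calling `compute` only
    /// when the key has not been seen before.
    fn get_or_compute<F>(&mut self, key: &str, compute: F) -> &[f32]
    where
        F: FnOnce() -> Vec<f32>,
    {
        if !self.entries.contains_key(key) {
            self.misses += 1;
            self.entries.insert(key.to_string(), compute());
        }
        &self.entries[key]
    }
}

fn main() {
    let mut cache = EmbeddingCache::new();
    // Stand-in for an expensive model call.
    let fake_model = |text: &str| vec![text.len() as f32, 1.0];

    let first = cache.get_or_compute("knowledge graph", || fake_model("knowledge graph")).to_vec();
    let second = cache.get_or_compute("knowledge graph", || fake_model("knowledge graph")).to_vec();

    println!("first  = {:?}", first);
    println!("second = {:?}", second);
    println!("model calls = {}", cache.misses);
}
```

For API-backed providers, where each call carries network latency, avoiding repeated calls for the same input tends to matter more than any local optimization.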
Status
Production Release (v0.1.0)
- ✅ Sentence Transformers integration with batch streaming + persistence
- ✅ OpenAI embeddings support with provider failover and caching
- ✅ Entity/graph embeddings wired into CLI ingest/export pipelines
- ✅ Semantic similarity search via oxirs-vec + SPARQL federation hooks
- 🚧 Contextual embeddings (expanded graph context) – in progress
- 🚧 Relation prediction (knowledge completion) – in progress
- ⏳ Fine-tuning support (planned for v0.2.0)
Contributing
This is an experimental module. Feedback welcome!
License
MIT OR Apache-2.0
See Also
- oxirs-vec - Vector search engine
- oxirs-chat - AI-powered chat with RAG
- oxirs-core - RDF data model