oxirs-embed-0.1.0-alpha.3 has been yanked.
OxiRS Embed - Knowledge Graph Embeddings
Status: Alpha Release (v0.1.0-alpha.3) - Released October 12, 2025
⚠️ Alpha Software: This is an early alpha release with experimental features. APIs may change without notice. Not recommended for production use.
Generate vector embeddings for RDF knowledge graphs, enabling semantic similarity search, entity linking, and neural-symbolic AI integration.
Features
Embedding Models
- Sentence Transformers - Pre-trained models from HuggingFace
- OpenAI Embeddings - GPT-based embeddings via API
- Custom Models - Bring your own embedding models
- Multi-lingual Support - Models for various languages
Knowledge Graph Embedding
- Entity Embeddings - Generate embeddings for RDF entities
- Relation Embeddings - Embed predicates and relationships
- Graph Embeddings - Whole-graph vector representations
- Contextual Embeddings - Use graph context for better embeddings
Applications
- Semantic Search - Find similar entities by meaning
- Entity Linking - Link mentions to knowledge graph entities
- Relation Prediction - Predict missing relationships
- Clustering - Group similar entities
Installation
Add to your Cargo.toml:
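# Experimental feature
[dependencies]
oxirs-embed = "0.1.0-alpha.3"
# Or, to enable specific providers, use this entry instead of the one above
oxirs-embed = { version = "0.1.0-alpha.3", features = ["openai", "sentence-transformers"] }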
Quick Start
Basic Entity Embedding
use ;
async
Knowledge Graph Embedding
use KnowledgeGraphEmbedder;
use Dataset;
async
Semantic Similarity
use ;
// Find similar entities
let query_embedding = model.encode.await?;
let similar_entities = find_similar?;
for in similar_entities
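To make the similarity step concrete, here is a minimal, dependency-free sketch of cosine-similarity ranking over precomputed vectors. It does not use the oxirs-embed API: `cosine_similarity` and `top_k_similar` are hypothetical helper names, and the `(String, Vec<f32>)` candidate layout is an assumption chosen for illustration.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 when either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns up to `k` (id, score) pairs, highest score first.
/// Hypothetical helper: a brute-force scan, not an indexed search.
fn top_k_similar<'a>(
    query: &[f32],
    candidates: &'a [(String, Vec<f32>)],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = candidates
        .iter()
        .map(|(id, vector)| (id.as_str(), cosine_similarity(query, vector)))
        .collect();
    // Sort descending by score; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A brute-force scan like this is linear in the number of entities, which is usually fine for small graphs; larger collections are where a dedicated vector index becomes worthwhile.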
Supported Embedding Providers
Sentence Transformers (Local)
use ;
let model = builder
.provider
.model_name // or other models
.device // Optional GPU support
.build
.await?;
Popular models:
- all-mpnet-base-v2 - General purpose, 768 dimensions
- all-MiniLM-L6-v2 - Faster, 384 dimensions
- multi-qa-mpnet-base-dot-v1 - For Q&A tasks
- paraphrase-multilingual-mpnet-base-v2 - Multi-lingual
OpenAI Embeddings (API)
use ;
let model = builder
.provider
.api_key
.model_name
.build
.await?;
let embedding = model.encode.await?;
Available models:
- text-embedding-3-small - 1536 dimensions, cost-effective
- text-embedding-3-large - 3072 dimensions, highest quality
- text-embedding-ada-002 - Legacy model
Custom Models
use ;
let config = CustomModelConfig ;
let model = custom.await?;
Advanced Features
Batch Processing
let texts = vec!;
// Process in batches for efficiency
let embeddings = model.encode_batch.await?;
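As an illustration of what batching involves, the sketch below splits the inputs into fixed-size chunks and hands each chunk to an encoder callback, preserving input order. It is an assumption-laden stand-in rather than the crate's implementation: `encode_in_batches` is a hypothetical helper, and the closure stands in for whatever batch call the model actually exposes.

```rust
/// Encodes `texts` in batches of at most `batch_size`, preserving input order.
/// `encode_batch` is a stand-in for the model's real batch call (assumption).
fn encode_in_batches<F>(
    texts: &[&str],
    batch_size: usize,
    mut encode_batch: F,
) -> Vec<Vec<f32>>
where
    F: FnMut(&[&str]) -> Vec<Vec<f32>>,
{
    assert!(batch_size > 0, "batch_size must be positive");
    let mut out = Vec::with_capacity(texts.len());
    for chunk in texts.chunks(batch_size) {
        let vectors = encode_batch(chunk);
        // Guard against an encoder that drops or duplicates inputs.
        assert_eq!(vectors.len(), chunk.len(), "one vector per input expected");
        out.extend(vectors);
    }
    out
}
```

Passing the encoder as a closure keeps the chunking logic testable without loading a model.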
Contextual Embeddings
Use graph context for better embeddings:
use ContextualEmbedder;
let embedder = builder
.model
.context_depth // Include 2-hop neighbors
.weight_by_relation // Different weights for different relations
.build
.await?;
let embedding = embedder.embed_entity_with_context.await?;
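One simple way to picture context-aware embedding is to blend an entity's own vector with a relation-weighted mean of its neighbors' vectors. The sketch below shows that idea only; it is not how ContextualEmbedder is confirmed to work. `blend_with_context`, the `alpha` parameter, and the per-neighbor weights are all assumptions introduced for illustration.

```rust
/// Blends an entity vector with the weighted mean of its neighbors' vectors.
/// `alpha` is the share kept on the entity itself (expected in 0.0..=1.0).
/// Each neighbor carries a weight, e.g. one chosen per relation type (assumption).
fn blend_with_context(entity: &[f32], neighbors: &[(Vec<f32>, f32)], alpha: f32) -> Vec<f32> {
    let total_weight: f32 = neighbors.iter().map(|(_, w)| *w).sum();
    // With no usable context, fall back to the entity vector unchanged.
    if neighbors.is_empty() || total_weight <= 0.0 {
        return entity.to_vec();
    }
    let mut context = vec![0.0_f32; entity.len()];
    for (vector, weight) in neighbors {
        assert_eq!(vector.len(), entity.len(), "dimension mismatch");
        for (c, v) in context.iter_mut().zip(vector) {
            *c += weight * v;
        }
    }
    entity
        .iter()
        .zip(&context)
        .map(|(e, c)| alpha * e + (1.0 - alpha) * (c / total_weight))
        .collect()
}
```

To cover 2-hop context under this scheme, the neighbor list would include second-hop entities, typically with smaller weights than direct neighbors.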
Entity Linking
use EntityLinker;
let linker = new;
// Link text mention to knowledge graph entity
let mention = "machine learning expert from Stanford";
let candidates = linker.link.await?;
for in candidates
Relation Prediction
use RelationPredictor;
let predictor = new;
// Predict relation between entities
let predictions = predictor.predict_relation?;
for in predictions
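For intuition about how relation prediction can work on top of entity and relation vectors, the sketch below uses TransE-style scoring, where a triple is plausible when head + relation lies close to tail. This README does not state which scoring model RelationPredictor uses, so TransE is a swapped-in technique here, and `transe_score` and `rank_relations` are hypothetical names.

```rust
/// TransE-style plausibility: the closer `head + relation` is to `tail`,
/// the higher (less negative) the score.
fn transe_score(head: &[f32], relation: &[f32], tail: &[f32]) -> f32 {
    assert!(
        head.len() == relation.len() && head.len() == tail.len(),
        "dimension mismatch"
    );
    let squared: f32 = head
        .iter()
        .zip(relation)
        .zip(tail)
        .map(|((h, r), t)| (h + r - t).powi(2))
        .sum();
    -squared.sqrt()
}

/// Ranks candidate relations for a (head, tail) pair, most plausible first.
fn rank_relations<'a>(
    head: &[f32],
    tail: &[f32],
    relations: &'a [(String, Vec<f32>)],
) -> Vec<(&'a str, f32)> {
    let mut ranked: Vec<(&str, f32)> = relations
        .iter()
        .map(|(name, r)| (name.as_str(), transe_score(head, r, tail)))
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

Scores from this function are only comparable within one embedding space; turning them into confidences would need a calibration step that is out of scope for this sketch.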
Integration with OxiRS
With oxirs-vec (Vector Search)
use EmbeddingModel;
use VectorStore;
// Generate embeddings
let model = load.await?;
let embeddings = model.encode_dataset.await?;
// Index in vector store
let mut store = new?;
for in embeddings
store.build_index?;
With oxirs-chat (RAG)
use EmbeddingModel;
use RagSystem;
let model = load.await?;
let rag = builder
.embedding_model
.knowledge_graph
.build?;
Performance
Benchmark Results
| Model | Embedding Time | Dimension | Quality (Avg) |
|---|---|---|---|
| all-mpnet-base-v2 | 15ms | 768 | 0.85 |
| all-MiniLM-L6-v2 | 5ms | 384 | 0.78 |
| text-embedding-3-small | 50ms* | 1536 | 0.88 |
*API call latency
Optimization Tips
// Use batch processing
let embeddings = model.encode_batch.await?;
// Cache embeddings
let cache = new?;
let embedding = cache.get_or_compute?;
// GPU acceleration (if available)
let model = builder
.device
.build
.await?;
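The caching tip boils down to computing each embedding once and reusing it on later requests. Below is a minimal in-memory sketch of that pattern. `EmbeddingCache` is a hypothetical type written for this example, keyed by the raw input text, and is not the cache shipped with the crate.

```rust
use std::collections::HashMap;

/// In-memory embedding cache keyed by input text (hypothetical type).
struct EmbeddingCache {
    entries: HashMap<String, Vec<f32>>,
}

impl EmbeddingCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Returns the cached vector for `text`, calling `compute` only on a miss.
    fn get_or_compute<F>(&mut self, text: &str, compute: F) -> &[f32]
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        if !self.entries.contains_key(text) {
            let vector = compute(text);
            self.entries.insert(text.to_owned(), vector);
        }
        &self.entries[text]
    }
}
```

This version grows without bound and is tied to a single model; a practical cache would likely include the model name in the key and add an eviction policy.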
Status
Alpha Release (v0.1.0-alpha.3)
- ✅ Sentence Transformers integration with batch streaming + persistence
- ✅ OpenAI embeddings support with provider failover and caching
- ✅ Entity/graph embeddings wired into CLI ingest/export pipelines
- ✅ Semantic similarity search via oxirs-vec + SPARQL federation hooks
- 🚧 Contextual embeddings (expanded graph context) – in progress
- 🚧 Relation prediction (knowledge completion) – in progress
- ⏳ Fine-tuning support (planned for v0.2.0)
Contributing
This is an experimental module. Feedback welcome!
License
MIT OR Apache-2.0
See Also
- oxirs-vec - Vector search engine
- oxirs-chat - AI-powered chat with RAG
- oxirs-core - RDF data model