# tranz

Knowledge graph embedding models.

```toml
[dependencies]
tranz = "0.5"
```

Dual-licensed under MIT or Apache-2.0.
For context on how point embeddings relate to region-based approaches, see Why Regions, Not Points.
## Models
Each model scores a triple (head, relation, tail) differently:
| Model | Scoring function | Intuition | Reference |
|---|---|---|---|
| TransE | $\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$ | Translation: tail = head + relation | Bordes et al., 2013 |
| RotatE | $\lVert \mathbf{h} \circ \mathbf{r} - \mathbf{t} \rVert$ | Rotation in complex plane | Sun et al., 2019 |
| ComplEx | $\text{Re}(\langle \mathbf{h}, \mathbf{r}, \bar{\mathbf{t}} \rangle)$ | Asymmetric via complex conjugate | Trouillon et al., 2016 |
| DistMult | $\langle \mathbf{h}, \mathbf{r}, \mathbf{t} \rangle$ | Element-wise product, symmetric | Yang et al., 2015 |
$\mathbf{h}, \mathbf{r}, \mathbf{t}$ are learned embedding vectors for head, relation, and tail. $\lVert \cdot \rVert$ is the L2 norm, $\circ$ is element-wise product, $\langle \cdot \rangle$ is the trilinear dot product, $\bar{\mathbf{t}}$ is the complex conjugate.
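The table's scoring functions are easy to state in plain code. A minimal sketch of the two real-valued cases, operating on raw `f32` slices (function names here are illustrative, not the crate's API):

```rust
/// TransE: negative L2 distance ||h + r - t||; higher = more plausible.
fn transe_score(h: &[f32], r: &[f32], t: &[f32]) -> f32 {
    let dist = h.iter().zip(r).zip(t)
        .map(|((h, r), t)| (h + r - t).powi(2))
        .sum::<f32>()
        .sqrt();
    -dist
}

/// DistMult: trilinear dot product <h, r, t>; symmetric in h and t.
fn distmult_score(h: &[f32], r: &[f32], t: &[f32]) -> f32 {
    h.iter().zip(r).zip(t).map(|((h, r), t)| h * r * t).sum()
}
```

DistMult's symmetry (swapping head and tail leaves the score unchanged) is exactly why ComplEx moves to complex embeddings with a conjugated tail: that breaks the symmetry for asymmetric relations.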
## Quick start

Install with `cargo install tranz --features candle`. The CLI covers three workflows:

- train with 1-N scoring (recommended)
- train with negative sampling (classic)
- predict from saved embeddings
## Benchmark: WN18RR
| Model | Config | Dim | Epochs | MRR | H@1 | H@10 |
|---|---|---|---|---|---|---|
| ComplEx | Adagrad + N3 + reciprocals | 100 | 100 | 0.438 | 0.400 | 0.512 |
| ComplEx | Adam + reciprocals | 100 | 50 | 0.429 | 0.407 | 0.469 |
| DistMult | Adam + 1-N | 100 | 50 | 0.341 | 0.329 | 0.362 |
Published ComplEx MRR on WN18RR is 0.475 (Lacroix et al., 2018); with the same recipe (Adagrad, N3 regularization, reciprocal relations), tranz reaches 92% of that figure.
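The MRR and H@k columns are computed from the filtered rank of the true entity for each query (rank 1 = best). A sketch, with illustrative helper names:

```rust
/// Mean reciprocal rank over a slice of 1-based ranks.
fn mrr(ranks: &[u32]) -> f64 {
    ranks.iter().map(|&r| 1.0 / r as f64).sum::<f64>() / ranks.len() as f64
}

/// Hits@k: fraction of queries whose true entity ranks in the top k.
fn hits_at_k(ranks: &[u32], k: u32) -> f64 {
    ranks.iter().filter(|&&r| r <= k).count() as f64 / ranks.len() as f64
}
```

"Filtered" means other known true tails are removed from the candidate list before ranking, so a model is not penalized for ranking a different correct answer first.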
## Library usage

Import paths, the model type, and the arguments below are illustrative (the original snippet lost them); check the crate docs for exact signatures.

```rust
use lattix::kge::{load_dataset, FilterIndex};
use tranz::{TransE, evaluate_link_prediction};

// Load dataset (types from lattix::kge)
let ds = load_dataset("data/wn18rr").unwrap();
let mut interned = ds.into_interned();
interned.add_reciprocals();

// Create model and query
let model = TransE::new(interned.num_entities(), interned.num_relations(), 100);
let top10 = model.top_k_tails(head, rel, 10);

// Evaluate (filtered link prediction)
let filter = FilterIndex::from_dataset(&interned);
let metrics = evaluate_link_prediction(&model, &interned, &filter);
```
## Generic triple loading

```rust
use lattix::kge::load_flexible; // import path is illustrative

let ds = load_flexible("triples.tsv").unwrap(); // file argument is illustrative
let ds = ds.split(0.8, 0.1, 0.1); // 80/10/10 train/valid/test
let interned = ds.into_interned();
```
## Embedding export

```rust
use tranz::{export_embeddings, flatten_matrix}; // import path is illustrative

// Export to w2v TSV (file argument is illustrative)
export_embeddings(&model, "entities.tsv").unwrap();

// Flat f32 matrix for FAISS/Qdrant
let flat: Vec<f32> = flatten_matrix(&model);
```
## Multi-hop query answering

Answers conjunctive, disjunctive, and negation queries by decomposing them into atomic link prediction calls composed with t-norm fuzzy logic (CQD-Beam, Arakelyan et al., 2021). No complex-query training is needed. The builder calls and arguments below are illustrative:

```rust
use tranz::{anchor, intersection, answer_query_topk, DistMult};

let model = DistMult::new(num_entities, num_relations, 100);

// 2-hop chain: entity 0 -rel 0-> V -rel 1-> ?
let q = anchor(0, 0).then(1);

// Intersection: (0 -r0-> ?) AND (1 -r1-> ?)
let q = intersection(anchor(0, 0), anchor(1, 1));

// Intersect-then-project (pi): (0 -r0-> V AND 1 -r1-> V) -r2-> ?
let q = intersection(anchor(0, 0), anchor(1, 1)).then(2);

let top10 = answer_query_topk(&model, &q, 10);
```
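The t-norm composition behind CQD-Beam can be sketched numerically: atomic link-prediction scores are squashed into [0, 1], then conjunction uses the product t-norm, disjunction its co-norm, and negation 1 − s. Helper names here are illustrative:

```rust
/// Squash a raw score into [0, 1] so it can act as a fuzzy truth value.
fn sigmoid(x: f32) -> f32 { 1.0 / (1.0 + (-x).exp()) }

/// Conjunction (AND) under the product t-norm.
fn t_and(a: f32, b: f32) -> f32 { a * b }

/// Disjunction (OR) under the product t-co-norm.
fn t_or(a: f32, b: f32) -> f32 { a + b - a * b }

/// Negation (NOT).
fn t_not(a: f32) -> f32 { 1.0 - a }
```

An intersection query scores a candidate answer by `t_and` of its two atomic scores; beam search keeps the top candidates at each hop.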
## Ensemble scoring

Average scores from multiple models (snapshots, different seeds):

```rust
use tranz::Ensemble; // import path is illustrative

let models: Vec<DistMult> = vec![m1, m2, m3]; // e.g. snapshots or different seeds
let ensemble = Ensemble::new(models);
let top5 = ensemble.top_k_tails(head, rel, 5);
```
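Score-level averaging is all the ensemble does; a sketch with an illustrative helper, where each inner vector holds one model's scores over the candidate entities:

```rust
/// Element-wise mean of each model's score vector over the same candidates.
fn average_scores(per_model: &[Vec<f32>]) -> Vec<f32> {
    let n = per_model.len() as f32;
    (0..per_model[0].len())
        .map(|i| per_model.iter().map(|s| s[i]).sum::<f32>() / n)
        .collect()
}
```

Ranking by the averaged vector then gives the ensemble's top-k.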
## Training (requires the `candle` feature)

1-N scoring (scoring all entities per query via a matmul plus softmax cross-entropy) is recommended. Negative sampling with self-adversarial (SANS) weighting is also supported.

```rust
use tranz::{train, TrainConfig}; // import path is illustrative

let config = TrainConfig { ..Default::default() }; // fields elided in the original
let result = train(&interned, &config).unwrap();
```
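What 1-N scoring buys: one (head, relation) query is scored against every entity at once with a single matrix product, instead of against a handful of sampled negatives. A naive DistMult version (names illustrative):

```rust
/// Scores of (h, r, ?) against every row of the entity embedding matrix.
fn score_all_tails(h: &[f32], r: &[f32], entities: &[Vec<f32>]) -> Vec<f32> {
    // q = h ∘ r, then one dot product per candidate tail
    let q: Vec<f32> = h.iter().zip(r).map(|(h, r)| h * r).collect();
    entities.iter()
        .map(|e| q.iter().zip(e).map(|(q, e)| q * e).sum())
        .collect()
}
```

Softmax cross-entropy over this score vector, with the true tail as the label, is the 1-N training objective; on GPU the per-tail loop becomes one matmul.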
## Companion to subsume
subsume embeds entities as geometric regions (boxes, cones) where containment encodes subsumption. tranz embeds entities as points where distance/similarity encodes relational facts.
- subsume: ontology completion, taxonomy expansion, logical query answering
- tranz: link prediction, relation extraction, knowledge base completion