//! # subsume
//!
//! Geometric region embeddings for subsumption, entailment, and logical query answering.
//!
//! `subsume` provides framework-agnostic traits and concrete backends for
//! geometric embeddings -- boxes, cones, octagons, Gaussians, and hyperbolic
//! intervals -- that encode hierarchical relationships through geometric
//! containment. If region A contains region B (B ⊆ A), then A *subsumes* B:
//! the more general concept contains the more specific one.
//!
//! # Getting Started
//!
//! | Goal | Start here |
//! |------|-----------|
//! | Understand the core abstraction | [`Box`] trait, [`BoxError`] |
//! | Use probabilistic (Gumbel) boxes | [`NdarrayGumbelBox`](ndarray_backend::NdarrayGumbelBox) |
//! | Use octagon embeddings (box + diagonal constraints) | [`NdarrayOctagon`](ndarray_backend::ndarray_octagon::NdarrayOctagon), [`octagon`] module |
//! | Fuzzy query answering (t-norms) | [`fuzzy::TNorm`], [`fuzzy::TConorm`], [`fuzzy`] module |
//! | Load a knowledge graph dataset | [`Dataset`], [`Triple`] |
//! | Train box embeddings (CPU) | [`BoxEmbeddingTrainer`], [`TrainingConfig`] |
//! | Train box embeddings (GPU) | `CandleBoxTrainer` (feature = `candle-backend`) |
//! | Evaluate with link prediction | [`evaluate_link_prediction`], `CandleBoxTrainer::evaluate` (feature = `candle-backend`) |
//!
//! # Why regions instead of points?
//!
//! Point embeddings (TransE, RotatE) work for link prediction but cannot encode
//! containment, volume, or set operations. Regions become necessary when the task
//! requires:
//!
//! - **Subsumption**: box A inside box B means A is-a B
//! - **Generality**: large volume = broad concept, small volume = specific
//! - **Intersection**: combining two concepts (A ∧ B) yields a valid region
//! - **Negation**: cone complement is another cone (FOL queries with ¬)
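//!
//! As a rough illustration of the first three properties, here is a minimal,
//! self-contained sketch using plain hard (axis-aligned) boxes. The `HardBox`
//! type and its methods are hypothetical names chosen for this example; they
//! are not part of the `subsume` API.
//!
//! ```rust
//! /// Hypothetical axis-aligned box: per-dimension lower and upper corners.
//! #[derive(Debug, Clone, PartialEq)]
//! struct HardBox {
//!     min: Vec<f32>,
//!     max: Vec<f32>,
//! }
//!
//! impl HardBox {
//!     /// Volume is the product of side lengths; inverted sides clamp to zero.
//!     fn volume(&self) -> f32 {
//!         self.min
//!             .iter()
//!             .zip(&self.max)
//!             .map(|(&lo, &hi)| (hi - lo).max(0.0))
//!             .product()
//!     }
//!
//!     /// True when `self` lies entirely inside `other` (self is-a other).
//!     fn is_inside(&self, other: &HardBox) -> bool {
//!         self.min.iter().zip(&other.min).all(|(&a, &b)| a >= b)
//!             && self.max.iter().zip(&other.max).all(|(&a, &b)| a <= b)
//!     }
//!
//!     /// Intersection: elementwise max of lower corners, min of upper corners.
//!     /// The result has zero volume when the boxes are disjoint.
//!     fn intersection(&self, other: &HardBox) -> HardBox {
//!         HardBox {
//!             min: self.min.iter().zip(&other.min).map(|(&a, &b)| a.max(b)).collect(),
//!             max: self.max.iter().zip(&other.max).map(|(&a, &b)| a.min(b)).collect(),
//!         }
//!     }
//! }
//! ```
//!
//! With an "animal" box spanning `[0, 4]` in each of two dimensions and a "dog"
//! box strictly inside it, `dog.is_inside(&animal)` holds, the animal box has
//! the larger volume, and their intersection is the dog box itself.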
//!
//! For standard triple scoring, point embeddings are simpler and usually
//! sufficient. Ontology completion (EL++), taxonomy expansion, and logical query
//! answering are defined in terms of containment and set operations, so they
//! need a region representation.
//!
//! # Key Concepts
//!
//! **Box embeddings** represent concepts as hyperrectangles. Unlike point vectors,
//! boxes have volume, which encodes generality: a broad concept ("animal") is a
//! large box containing smaller boxes ("dog", "cat").
//!
//! **Gumbel boxes** address the *local identifiability problem* of hard boxes by
//! modeling box corner coordinates as Gumbel random variables. Hard boxes have
//! flat regions (for example, when two boxes are disjoint) where gradients
//! vanish; the Gumbel formulation keeps gradients dense throughout training.
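//!
//! The vanishing-gradient issue can be seen on a single box side. The sketch
//! below compares a hard side length with a softplus-smoothed one. It is a
//! simplified stand-in, not the exact Gumbel expected-volume formula, and all
//! function names and the `beta` temperature parameter are assumptions made for
//! illustration.
//!
//! ```rust
//! /// Hard side length: zero whenever the box is inverted or empty.
//! fn hard_side(lo: f32, hi: f32) -> f32 {
//!     (hi - lo).max(0.0)
//! }
//!
//! /// Derivative of `hard_side` with respect to `hi`: zero once `hi <= lo`.
//! fn hard_side_grad(lo: f32, hi: f32) -> f32 {
//!     if hi > lo { 1.0 } else { 0.0 }
//! }
//!
//! /// Softplus-smoothed side length with temperature `beta > 0`.
//! fn soft_side(lo: f32, hi: f32, beta: f32) -> f32 {
//!     let x = (hi - lo) / beta;
//!     // Numerically stable softplus: max(x, 0) + ln(1 + exp(-|x|)).
//!     beta * (x.max(0.0) + (-x.abs()).exp().ln_1p())
//! }
//!
//! /// Derivative of `soft_side` with respect to `hi`: the logistic sigmoid
//! /// of `(hi - lo) / beta`, which is strictly positive for finite inputs.
//! fn soft_side_grad(lo: f32, hi: f32, beta: f32) -> f32 {
//!     let x = (hi - lo) / beta;
//!     1.0 / (1.0 + (-x).exp())
//! }
//! ```
//!
//! For an inverted side (`lo = 2.0`, `hi = 1.0`) the hard version returns zero
//! length and zero gradient, while the smoothed version still yields a small
//! positive length and a positive gradient that can pull the box back open.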
//!
//! **Containment probability** measures entailment (P(B ⊆ A)), while **overlap
//! probability** measures relatedness without strict hierarchy. These two scores
//! are the primary outputs of box embedding models.
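//!
//! For hard boxes, both scores can be approximated by volume ratios. The sketch
//! below is an illustration under assumptions: the `Corners` alias and function
//! names are hypothetical, containment is taken as `Vol(A ∩ B) / Vol(B)`, and
//! overlap is taken as intersection-over-union, which is one common symmetric
//! choice and may differ from what the crate's backends compute.
//!
//! ```rust
//! /// A box as (lower corners, upper corners); both slices must have equal length.
//! type Corners<'a> = (&'a [f32], &'a [f32]);
//!
//! /// Volume of the intersection of two boxes (zero if they are disjoint).
//! fn intersection_volume(a: Corners<'_>, b: Corners<'_>) -> f32 {
//!     (0..a.0.len())
//!         .map(|i| (a.1[i].min(b.1[i]) - a.0[i].max(b.0[i])).max(0.0))
//!         .product()
//! }
//!
//! /// Volume of a single box, computed as its intersection with itself.
//! fn volume(x: Corners<'_>) -> f32 {
//!     intersection_volume(x, x)
//! }
//!
//! /// Fraction of B that lies inside A: an estimate of P(B ⊆ A). Asymmetric.
//! fn containment_ratio(a: Corners<'_>, b: Corners<'_>) -> f32 {
//!     let vol_b = volume(b);
//!     if vol_b == 0.0 { 0.0 } else { intersection_volume(a, b) / vol_b }
//! }
//!
//! /// Intersection-over-union: a symmetric relatedness score.
//! fn overlap_ratio(a: Corners<'_>, b: Corners<'_>) -> f32 {
//!     let inter = intersection_volume(a, b);
//!     let union = volume(a) + volume(b) - inter;
//!     if union == 0.0 { 0.0 } else { inter / union }
//! }
//! ```
//!
//! The asymmetry is the point: a small box nested in a large one scores 1.0 for
//! "small is contained in large" but much less in the other direction, while
//! the overlap score is the same whichever way the arguments are ordered.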
//!
//! # Module Organization
//!
//! ## Core traits and geometry
//!
//! - [`box_trait`] -- the [`Box`] trait: containment, overlap, volume
//! - [`octagon`] -- octagon error types (implementations in [`ndarray_backend`])
//! - [`cone`] -- cone error types (implementations in [`ndarray_backend`])
//! - `hyperbolic` -- Poincare ball embeddings for tree-like hierarchies (feature-gated)
//! - `sheaf` -- sheaf neural networks for transitivity/consistency on graphs
//! - [`gaussian`] -- diagonal Gaussian box embeddings (KL, Bhattacharyya)
//!
//! ## Representations and scoring
//!
//! - [`distance`] -- Query2Box distance scoring
//! - [`fuzzy`] -- t-norms, t-conorms, and negation for fuzzy query answering (FuzzQE)
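//!
//! For readers new to fuzzy operators, the three textbook t-norm families and
//! their dual t-conorms are sketched below. `TNormKind`, `conj`, `disj`, and
//! `negate` are hypothetical names for this illustration only; see the
//! [`fuzzy`] module for the crate's actual types.
//!
//! ```rust
//! /// Hypothetical selector for three standard t-norm families.
//! #[derive(Debug, Clone, Copy, PartialEq)]
//! enum TNormKind {
//!     Godel,
//!     Product,
//!     Lukasiewicz,
//! }
//!
//! impl TNormKind {
//!     /// Fuzzy conjunction (t-norm) of two truth values in [0, 1].
//!     fn conj(self, a: f32, b: f32) -> f32 {
//!         match self {
//!             TNormKind::Godel => a.min(b),
//!             TNormKind::Product => a * b,
//!             TNormKind::Lukasiewicz => (a + b - 1.0).max(0.0),
//!         }
//!     }
//!
//!     /// Fuzzy disjunction (the dual t-conorm) of two truth values in [0, 1].
//!     fn disj(self, a: f32, b: f32) -> f32 {
//!         match self {
//!             TNormKind::Godel => a.max(b),
//!             TNormKind::Product => a + b - a * b,
//!             TNormKind::Lukasiewicz => (a + b).min(1.0),
//!         }
//!     }
//! }
//!
//! /// Standard fuzzy negation.
//! fn negate(a: f32) -> f32 {
//!     1.0 - a
//! }
//! ```
//!
//! Each conorm is the De Morgan dual of its t-norm: `disj(a, b)` equals
//! `negate(conj(negate(a), negate(b)))`.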
//!
//! ## Ontology and taxonomy
//!
//! - [`el`] -- EL++ ontology embedding primitives (Box2EL / TransBox)
//! - [`taxonomy`] -- TaxoBell-format taxonomy dataset loader
//! - [`taxobell`] -- TaxoBell combined training loss
//!
//! ## Training and evaluation
//!
//! - [`dataset`] -- load WN18RR, FB15k-237, YAGO3-10, and similar KG datasets
//! - [`trainable`] -- [`trainable::TrainableBox`] and [`trainable::TrainableCone`] with learnable parameters
//! - [`trainer`] -- negative sampling, loss computation, link prediction evaluation.
//!   Includes `CandleBoxTrainer` for GPU training with AdamW, cosine learning-rate
//!   scheduling, self-adversarial negative sampling, and filtered evaluation.
//! - [`metrics`] -- rank-based metrics (MRR, Hits@k, Mean Rank)
//! - [`optimizer`] -- AMSGrad state management
//! - [`utils`] -- numerical stability (log-space volume, stable sigmoid, Gumbel operations)
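//!
//! The rank-based metrics named above have short standard definitions. The
//! sketch below computes them from a slice of 1-based ranks of the correct
//! entity; the function names are assumptions for illustration and are not
//! the signatures exposed by [`metrics`].
//!
//! ```rust
//! /// Mean reciprocal rank over 1-based ranks; returns 0.0 for empty input.
//! fn mean_reciprocal_rank(ranks: &[usize]) -> f64 {
//!     if ranks.is_empty() {
//!         return 0.0;
//!     }
//!     ranks.iter().map(|&r| 1.0 / r as f64).sum::<f64>() / ranks.len() as f64
//! }
//!
//! /// Fraction of queries whose correct answer is ranked within the top `k`.
//! fn hits_at_k(ranks: &[usize], k: usize) -> f64 {
//!     if ranks.is_empty() {
//!         return 0.0;
//!     }
//!     ranks.iter().filter(|&&r| r <= k).count() as f64 / ranks.len() as f64
//! }
//!
//! /// Arithmetic mean of the ranks; returns 0.0 for empty input.
//! fn mean_rank(ranks: &[usize]) -> f64 {
//!     if ranks.is_empty() {
//!         return 0.0;
//!     }
//!     ranks.iter().sum::<usize>() as f64 / ranks.len() as f64
//! }
//! ```
//!
//! For ranks `[1, 2, 4]`, MRR is `(1 + 1/2 + 1/4) / 3`, Hits@1 is `1/3`,
//! Hits@3 is `2/3`, and Mean Rank is `7/3`.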
//!
//! ## Backends (feature-gated)
//!
//! - [`ndarray_backend`] -- `NdarrayBox`, `NdarrayGumbelBox`, distance functions
//!   (feature = `ndarray-backend`, **on by default**)
//! - `candle_backend` -- `CandleBox`, `CandleGumbelBox` with GPU support
//!   (feature = `candle-backend`)
//!
//! # Feature Flags
//!
//! | Feature | Default | Provides |
//! |---------|---------|----------|
//! | `ndarray-backend` | yes | [`ndarray_backend`] module (also enables `rand`) |
//! | `candle-backend` | no | `candle_backend` module (GPU via candle) |
//! | `cuda` | no | CUDA GPU support (implies `candle-backend`) |
//! | `rand` | yes (via `ndarray-backend`) | Negative sampling utilities in [`trainer`] |
//! | `kge` | yes | [`dataset`], [`metrics`], and `lattix_bridge` modules (KGE dataset loading, metrics) |
//! | `hyperbolic` | no | `hyperbolic` module (Poincare ball via `hyperball` + `skel`) |
//! | `petgraph` | no | `petgraph_adapter` module (convert petgraph graphs to datasets) |
//! | `sheaf` | no | `sheaf` module (sheaf diffusion primitives) |
//! | `rankops` | no | Re-exports `rankops` (rank fusion, nDCG, MAP) |
//! | `spherical` | no | `spherical` module (unit-sphere embeddings, experimental) |
//! | `density` | no | `density` + `density_el` modules (density matrix embeddings, experimental) |
//!
//! # Example
//!
//! ```rust,ignore
//! // Rename to avoid shadowing std::boxed::Box
//! use subsume::Box as BoxRegion;
//!
//! // Framework-agnostic: works with NdarrayBox, CandleBox, or your own impl
//! fn compute_entailment<B: BoxRegion>(
//!     premise: &B,
//!     hypothesis: &B,
//! ) -> Result<f32, subsume::BoxError> {
//!     premise.containment_prob(hypothesis)
//! }
//! ```
//!
//! # References
//!
//! - Vilnis et al. (2018), "Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures"
//! - Abboud et al. (2020), "BoxE: A Box Embedding Model for Knowledge Base Completion"
//! - Li et al. (2019), "Smoothing the Geometry of Probabilistic Box Embeddings" (ICLR 2019)
//! - Dasgupta et al. (2020), "Improving Local Identifiability in Probabilistic Box Embeddings"
//! - Chen et al. (2021), "Uncertainty-Aware Knowledge Graph Embeddings" (UKGE)
//! - Lee et al. (2022), "Box Embeddings for Event-Event Relation Extraction" (BERE)
//! - Cao et al. (2024), "KG Embedding: A Survey from the Perspective of Representation
//!   Spaces" (ACM Computing Surveys) -- positions box/cone/octagon embeddings within
//!   the broader KGE taxonomy (Euclidean, hyperbolic, complex, geometric)
//! - Bourgaux et al. (2024), "Knowledge Base Embeddings: Semantics and Theoretical Properties" (KR)
//! - Lacerda et al. (2024), "Strong Faithfulness for ELH Ontology Embeddings" (TGDK)
//! - Yang & Chen (2025), "RegD: Achieving Hyperbolic-Like Expressiveness with Arbitrary
//!   Euclidean Regions" -- source of the depth/boundary dissimilarity metrics in [`distance`]
// ---------------------------------------------------------------------------
// Core traits and geometry
// ---------------------------------------------------------------------------
/// Core [`Box`] trait: containment probability, overlap, volume, and intersection.
/// Cone embeddings: angular containment on the unit sphere, with negation support.
/// Octagon embeddings: axis-aligned polytopes with diagonal constraints (IJCAI 2024).
/// Knowledge graph dataset loading (WN18RR, FB15k-237, YAGO3-10, and similar formats).
/// Distance metrics: Query2Box distance scoring.
/// Poincare ball embeddings for tree-like hierarchical structures.
///
/// Requires the `hyperbolic` feature (uses `ndarray::ArrayView1` for
/// interoperability with the `hyperball` and `skel` crates).
/// AMSGrad optimizer state and learning rate utilities.
/// Sheaf neural networks: algebraic consistency enforcement on graphs.
/// Learnable box and cone representations with gradient-compatible parameters.
/// Training loop utilities: negative sampling, loss kernels, link prediction evaluation.
/// Rank-based evaluation metrics (MRR, Hits@k, Mean Rank).
/// Re-export rankops for rank fusion, IR evaluation (nDCG, MAP), and reranking.
pub use rankops;
/// Numerical stability: log-space volume, stable sigmoid, Gumbel operations.
/// Ball embeddings for subsumption via Euclidean containment.
///
/// Concepts are solid balls `(center, radius)` in R^d. Containment:
/// `||c_A - c_B|| + r_A <= r_B`. Supports SpherE-style relation transforms
/// (translate + scale) and RegD depth/boundary dissimilarity scoring.
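///
/// The containment inequality can be checked directly. A minimal sketch,
/// assuming Euclidean distance and equal-length center slices; the function
/// name `ball_contained` is hypothetical and not part of this module's API.
///
/// ```rust
/// /// True when ball A (center_a, radius_a) lies inside ball B (center_b, radius_b),
/// /// i.e. when `||c_A - c_B|| + r_A <= r_B`.
/// fn ball_contained(center_a: &[f64], radius_a: f64, center_b: &[f64], radius_b: f64) -> bool {
///     // Euclidean distance between the two centers.
///     let dist = center_a
///         .iter()
///         .zip(center_b)
///         .map(|(&x, &y)| (x - y).powi(2))
///         .sum::<f64>()
///         .sqrt();
///     dist + radius_a <= radius_b
/// }
/// ```
///
/// The farthest point of A from B's center sits at distance
/// `||c_A - c_B|| + r_A`, so the test asks whether that point is still in B.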
///
/// References:
/// - SpherE (SIGIR 2024, arXiv:2404.19130): ball embeddings for set retrieval
/// - RegD (Jan 2025, arXiv:2501.17518): balls isometric to hyperbolic space
/// Spherical cap embeddings for subsumption on the unit sphere.
///
/// Concepts are regions on S^{d-1} defined by a center (unit vector)
/// and an angular radius. Containment: `angle(c_A, c_B) + theta_A <= theta_B`.
/// This is the spherical analog of ball containment in Euclidean space.
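///
/// A minimal sketch of this test, assuming both centers are already unit
/// vectors of equal length and angles are in radians; `cap_contained` is a
/// hypothetical name used only for illustration.
///
/// ```rust
/// /// True when cap A (center_a, theta_a) lies inside cap B (center_b, theta_b),
/// /// i.e. when `angle(c_A, c_B) + theta_A <= theta_B`.
/// fn cap_contained(center_a: &[f64], theta_a: f64, center_b: &[f64], theta_b: f64) -> bool {
///     let dot: f64 = center_a.iter().zip(center_b).map(|(&x, &y)| x * y).sum();
///     // Clamp guards against rounding pushing the cosine just outside [-1, 1].
///     let angle = dot.clamp(-1.0, 1.0).acos();
///     angle + theta_a <= theta_b
/// }
/// ```
///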
/// Subspace embeddings for logical operations (conjunction, disjunction, negation).
///
/// Concepts are linear subspaces of R^d, represented by orthonormal bases.
/// Containment via projection, intersection via common subspace, negation
/// via orthogonal complement.
///
/// Reference: Moreira et al. (2025), arXiv:2508.16687
/// Full-covariance Gaussian embeddings (rotated ellipsoids).
///
/// Concepts are multivariate Gaussians with full covariance, parameterized
/// via Cholesky decomposition. Supports KL divergence (asymmetric containment)
/// and Bhattacharyya distance (symmetric overlap).
/// TransBox: EL++-closed box embeddings with translational composition.
///
/// Concepts and roles as boxes with additive composition. Handles many-to-many
/// relations and complex role compositions while preserving EL++ semantics.
///
/// Reference: Yang, Chen, Sattler (2024), arXiv:2410.14571
/// Annular sector embeddings for knowledge graph completion.
///
/// Concepts as ring-shaped regions in the complex plane. Combines rotation-based
/// relations with region uncertainty, handling 1-N/N-1/N-N relations.
///
/// Reference: Zhu & Zeng (2025), arXiv:2506.11099
/// Diagonal Gaussian box embeddings for taxonomy expansion (TaxoBell).
/// Density matrix region embeddings (pure-state quantum embeddings).
///
/// Represents concepts as rank-1 density matrices in a complex Hilbert space.
/// Subsumption via Loewner order, distance via fidelity / Bures metric.
///
/// Reference: Garg et al. (2019), "Quantum Embedding of Knowledge for Reasoning" (NeurIPS).
/// Spherical knowledge graph embeddings on the unit sphere.
///
/// Entities are points on `S^{d-1}` (unit vectors). Relations are axis-angle
/// rotations. Scoring uses geodesic (great-circle) distance.
/// Density matrix EL++ training losses: NF1-NF4 and disjointness losses
/// using fidelity-based scoring on pure-state density matrices.
/// EL++ ontology embedding primitives (Box2EL / TransBox).
/// EL++ normalized axiom dataset loader (GALEN, GO, Anatomy formats).
/// EL++ ontology embedding primitives for cones (angular containment).
/// Composable cone query operators for first-order logical query answering.
/// EL++ ontology embedding training: axiom parsing, training loop, evaluation.
/// Fuzzy set-theoretic operators: t-norms, t-conorms, and negation (FuzzQE).
/// Taxonomy dataset loading for the TaxoBell format (`.terms` / `.taxo` / `dic.json`).
/// TaxoBell combined training loss for taxonomy expansion.
/// TaxoBell MLP encoder and training loop with candle autograd.
///
/// Requires the `candle-backend` feature.
// ---------------------------------------------------------------------------
// Re-exports: primary traits and types
// ---------------------------------------------------------------------------
/// The core box embedding trait. Start here.
///
/// This trait shares its name with [`std::boxed::Box`]. To avoid shadowing, use one of:
/// - `use subsume::Box as BoxRegion;` (recommended)
/// - Qualify calls as `subsume::Box` or `<T as subsume::Box>::method()`
pub use ;
/// Convenience alias for the [`Box`] trait that avoids shadowing [`std::boxed::Box`].
///
/// `use subsume::BoxRegion;` is equivalent to `use subsume::Box as BoxRegion;`.
pub use Box as BoxRegion;
// Re-exports: geometry errors
pub use ConeError;
pub use ;
pub use OctagonError;
pub use SheafError;
// Re-exports: data loading
pub use ;
// Re-exports: training (always available)
pub use ;
// Re-exports: training (requires kge feature)
pub use ;
// Re-export: CandleBoxTrainer (GPU training)
pub use CandleBoxTrainer;
// Re-export: ndarray (public dependency -- appears in NdarrayBox/NdarrayGumbelBox/NdarrayCone API)
pub use ndarray;
// Re-exports: evaluation metrics
pub use ;
// Re-exports: Ball embeddings
pub use ;
// Re-exports: Spherical cap embeddings
pub use ;
// Re-exports: Subspace embeddings
pub use ;
// Re-exports: Ellipsoid (full-covariance Gaussian) embeddings
pub use Ellipsoid;
// Re-exports: Annular sector embeddings
pub use ;
// Re-exports: TransBox embeddings
pub use ;
// Re-exports: Gaussian boxes
pub use GaussianBox;
// Re-exports: Density matrix embeddings
pub use DensityRegion;
// Re-exports: Spherical embeddings
pub use ;
// Re-exports: Density matrix EL++ training
pub use ;
// Re-exports: EL++ training
pub use ;
// Re-exports: cone EL++ primitives
pub use ;
// Re-exports: cone query operators
pub use ;
// ---------------------------------------------------------------------------
// Feature-gated backends
// ---------------------------------------------------------------------------
/// Ndarray backend: `NdarrayBox`, `NdarrayGumbelBox`, optimizer, and learning rate scheduler.
///
/// This is the default backend. Enable with `features = ["ndarray-backend"]`.
/// Candle backend: `CandleBox`, `CandleGumbelBox` with GPU acceleration.
///
/// Provides box and Gumbel box operations. Cone, octagon, and Gaussian
/// geometries are available through the ndarray backend only.
///
/// Enable with `features = ["candle-backend"]`.
/// Adapter for constructing datasets from [`petgraph`] graphs.
///
/// Requires the `petgraph` feature.
/// Bridge from [`lattix`] knowledge graphs to subsume datasets.
///
/// Converts lattix KGs (loaded from N-Triples, Turtle, CSV, JSON-LD)
/// into subsume datasets for training.