//! Knowledge graph substrate: core types, algorithms, and formats.
//!
//! This crate provides foundational types for working with knowledge graphs:
//!
//! - [`Triple`] - A (subject, predicate, object) triple
//! - [`Entity`] - A node in the knowledge graph
//! - [`Relation`] - An edge type in the knowledge graph
//! - [`KnowledgeGraph`] - A homogeneous graph structure built from triples
//! - [`HeteroGraph`] - A heterogeneous graph with typed nodes and edges
//!
//! # Historical Context: From Databases to Graphs
//!
//! | Era | Representation | Example | Limitation |
//! |-----|----------------|---------|------------|
//! | 1970s | Relational | SQL tables | Fixed schema, join-heavy |
//! | 2001 | RDF | Semantic Web | XML verbosity, query complexity |
//! | 2012 | Property Graphs | Neo4j | No standard query language |
//! | 2019 | GNN-ready graphs | PyG, DGL | Framework-specific formats |
//!
//! Knowledge graphs became essential when search engines realized that
//! "things, not strings" (Google, 2012) captures real-world semantics
//! that keyword matching misses.
//!
//! # The Triple: Atomic Unit of Knowledge
//!
//! All knowledge graph formalisms reduce to the **triple**:
//!
//! ```text
//! (subject, predicate, object)
//! (Albert Einstein, born_in, Ulm)
//! (Ulm, located_in, Germany)
//! ```
//!
//! This simple structure enables:
//! - **Inference**: Chain triples, e.g. from (A, r, B) and (B, r, C) deduce (A, r, C) when `r` is transitive
//! - **Composition**: Merge graphs by merging shared entities
//! - **Embedding**: Represent triples as vectors for ML
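//!
//! The inference bullet can be made concrete with a minimal, std-only sketch.
//! `SketchTriple` and `infer_transitive_once` are hypothetical names used for
//! illustration and are not part of this crate's API. The sketch assumes the
//! chosen predicate is transitive, which does not hold for every relation.
//!
//! ```rust
//! use std::collections::HashSet;
//!
//! // Hypothetical stand-in for a (subject, predicate, object) triple.
//! type SketchTriple = (String, String, String);
//!
//! // One round of composition for a predicate assumed to be transitive:
//! // from (a, p, b) and (b, p, c), derive (a, p, c) if it is not already known.
//! fn infer_transitive_once(
//!     triples: &HashSet<SketchTriple>,
//!     predicate: &str,
//! ) -> HashSet<SketchTriple> {
//!     let mut derived = HashSet::new();
//!     for (a, p1, b) in triples {
//!         if p1.as_str() != predicate {
//!             continue;
//!         }
//!         for (b2, p2, c) in triples {
//!             if p2.as_str() == predicate && b2 == b {
//!                 let t = (a.clone(), predicate.to_string(), c.clone());
//!                 if !triples.contains(&t) {
//!                     derived.insert(t);
//!                 }
//!             }
//!         }
//!     }
//!     derived
//! }
//! ```
//!
//! This performs a single round; repeating it until no new triples appear
//! would give the transitive closure for that predicate.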
//!
//! # Beyond Triples: N-ary Relations and Hypergraphs
//!
//! Triples are limited - they can only express binary relations. Real-world
//! facts often involve more than two entities:
//!
//! ```text
//! (Einstein, won, Nobel Prize, Physics, 1921) -- 4 entities!
//! (Alice, purchased, Book, $20, Amazon, 2024-01-15) -- 5 entities!
//! ```
//!
//! Three approaches handle this:
//!
//! | Approach | Structure | Example | Trade-off |
//! |----------|-----------|---------|-----------|
//! | **Reification** | Multiple triples around an intermediate node | `Award_1` event node | Artificial nodes, fact no longer atomic |
//! | **Qualifiers** | Triple + key-value pairs | Wikidata model | Complex querying |
//! | **Hyperedges** | N-ary relation directly | Native structure | New embeddings needed |
//!
//! ## Reification (Workaround)
//!
//! Convert n-ary facts to binary by introducing intermediate nodes:
//!
//! ```text
//! Original: (Einstein, won, Nobel, Physics, 1921)
//! Reified:  (Award_1, recipient, Einstein)
//!           (Award_1, prize, Nobel)
//!           (Award_1, field, Physics)
//!           (Award_1, year, 1921)
//! ```
//!
//! **Problem**: Loses the atomic nature of the fact. Embedding models struggle
//! because Award_1 is artificial - it has no semantic meaning.
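//!
//! As a rough illustration of the conversion, the following std-only sketch
//! emits one triple per role binding. The names `SketchTriple` and `reify` are
//! hypothetical and are not part of this crate's API; the caller is assumed to
//! supply a fresh identifier for the intermediate node.
//!
//! ```rust
//! // Hypothetical stand-in for a (subject, predicate, object) triple.
//! type SketchTriple = (String, String, String);
//!
//! // Reify an n-ary fact: the intermediate node becomes the subject, each role
//! // becomes a predicate, and each bound entity becomes an object.
//! fn reify(node_id: &str, bindings: &[(&str, &str)]) -> Vec<SketchTriple> {
//!     bindings
//!         .iter()
//!         .map(|(role, value)| {
//!             (node_id.to_string(), role.to_string(), value.to_string())
//!         })
//!         .collect()
//! }
//! ```
//!
//! Calling `reify("Award_1", ...)` with the four bindings above yields the four
//! reified triples, all sharing the artificial subject `Award_1`.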
//!
//! ## Hyper-relational KGs (Wikidata Style)
//!
//! Attach qualifiers to triples:
//!
//! ```text
//! (Einstein, won, Nobel Prize)
//!   qualifiers: {field: Physics, year: 1921}
//! ```
//!
//! Implemented in [`hyper::HyperTriple`]. Embeddings: StarE, HINGE.
//!
//! ## Knowledge Hypergraphs (Native N-ary)
//!
//! Represent facts as hyperedges connecting multiple entities with roles:
//!
//! ```text
//! HyperEdge {
//!     relation: "award_ceremony",
//!     bindings: {
//!         recipient: Einstein,
//!         prize: Nobel,
//!         field: Physics,
//!         year: 1921
//!     }
//! }
//! ```
//!
//! Implemented in [`hyper::HyperEdge`]. Embeddings: HSimplE, HypE, HyCubE.
//!
//! **Key insight**: Position-aware or role-aware encoding is essential.
//! The role "recipient" carries different semantics than "year".
//!
//! See [`hyper`] module for hypergraph types.
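//!
//! To show why roles matter, here is a minimal std-only sketch of a role-keyed
//! hyperedge. `SketchHyperEdge` is a hypothetical type for illustration only;
//! it is not the definition of [`hyper::HyperEdge`], whose fields and methods
//! may differ.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! // Hypothetical n-ary fact: a relation name plus role -> entity bindings.
//! #[derive(Debug, Clone, PartialEq, Eq)]
//! struct SketchHyperEdge {
//!     relation: String,
//!     bindings: BTreeMap<String, String>,
//! }
//!
//! impl SketchHyperEdge {
//!     fn new(relation: &str, bindings: &[(&str, &str)]) -> Self {
//!         Self {
//!             relation: relation.to_string(),
//!             bindings: bindings
//!                 .iter()
//!                 .map(|(role, entity)| (role.to_string(), entity.to_string()))
//!                 .collect(),
//!         }
//!     }
//!
//!     // Look up the entity bound to a given role, if any.
//!     fn entity_for(&self, role: &str) -> Option<&str> {
//!         self.bindings.get(role).map(String::as_str)
//!     }
//! }
//! ```
//!
//! Two edges that mention the same entities but swap their roles compare as
//! unequal here, which is the distinction a role-aware encoder has to preserve.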
//!
//! # Homogeneous vs Heterogeneous Graphs
//!
//! | Type | Nodes | Edges | Use Case |
//! |------|-------|-------|----------|
//! | Homogeneous | One type | One type | Citation networks, social graphs |
//! | Heterogeneous | Multiple types | Multiple types | Knowledge graphs, biomedical |
//!
//! [`HeteroGraph`] supports typed nodes and edges, essential for:
//! - **RGCN**: Relation-specific weight matrices
//! - **HGT**: Heterogeneous graph transformers
//! - **Link prediction**: Typed edge prediction
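//!
//! One common way to store typed edges, sketched below with std only, is to
//! key each edge list by a (source type, relation, target type) tuple so that
//! relation-specific models can iterate over one edge type at a time.
//! `SketchHeteroGraph` and its methods are hypothetical and do not describe
//! the actual layout or API of [`HeteroGraph`].
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Hypothetical edge-type key: (source node type, relation, target node type).
//! type EdgeType = (String, String, String);
//!
//! #[derive(Default)]
//! struct SketchHeteroGraph {
//!     // Each edge type keeps its own list of (source index, target index) pairs.
//!     edges: HashMap<EdgeType, Vec<(usize, usize)>>,
//! }
//!
//! impl SketchHeteroGraph {
//!     fn add_edge(&mut self, src_type: &str, relation: &str, dst_type: &str, src: usize, dst: usize) {
//!         let key = (src_type.to_string(), relation.to_string(), dst_type.to_string());
//!         self.edges.entry(key).or_default().push((src, dst));
//!     }
//!
//!     // All edges of one type, or an empty slice if that type is unknown.
//!     fn edges_of(&self, src_type: &str, relation: &str, dst_type: &str) -> &[(usize, usize)] {
//!         let key = (src_type.to_string(), relation.to_string(), dst_type.to_string());
//!         self.edges.get(&key).map(Vec::as_slice).unwrap_or(&[])
//!     }
//! }
//! ```
//!
//! Node indices are assumed to be local to each node type, so `(0, 1)` under
//! `("author", "writes", "paper")` refers to author 0 and paper 1.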
//!
//! # Serialization Formats (requires `formats` feature)
//!
//! Supports modern RDF 1.2 specifications (2024):
//! - N-Triples - Line-based, simple (fastest parsing)
//! - N-Quads - N-Triples with named graphs
//! - Turtle - Human-readable (best for debugging)
//! - JSON-LD - Linked data (web integration)
//!
//! # Algorithms (requires `algo` feature)
//!
//! ## Centrality ([`algo::centrality`])
//!
//! | Algorithm | Question | Module |
//! |-----------|----------|--------|
//! | Degree | How many connections? | [`algo::centrality::degree_centrality`] |
//! | Betweenness | Bridge between communities? | [`algo::centrality::betweenness_centrality`] |
//! | Closeness | How close to everyone? | [`algo::centrality::closeness_centrality`] |
//! | Eigenvector | Connected to important nodes? | [`algo::centrality::eigenvector_centrality`] |
//! | Katz | Reachable via damped paths? | [`algo::centrality::katz_centrality`] |
//! | PageRank | Random walk equilibrium? | [`algo::pagerank::pagerank`] |
//! | HITS | Hub or authority? | [`algo::centrality::hits`] |
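//!
//! As a reference point for the simplest row of the table, the sketch below
//! computes normalized degree centrality, `degree / (n - 1)`, for an undirected
//! edge list. It is a std-only illustration under stated assumptions (node
//! indices in `0..n`, no self-loops or duplicate edges); the function name is
//! hypothetical, and [`algo::centrality::degree_centrality`] may use a
//! different signature or normalization.
//!
//! ```rust
//! // Normalized degree centrality over nodes 0..n of an undirected graph.
//! fn degree_centrality_sketch(n: usize, edges: &[(usize, usize)]) -> Vec<f64> {
//!     let mut degree = vec![0usize; n];
//!     for &(u, v) in edges {
//!         // Each undirected edge contributes to both endpoints.
//!         degree[u] += 1;
//!         degree[v] += 1;
//!     }
//!     // Avoid dividing by zero for graphs with fewer than two nodes.
//!     let denom = if n > 1 { (n - 1) as f64 } else { 1.0 };
//!     degree.iter().map(|&d| d as f64 / denom).collect()
//! }
//! ```
//!
//! In a star graph the hub touches every other node and therefore scores 1.0,
//! while each leaf scores `1 / (n - 1)`.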
//!
//! ## Other Algorithms
//!
//! - [`algo::random_walk`] - Node2Vec style random walks (biased BFS/DFS)
//! - [`algo::components`] - Connected components (graph structure)
//! - [`algo::sampling`] - Neighbor sampling for mini-batch GNN training
//!
//! # When to Use Which Structure
//!
//! | Task | Structure | Why |
//! |------|-----------|-----|
//! | Node classification | KnowledgeGraph | Homogeneous GCN/GAT |
//! | Link prediction | HeteroGraph | Relation types matter |
//! | Knowledge completion | HeteroGraph + embeddings | TransE, RotatE, BoxE |
//! | Graph classification | KnowledgeGraph | Global pooling over nodes |
//!
//! # Example
//!
//! ```rust
//! use lattix::{Triple, KnowledgeGraph};
//!
//! let mut kg = KnowledgeGraph::new();
//!
//! // Add triples
//! kg.add_triple(Triple::new("Apple", "founded_by", "Steve Jobs"));
//! kg.add_triple(Triple::new("Apple", "headquartered_in", "Cupertino"));
//! kg.add_triple(Triple::new("Steve Jobs", "born_in", "San Francisco"));
//!
//! // Query
//! let apple_relations = kg.relations_from("Apple");
//! assert_eq!(apple_relations.len(), 2);
//! ```
/// Benchmark data and evaluation for knowledge graph embedding experiments.
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use TripleQuery;
pub use ;
pub use Triple;
// Re-export petgraph for advanced graph operations.
//
// Ideally users would depend on petgraph directly, but removing this re-export
// is a breaking change because public API methods expose petgraph types:
// - KnowledgeGraph::as_petgraph() -> &DiGraph<Entity, Relation>
// - KnowledgeGraph::get_node_index() -> Option<petgraph::graph::NodeIndex>
// Removing the re-export requires wrapping these return types first.
// A petgraph major bump is also a lattix breaking change.
pub use petgraph;
/// Load all N-Triples files from a directory (e.g., anno export directory).
///
/// Merges all .nt files into a single KnowledgeGraph.
///
/// # Example
///
/// ```rust,ignore
/// // After: anno export -i docs/ -o kg/ --format ntriples
/// let kg = lattix::load_anno_exports("kg/")?;
/// ```