§OxiRS GraphRAG
GraphRAG (Graph Retrieval-Augmented Generation) is a production-ready Rust library that combines knowledge-graph topology traversal with vector similarity search to deliver context-rich answers for LLM pipelines — without any network dependencies at query time.
It is the JVM-free, pure-Rust counterpart of Microsoft’s GraphRAG and LangChain’s knowledge-graph QA stack, integrated directly with the OxiRS semantic-web engine.
§Data-flow overview
      Natural-Language Query
                 │
                 ▼
        ┌───────────────────┐
        │  Query Embedding  │  (oxirs-embed / Node2Vec / TransE)
        └────────┬──────────┘
                 │
          ┌──────┴──────┐
          │             │
          ▼             ▼
       Vector        Keyword
         KNN          BM25
       Search        Search
          │             │
          └──────┬──────┘
                 │
                 ▼
         ┌───────────────┐
         │  RRF Fusion   │  Reciprocal Rank Fusion → Seed Entities
         └───────┬───────┘
                 │
                 ▼
    ┌────────────────────────┐
    │ SPARQL N-hop Expansion │  Graph traversal (up to 500 triples)
    └────────────┬───────────┘
                 │
                 ▼
    ┌────────────────────────┐
    │  Community Detection   │  Louvain / Leiden clustering
    └────────────┬───────────┘
                 │
                 ▼
    ┌────────────────────────┐
    │    Context Building    │  Subgraph → natural-language context
    └────────────┬───────────┘
                 │
                 ▼
    ┌────────────────────────┐
    │     LLM Generation     │  Answer + citations
    └────────────────────────┘
§Key modules
| Module | Purpose |
|---|---|
| triple_extractor | Rule-based NLP → RDF triple extraction |
| community_detector | Greedy label-propagation community detection |
| path_finder | BFS / DFS shortest-path retrieval in KGs |
| graph_embedder | Node2Vec-style random-walk structural embeddings |
| summarizer | Cluster-based subgraph summarization for LLM context |
| path_ranker | Predicate-weighted path ranking |
| context_builder | N-hop subgraph extraction and truncation |
| knowledge_fusion | Multi-source KG fusion with provenance |
| graph_summarization | PageRank-style community summary generation |
| entity_linking | Entity linking and disambiguation |
| explainability | Attention weights, path explanation, provenance |
| feedback | Session-scoped user-feedback weight adaptation |
| graph | Core community detection and graph traversal primitives |
| retrieval | Hybrid vector + keyword retrieval with RRF fusion |
| generation | Prompt templates and LLM context building |
| temporal | Temporal knowledge graph retrieval |
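The RRF fusion step named in the data-flow overview and in the retrieval module can be illustrated with a small standalone sketch. This is not the crate's implementation: the function name `rrf_fuse`, its signature, and the smoothing constant `k = 60` (a commonly used default for Reciprocal Rank Fusion) are assumptions made for illustration, and only the standard library is used.

```rust
use std::collections::HashMap;

/// Illustrative Reciprocal Rank Fusion (hypothetical helper, not an oxirs_graphrag API).
/// Each entity scores the sum of 1 / (k + rank) over every ranking it appears in.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes 1 / (k + 1).
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf_fuse(&[vector_hits, keyword_hits], 60.0)` with one ranked list per retriever returns the entities ordered by fused score; the top entries would then play the role of seed entities for graph expansion. An entity that ranks well in both lists outranks one that is first in only one of them, which is the property that makes RRF useful when the two retrievers produce scores on incomparable scales.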
§Quickstart — standalone pipeline (no network, no LLM)
The example below runs an end-to-end mini-pipeline entirely in memory on a synthetic 8-node knowledge graph: extract triples from text, detect communities, find paths, and summarize the result.
use oxirs_graphrag::triple_extractor::{ExtractionConfig, TripleExtractor};
use oxirs_graphrag::community_detector::{CommunityGraph, CommunityDetector};
use oxirs_graphrag::path_finder::{KnowledgeEdge, PathFinder, PathFinderConfig};
use oxirs_graphrag::summarizer::{KgEdge, KgNode, KgSubgraph, SubgraphSummarizer};

// ── Step 1: Extract triples from natural language ─────────────────────────
let corpus = [
    "Alice is a data scientist.",
    "Bob works at ACME.",
    "Carol is a software engineer.",
    "Dave is part of the AI team.",
    "ACME has a research division.",
];
let extractor = TripleExtractor::with_defaults(ExtractionConfig::default());
let all_triples: Vec<_> = corpus
    .iter()
    .flat_map(|sentence| extractor.extract(sentence))
    .collect();
assert!(!all_triples.is_empty(), "at least one triple extracted");

// ── Step 2: Build community graph and detect clusters ─────────────────────
let mut cg = CommunityGraph::new();
// 8 synthetic nodes
for (id, label) in [
    (1u64, "Alice"), (2, "Bob"), (3, "Carol"), (4, "Dave"),
    (5, "ACME"), (6, "AI-Team"), (7, "Research"), (8, "Berlin"),
] {
    cg.add_node(id, label);
}
for (a, b) in [(1,5),(2,5),(3,6),(4,6),(5,7),(6,7),(7,8),(1,2)] {
    cg.add_edge(a, b, 1.0);
}
let detector = CommunityDetector::new(2, 50);
let detection = detector.detect(&mut cg);
assert!(!detection.communities.is_empty(), "at least one community");

// ── Step 3: Graph path retrieval ──────────────────────────────────────────
let edges = vec![
    KnowledgeEdge::new("Alice", "works_at", "ACME"),
    KnowledgeEdge::new("ACME", "located_in", "Berlin"),
    KnowledgeEdge::new("Bob", "knows", "Alice"),
    KnowledgeEdge::new("Alice", "member_of", "AI-Team"),
    KnowledgeEdge::new("AI-Team", "part_of", "ACME"),
    KnowledgeEdge::new("Carol", "works_at", "ACME"),
    KnowledgeEdge::new("Dave", "leads", "AI-Team"),
    KnowledgeEdge::new("Research", "division_of", "ACME"),
];
let finder = PathFinder::new(edges, PathFinderConfig::default());
let paths = finder.bfs_paths("Bob", "Berlin", 4);
assert!(!paths.is_empty(), "path Bob→Berlin found");

// ── Step 4: Summarize subgraph for LLM context ────────────────────────────
let mut subgraph = KgSubgraph::new();
for (id, label, ty) in [
    ("alice", "Alice", "Person"),
    ("bob", "Bob", "Person"),
    ("carol", "Carol", "Person"),
    ("acme", "ACME", "Organization"),
    ("berlin", "Berlin", "Place"),
    ("ai_team", "AI-Team", "Team"),
    ("research", "Research", "Department"),
    ("dave", "Dave", "Person"),
] {
    subgraph.add_node(KgNode::simple(id, label, ty));
}
subgraph.add_edge(KgEdge::unweighted("alice", "acme", "works_at"));
subgraph.add_edge(KgEdge::unweighted("acme", "berlin", "located_in"));
let summarizer = SubgraphSummarizer::new();
let clusters = summarizer.summarize(&subgraph, 10);
assert!(!clusters.is_empty(), "at least one cluster");
let text_summary = summarizer.generate_text_summary(&clusters);
assert!(!text_summary.is_empty(), "non-empty summary text");
§Full engine usage (async, requires trait impls)
For production usage with a real vector index, embedding model, SPARQL engine, and LLM client:
use oxirs_graphrag::{GraphRAGEngine, GraphRAGConfig};
use std::sync::Arc;

let config = GraphRAGConfig {
    top_k: 20,
    expansion_hops: 2,
    enable_communities: true,
    ..Default::default()
};

// Provide your own implementations of VectorIndexTrait, EmbeddingModelTrait,
// SparqlEngineTrait, and LlmClientTrait:
let engine = GraphRAGEngine::new(
    Arc::new(my_vec_index),
    Arc::new(my_embedder),
    Arc::new(my_sparql),
    Arc::new(my_llm),
    config,
);

// `.await?` must run inside an async function whose return type supports `?`.
let result = engine.query("What safety issues affect battery cells?").await?;
println!("Answer: {}", result.answer);
println!("Confidence: {:.2}", result.confidence);
See docs/tutorial.md for a step-by-step walkthrough.
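To make the `expansion_hops` setting and the triple cap from the data-flow overview more concrete, the following standalone sketch shows one way a bounded N-hop expansion could work over an in-memory triple list. It is an assumption-laden illustration, not the engine's SPARQL-based implementation: `SimpleTriple`, `expand_n_hops`, and the choice to treat edges as undirected are all hypothetical.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical in-memory triple: (subject, predicate, object).
type SimpleTriple<'a> = (&'a str, &'a str, &'a str);

/// Breadth-first expansion from `seeds`, limited to `max_hops` hops and
/// `max_triples` collected triples (illustrative helper, not a crate API).
fn expand_n_hops<'a>(
    triples: &[SimpleTriple<'a>],
    seeds: &[&'a str],
    max_hops: usize,
    max_triples: usize,
) -> Vec<SimpleTriple<'a>> {
    let mut visited: HashSet<&'a str> = seeds.iter().copied().collect();
    let mut frontier: VecDeque<(&'a str, usize)> =
        seeds.iter().map(|s| (*s, 0)).collect();
    let mut seen: HashSet<SimpleTriple<'a>> = HashSet::new();
    let mut out = Vec::new();

    while let Some((node, depth)) = frontier.pop_front() {
        if depth == max_hops {
            continue; // hop limit reached on this branch
        }
        for &t in triples {
            let (s, _, o) = t;
            // Assumption: edges are followed in both directions.
            let neighbor = if s == node {
                o
            } else if o == node {
                s
            } else {
                continue;
            };
            if !seen.contains(&t) {
                if out.len() >= max_triples {
                    return out; // triple budget exhausted
                }
                seen.insert(t);
                out.push(t);
            }
            if visited.insert(neighbor) {
                frontier.push_back((neighbor, depth + 1));
            }
        }
    }
    out
}
```

With the quickstart's edges and `Bob` as the only seed, two hops would collect the `Bob knows Alice` and `Alice works_at ACME` triples, while `ACME located_in Berlin` would only appear at three hops. The linear scan over `triples` keeps the sketch short; a real store would use an index or a SPARQL query instead.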
Re-exports§
pub use summarizer::GraphSummarizer;
pub use summarizer::GraphSummary;
pub use feedback::Relevance;
pub use feedback::TripleId;
pub use feedback::TripleRelevanceFeedback;
pub use gnn_encoder::AdjacencyGraph;
pub use gnn_encoder::EdgeList;
pub use gnn_encoder::GnnEncoder;
pub use gnn_encoder::GnnEncoderConfig;
pub use gnn_encoder::ScaledDotProductAttention;
pub use cache::query_cache::CacheEntry;
pub use cache::query_cache::CacheStats;
pub use cache::query_cache::QueryCache;
pub use cache::query_cache::QueryCacheConfig;
pub use config::CacheConfiguration;
pub use config::GraphRAGConfig;
pub use embeddings::node2vec::Node2VecConfig;
pub use embeddings::node2vec::Node2VecEmbedder;
pub use embeddings::node2vec::Node2VecEmbeddings;
pub use embeddings::node2vec::Node2VecWalkConfig;
pub use graph::community::CommunityAlgorithm;
pub use graph::community::CommunityConfig;
pub use graph::community::CommunityDetector;
pub use graph::embeddings::CommunityAwareEmbeddings;
pub use graph::embeddings::CommunityStructure;
pub use graph::embeddings::EmbeddingConfig;
pub use graph::traversal::GraphTraversal;
pub use hybrid::lora::LoraAdapter;
pub use hybrid::lora::LoraTrainer;
pub use query::planner::QueryPlanner;
pub use retrieval::fusion::FusionStrategy;
pub use model_loader::GgufMetadata;
pub use model_loader::GgufModelArch;
pub use model_loader::GgufParseError;
pub use model_loader::GgufParser;
pub use model_loader::GgufTensorInfo;
pub use model_loader::GgufValue;
pub use model_loader::ModelHandle;
pub use model_loader::ModelInfo;
pub use model_loader::ModelRegistry;
pub use model_loader::RegistryError;
Modules§
- cache: Cache module for GraphRAG query results
- community_detector: Graph community detection using a greedy label-propagation approach.
- config: GraphRAG configuration
- context_builder: Context building for graph-based RAG.
- distributed: Distributed GraphRAG: federated subgraph expansion across multiple SPARQL endpoints.
- embeddings: Graph embedding algorithms for GraphRAG.
- entity_classifier: Entity type classification for knowledge graph nodes.
- entity_linker: String-to-RDF entity linking: mention detection and candidate ranking.
- entity_linking: Entity linking and disambiguation for knowledge graphs.
- explainability: Explainability engine for graph-based RAG — attention weights, path explanation, provenance.
- federation: Federation layer for distributed GraphRAG queries.
- feedback: Interactive feedback loop for graph-based RAG retrieval refinement.
- fusion: Fusion and reranking module for GraphRAG
- generation: Answer generation module
- gnn_encoder: GraphSAGE encoder for knowledge-graph entity embeddings.
- graph: Graph processing module
- graph_embedder: Graph Embedder
- graph_partitioner: Graph partitioning using greedy and label-propagation methods.
- graph_summarization: Graph Summarization for GraphRAG
- hybrid: Hybrid GNN+LLM architecture — phases b, c, and d.
- knowledge_fusion: Multi-source knowledge fusion.
- model_loader: Pure-Rust GGUF model metadata loader and thread-safe model registry.
- neuro_symbolic: Neuro-symbolic module: physics-informed entity scoring for knowledge graphs.
- path_finder: Path Finder for Graph-RAG
- path_ranker: Knowledge Graph Path Ranker
- query: Query processing module
- reasoning: Reasoning module for GraphRAG
- retrieval: Retrieval module for GraphRAG
- sparql: SPARQL extension functions for GraphRAG
- streaming: Streaming subgraph extraction using SPARQL-like patterns.
- summarizer: Knowledge Graph Subgraph Summarizer
- temporal: Temporal reasoning and time-aware retrieval for GraphRAG
- transe_model: TransE Knowledge Graph Embedding Model
- triple_extractor: Triple Extractor
Structs§
- CacheConfig: Cache configuration
- CommunitySummary: Community summary for hierarchical retrieval
- GraphRAGEngine: Main GraphRAG engine
- GraphRAGResult2: GraphRAG query result
- QueryProvenance: Query provenance for attribution
- ScoredEntity: Entity with relevance score
- Triple: Triple representation for RDF data
Enums§
- GraphRAGError: GraphRAG error types
- ScoreSource: Source of entity score
Traits§
- EmbeddingModelTrait: Trait for embedding model operations
- LlmClientTrait: Trait for LLM client operations
- SparqlEngineTrait: Trait for SPARQL engine operations
- VectorIndexTrait: Trait for vector index operations