terraphim_rolegraph
Knowledge graph implementation for semantic document search.
Overview
terraphim_rolegraph provides a role-specific knowledge graph that connects concepts, relationships, and documents for graph-based semantic search. Results are ranked by traversing relationships between matched concepts.
Features
- 📊 Graph-Based Search: Navigate concept relationships for smarter results
- 🔍 Multi-Pattern Matching: Fast Aho-Corasick text scanning
- 🎯 Semantic Ranking: Sum node + edge + document ranks
- 🔗 Path Connectivity: Check if matched terms connect via graph paths
- ⚡ High Performance: O(n) matching, efficient graph traversal
- 🎭 Role-Specific: Separate graphs for different user personas
Installation
[dependencies]
terraphim_rolegraph = "1.0.0"
Quick Start
Creating and Querying a Graph
use terraphim_rolegraph::RoleGraph;
use ;
async
Path Connectivity Checking
use terraphim_rolegraph::RoleGraph;
use ;
async
Multi-term Queries with Operators
use terraphim_rolegraph::RoleGraph;
use ;
async
Architecture
Graph Structure
The knowledge graph uses a three-layer structure:
- Nodes (Concepts)
  - Represent terms from the thesaurus
  - Track frequency/importance (rank)
  - Connect to related concepts via edges
- Edges (Relationships)
  - Connect concepts that co-occur in documents
  - Weighted by co-occurrence strength (rank)
  - Associate documents via concept pairs
- Documents (Content)
  - Indexed by concepts they contain
  - Linked via edges between their concepts
  - Ranked by node + edge + document scores
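The three layers can be pictured with a small standalone sketch. This is not the crate's implementation: the type and field names (`SketchGraph`, `Node`, `Edge`, `index_document`) are hypothetical, and linking only consecutive concepts within a document is an assumption made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// A concept from the thesaurus (field names are illustrative).
#[derive(Debug, Default)]
struct Node {
    rank: u64,                           // how often the concept was seen
    connected_with: HashSet<(u64, u64)>, // keys of edges touching this node
}

/// A co-occurrence relationship between two concepts.
#[derive(Debug, Default)]
struct Edge {
    rank: u64,                       // co-occurrence strength
    documents: HashMap<String, u64>, // document ID -> times seen on this edge
}

#[derive(Debug, Default)]
struct SketchGraph {
    nodes: HashMap<u64, Node>,
    edges: HashMap<(u64, u64), Edge>,
}

impl SketchGraph {
    /// Index a document given the concept IDs matched in it, in order.
    /// Each consecutive pair of concepts creates or strengthens an edge.
    fn index_document(&mut self, doc_id: &str, concept_ids: &[u64]) {
        for pair in concept_ids.windows(2) {
            let (a, b) = (pair[0], pair[1]);
            let key = (a.min(b), a.max(b)); // undirected: order-independent key
            let edge = self.edges.entry(key).or_default();
            edge.rank += 1;
            *edge.documents.entry(doc_id.to_string()).or_insert(0) += 1;
            for &id in &[a, b] {
                let node = self.nodes.entry(id).or_default();
                node.rank += 1;
                node.connected_with.insert(key);
            }
        }
    }
}

fn main() {
    let mut graph = SketchGraph::default();
    graph.index_document("doc-1", &[1, 2, 3]);
    graph.index_document("doc-2", &[2, 3]);
    println!("{} nodes, {} edges", graph.nodes.len(), graph.edges.len());
    println!("edge (2, 3): {:?}", graph.edges[&(2, 3)]);
}
```

Documents are reachable only through the edges that mention them, which mirrors the "linked via edges between their concepts" description above.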
Ranking Algorithm
Search results are ranked by summing:
total_rank = node_rank + edge_rank + document_rank
- node_rank: How important/frequent the concept is
- edge_rank: How strong the concept relationship is
- document_rank: Document-specific relevance
Higher total rank = more relevant result.
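As a rough illustration of the formula, the sketch below sums the three ranks per hit and sorts documents by the total. The `Hit` struct and `rank_documents` function are hypothetical names, and accumulating totals when a document is reached through several hits is an assumption, not something the crate is confirmed to do.

```rust
use std::collections::HashMap;

/// One (node, edge, document) hit produced while walking the graph.
struct Hit<'a> {
    doc_id: &'a str,
    node_rank: u64,
    edge_rank: u64,
    document_rank: u64,
}

/// Sum node + edge + document rank per document, highest total first.
fn rank_documents(hits: &[Hit]) -> Vec<(String, u64)> {
    let mut totals: HashMap<String, u64> = HashMap::new();
    for hit in hits {
        let total = hit.node_rank + hit.edge_rank + hit.document_rank;
        *totals.entry(hit.doc_id.to_string()).or_insert(0) += total;
    }
    let mut ranked: Vec<(String, u64)> = totals.into_iter().collect();
    // Sort by descending total; break ties by document ID for stable output.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}

fn main() {
    let hits = [
        Hit { doc_id: "doc-a", node_rank: 3, edge_rank: 2, document_rank: 1 },
        Hit { doc_id: "doc-b", node_rank: 5, edge_rank: 4, document_rank: 2 },
        Hit { doc_id: "doc-a", node_rank: 1, edge_rank: 1, document_rank: 1 },
    ];
    for (doc, total) in rank_documents(&hits) {
        println!("{}: {}", doc, total);
    }
}
```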
Performance Characteristics
- Text Matching: O(n) in the length of the input text (plus the number of matches reported) with Aho-Corasick multi-pattern matching
- Graph Query: O(k × e × d) where:
  - k = number of matched terms
  - e = average edges per node
  - d = average documents per edge
- Memory: ~100 bytes/node + ~200 bytes/edge
- Connectivity Check: DFS with backtracking (exponential worst case, fast for k≤8)
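To make the connectivity cost concrete, here is a minimal sketch of a DFS with backtracking that looks for one simple path visiting every matched node. It assumes an undirected graph and assumes that "connected by path" means a single simple path covering all targets; the function names are hypothetical and the crate's `is_all_terms_connected_by_path()` may differ.

```rust
use std::collections::{HashMap, HashSet};

type Adjacency = HashMap<u64, Vec<u64>>;

/// Build an undirected adjacency list from edge pairs.
fn undirected(edges: &[(u64, u64)]) -> Adjacency {
    let mut adj = Adjacency::new();
    for &(a, b) in edges {
        adj.entry(a).or_default().push(b);
        adj.entry(b).or_default().push(a);
    }
    adj
}

/// Extend the current simple path from `current`; succeed once no
/// target nodes remain unvisited.
fn dfs(
    adj: &Adjacency,
    current: u64,
    targets: &HashSet<u64>,
    remaining: usize,
    visited: &mut HashSet<u64>,
) -> bool {
    if remaining == 0 {
        return true;
    }
    if let Some(neighbours) = adj.get(&current) {
        for &next in neighbours {
            if !visited.insert(next) {
                continue; // already on the current path
            }
            let left = if targets.contains(&next) { remaining - 1 } else { remaining };
            if dfs(adj, next, targets, left, visited) {
                return true;
            }
            visited.remove(&next); // backtrack and try another neighbour
        }
    }
    false
}

/// True if some simple path visits every target node.
/// Zero or one target is treated as trivially connected.
fn all_connected_by_path(adj: &Adjacency, targets: &[u64]) -> bool {
    let target_set: HashSet<u64> = targets.iter().copied().collect();
    if target_set.len() <= 1 {
        return true;
    }
    // Any covering path can be trimmed to start at a target node.
    target_set.iter().any(|&start| {
        let mut visited = HashSet::new();
        visited.insert(start);
        dfs(adj, start, &target_set, target_set.len() - 1, &mut visited)
    })
}

fn main() {
    let chain = undirected(&[(1, 2), (2, 3), (3, 4)]);
    println!("chain 1..4: {}", all_connected_by_path(&chain, &[1, 4]));
    let star = undirected(&[(0, 1), (0, 2), (0, 3)]);
    println!("star leaves: {}", all_connected_by_path(&star, &[1, 2, 3]));
}
```

Because every partial path may be abandoned and retried, the search is exponential in the worst case, which is why a small number of matched terms keeps it fast in practice.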
API Overview
Core Methods
- RoleGraph::new() - Create graph from thesaurus
- insert_document() - Index a document
- query_graph() - Simple text query
- query_graph_with_operators() - Multi-term query with AND/OR
- is_all_terms_connected_by_path() - Check path connectivity
- find_matching_node_ids() - Get matched concept IDs
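One plausible reading of the AND/OR operators is set intersection and union over the documents matched by each term. The sketch below shows only that idea; `Operator` and `combine` are hypothetical names, and the crate's `query_graph_with_operators()` signature and ranking behaviour are not reproduced here.

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy)]
enum Operator {
    And,
    Or,
}

/// Combine the document IDs matched by each term.
/// AND keeps documents matched by every term; OR keeps documents matched by any.
fn combine<'a>(per_term: &[BTreeSet<&'a str>], op: Operator) -> BTreeSet<&'a str> {
    let mut sets = per_term.iter();
    let first = match sets.next() {
        Some(set) => set.clone(),
        None => return BTreeSet::new(), // no terms, no results
    };
    sets.fold(first, |acc, set| match op {
        Operator::And => acc.intersection(set).copied().collect(),
        Operator::Or => acc.union(set).copied().collect(),
    })
}

fn main() {
    let rust_docs: BTreeSet<&str> = ["d1", "d2"].iter().copied().collect();
    let graph_docs: BTreeSet<&str> = ["d2", "d3"].iter().copied().collect();
    let per_term = [rust_docs, graph_docs];
    println!("AND: {:?}", combine(&per_term, Operator::And));
    println!("OR:  {:?}", combine(&per_term, Operator::Or));
}
```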
Graph Inspection
- get_graph_stats() - Statistics (node/edge/document counts)
- get_node_count() / get_edge_count() / get_document_count()
- is_graph_populated() - Check if graph has content
- validate_documents() - Find orphaned documents
- find_document_ids_for_term() - Reverse lookup
Async Support
The graph uses tokio::sync::Mutex for thread-safe async operations:
use terraphim_rolegraph::RoleGraphSync;
let sync_graph = new;
let locked = sync_graph.lock().await;
let results = locked.query_graph?;
Utility Functions
Text Processing
- split_paragraphs() - Split text into paragraphs
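For intuition, paragraph splitting can be sketched as grouping runs of non-blank lines. The function below is a hypothetical stand-in (`split_paragraphs_sketch`), assuming blank lines are the paragraph separator; the crate's `split_paragraphs()` may use different rules and a different return type.

```rust
/// Group consecutive non-blank lines into paragraphs,
/// joining the lines of each paragraph with a single space.
fn split_paragraphs_sketch(text: &str) -> Vec<String> {
    let mut paragraphs = Vec::new();
    let mut current: Vec<&str> = Vec::new();
    for line in text.lines() {
        if line.trim().is_empty() {
            // A blank line closes the paragraph being collected, if any.
            if !current.is_empty() {
                paragraphs.push(current.join(" "));
                current.clear();
            }
        } else {
            current.push(line.trim());
        }
    }
    if !current.is_empty() {
        paragraphs.push(current.join(" "));
    }
    paragraphs
}

fn main() {
    let text = "First line\nstill first.\n\n\nSecond paragraph.\n";
    for (i, paragraph) in split_paragraphs_sketch(text).iter().enumerate() {
        println!("{}: {}", i, paragraph);
    }
}
```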
Node ID Pairing
- magic_pair(x, y) - Create unique edge ID from two node IDs
- magic_unpair(z) - Extract node IDs from edge ID
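A reversible pairing of this kind can be built with Szudzik's pairing function, sketched below. Whether the crate uses exactly this formula is an assumption; `szudzik_pair` and `szudzik_unpair` are illustrative names, not the crate's API. Inputs are assumed to stay below 2^32 so the paired value fits in a `u64`.

```rust
/// Szudzik-style pairing: maps two node IDs to a single edge ID.
/// Order matters: pair(x, y) and pair(y, x) differ when x != y.
fn szudzik_pair(x: u64, y: u64) -> u64 {
    if x >= y {
        x * x + x + y
    } else {
        y * y + x
    }
}

/// Inverse of `szudzik_pair`: recover the two node IDs from an edge ID.
fn szudzik_unpair(z: u64) -> (u64, u64) {
    // Integer square root: start from the float estimate, then correct rounding.
    let mut q = (z as f64).sqrt() as u64;
    while q.checked_mul(q).map_or(true, |sq| sq > z) {
        q -= 1;
    }
    while (q + 1).checked_mul(q + 1).map_or(false, |sq| sq <= z) {
        q += 1;
    }
    let l = z - q * q;
    if l < q {
        (l, q)
    } else {
        (q, l - q)
    }
}

fn main() {
    let edge_id = szudzik_pair(3, 5);
    let (x, y) = szudzik_unpair(edge_id);
    println!("pair(3, 5) = {}, unpair = ({}, {})", edge_id, x, y);
}
```

Storing one integer per edge avoids a tuple key, and the inverse lets the endpoints be recovered without a lookup table.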
Examples
See the examples/ directory for:
- Building graphs from markdown files
- Multi-role graph management
- Custom ranking strategies
- Path analysis and connectivity
Minimum Supported Rust Version (MSRV)
This crate requires Rust 1.70 or later.
License
Licensed under Apache-2.0. See LICENSE for details.
Related Crates
- terraphim_types: Core type definitions
- terraphim_automata: Text matching and autocomplete
- terraphim_service: Main service layer with search
Support
- Discord: https://discord.gg/VPJXB6BGuY
- Discourse: https://terraphim.discourse.group
- Issues: https://github.com/terraphim/terraphim-ai/issues