terraphim_rolegraph 1.4.10

Terraphim rolegraph module, which provides role handling for Terraphim AI.
Documentation

terraphim_rolegraph

Crates.io Documentation License

Knowledge graph implementation for semantic document search.

Overview

terraphim_rolegraph provides a role-specific knowledge graph that connects concepts, relationships, and documents for graph-based semantic search. Results are ranked by traversing relationships between matched concepts.

Features

  • 📊 Graph-Based Search: Navigate concept relationships for smarter results
  • 🔍 Multi-Pattern Matching: Fast Aho-Corasick text scanning
  • 🎯 Semantic Ranking: Sum node + edge + document ranks
  • 🔗 Path Connectivity: Check if matched terms connect via graph paths
  • ⚡ High Performance: O(n) matching, efficient graph traversal
  • 🎭 Role-Specific: Separate graphs for different user personas

Installation

[dependencies]
terraphim_rolegraph = "1.0.0"

Quick Start

Creating and Querying a Graph

use terraphim_rolegraph::RoleGraph;
use terraphim_types::{RoleName, Thesaurus, NormalizedTermValue, NormalizedTerm, Document};

#[tokio::main]
async fn main() -> Result<(), terraphim_rolegraph::Error> {
    // Create thesaurus
    let mut thesaurus = Thesaurus::new("engineering".to_string());
    thesaurus.insert(
        NormalizedTermValue::from("rust"),
        NormalizedTerm {
            id: 1,
            value: NormalizedTermValue::from("rust programming"),
            url: Some("https://rust-lang.org".to_string()),
        }
    );
    thesaurus.insert(
        NormalizedTermValue::from("async"),
        NormalizedTerm {
            id: 2,
            value: NormalizedTermValue::from("asynchronous programming"),
            url: Some("https://rust-lang.github.io/async-book/".to_string()),
        }
    );

    // Create role graph
    let mut graph = RoleGraph::new(
        RoleName::new("engineer"),
        thesaurus
    ).await?;

    // Index documents
    let doc = Document {
        id: "rust-async-guide".to_string(),
        title: "Async Rust Programming".to_string(),
        body: "Learn rust and async programming with tokio".to_string(),
        url: "https://example.com/rust-async".to_string(),
        description: Some("Comprehensive async Rust guide".to_string()),
        summarization: None,
        stub: None,
        tags: Some(vec!["rust".to_string(), "async".to_string()]),
        rank: None,
        source_haystack: None,
    };
    let doc_id = doc.id.clone();
    graph.insert_document(&doc_id, doc);

    // Query the graph
    let results = graph.query_graph("rust async", None, Some(10))?;
    for (id, indexed_doc) in results {
        println!("Document: {} (rank: {})", id, indexed_doc.rank);
    }

    Ok(())
}

Path Connectivity Checking

use terraphim_rolegraph::RoleGraph;
use terraphim_types::{RoleName, Thesaurus};

#[tokio::main]
async fn main() -> Result<(), terraphim_rolegraph::Error> {
    let thesaurus = Thesaurus::new("engineering".to_string());
    let graph = RoleGraph::new(RoleName::new("engineer"), thesaurus).await?;

    // Check if matched terms are connected by a graph path
    let text = "rust async tokio programming";
    let connected = graph.is_all_terms_connected_by_path(text);

    if connected {
        println!("All terms are connected - they form a coherent topic!");
    } else {
        println!("Terms are disconnected - possibly unrelated concepts");
    }

    Ok(())
}

Multi-term Queries with Operators

use terraphim_rolegraph::RoleGraph;
use terraphim_types::{RoleName, Thesaurus, LogicalOperator};

#[tokio::main]
async fn main() -> Result<(), terraphim_rolegraph::Error> {
    let thesaurus = Thesaurus::new("engineering".to_string());
    let mut graph = RoleGraph::new(RoleName::new("engineer"), thesaurus).await?;

    // AND query - documents must contain ALL terms
    let results = graph.query_graph_with_operators(
        &["rust", "async", "tokio"],
        &LogicalOperator::And,
        None,
        Some(10)
    )?;
    println!("AND query: {} results", results.len());

    // OR query - documents may contain ANY term
    let results = graph.query_graph_with_operators(
        &["rust", "python", "go"],
        &LogicalOperator::Or,
        None,
        Some(10)
    )?;
    println!("OR query: {} results", results.len());

    Ok(())
}

Architecture

Graph Structure

The knowledge graph uses a three-layer structure:

  1. Nodes (Concepts)

    • Represent terms from the thesaurus
    • Track frequency/importance (rank)
    • Connect to related concepts via edges
  2. Edges (Relationships)

    • Connect concepts that co-occur in documents
    • Weighted by co-occurrence strength (rank)
    • Associate documents via concept pairs
  3. Documents (Content)

    • Indexed by concepts they contain
    • Linked via edges between their concepts
    • Ranked by node + edge + document scores

Ranking Algorithm

Search results are ranked by summing:

total_rank = node_rank + edge_rank + document_rank
  • node_rank: How important/frequent the concept is
  • edge_rank: How strong the concept relationship is
  • document_rank: Document-specific relevance

Higher total rank = more relevant result.

Performance Characteristics

  • Text Matching: O(n) with Aho-Corasick multi-pattern matching
  • Graph Query: O(k × e × d) where:
    • k = number of matched terms
    • e = average edges per node
    • d = average documents per edge
  • Memory: ~100 bytes/node + ~200 bytes/edge
  • Connectivity Check: DFS with backtracking (exponential worst case, fast for k≤8)

API Overview

Core Methods

  • RoleGraph::new() - Create graph from thesaurus
  • insert_document() - Index a document
  • query_graph() - Simple text query
  • query_graph_with_operators() - Multi-term query with AND/OR
  • is_all_terms_connected_by_path() - Check path connectivity
  • find_matching_node_ids() - Get matched concept IDs

Graph Inspection

  • get_graph_stats() - Statistics (node/edge/document counts)
  • get_node_count() / get_edge_count() / get_document_count()
  • is_graph_populated() - Check if graph has content
  • validate_documents() - Find orphaned documents
  • find_document_ids_for_term() - Reverse lookup

Async Support

The graph uses tokio::sync::Mutex for thread-safe async operations:

use terraphim_rolegraph::RoleGraphSync;

let sync_graph = RoleGraphSync::new(graph);
let locked = sync_graph.lock().await;
let results = locked.query_graph("search term", None, Some(10))?;

Utility Functions

Text Processing

  • split_paragraphs() - Split text into paragraphs

Node ID Pairing

  • magic_pair(x, y) - Create unique edge ID from two node IDs
  • magic_unpair(z) - Extract node IDs from edge ID

Examples

See the examples/ directory for:

  • Building graphs from markdown files
  • Multi-role graph management
  • Custom ranking strategies
  • Path analysis and connectivity

Minimum Supported Rust Version (MSRV)

This crate requires Rust 1.70 or later.

License

Licensed under Apache-2.0. See LICENSE for details.

Related Crates

Support