SQLite Knowledge Graph
A Rust library for building and querying knowledge graphs with SQLite as the storage backend. It includes graph algorithms and support for retrieval-augmented generation (RAG).
Features
Core Features
- Entity Management: Create, read, update, and delete typed entities with JSON properties
- Relation Storage: Define weighted relations between entities with graph traversal support
- Vector Search: Store embeddings and perform semantic search using cosine similarity
- Transaction Support: Batch operations with ACID guarantees
- SQLite Native: Full SQLite compatibility with bundling for portability
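To make the vector search feature concrete, here is a minimal sketch of cosine similarity and a brute-force top-k scan. It uses only the standard library and is not the crate's implementation; the function names `cosine_similarity` and `top_k` are hypothetical and chosen for illustration.

```rust
/// Cosine similarity between two vectors of equal length.
/// Returns None when the lengths differ, the input is empty,
/// or either vector has zero norm (the similarity is undefined).
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Brute-force search (hypothetical helper): score every stored vector
/// against the query and keep the k best, highest similarity first.
fn top_k(query: &[f32], stored: &[(i64, Vec<f32>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = stored
        .iter()
        .filter_map(|(id, v)| cosine_similarity(query, v).map(|s| (*id, s)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan like this is usually adequate for a few thousand vectors; larger collections typically need an index.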
Graph Algorithms ✅
- Path-finding: BFS, DFS, Shortest Path algorithms
- Centrality: PageRank algorithm for importance ranking
- Community Detection: Louvain algorithm for graph clustering
- Connectivity: Connected components (weak and strong)
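The path-finding algorithms listed above can be illustrated with a short standard-library sketch of depth-limited BFS and hop-count shortest path. It assumes an in-memory adjacency map rather than the SQLite tables, ignores relation weights, and uses hypothetical function names; it is not the crate's code.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal from `start`, visiting nodes at most
/// `max_depth` hops away. Returns (node, depth) pairs in visit order.
fn bfs(adj: &HashMap<i64, Vec<i64>>, start: i64, max_depth: usize) -> Vec<(i64, usize)> {
    let mut visited = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut order = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        order.push((node, depth));
        if depth == max_depth {
            continue; // do not expand beyond the depth limit
        }
        if let Some(neighbors) = adj.get(&node) {
            for &next in neighbors {
                if visited.insert(next) {
                    queue.push_back((next, depth + 1));
                }
            }
        }
    }
    order
}

/// Shortest path by hop count (edge weights are ignored in this sketch).
/// Returns the node sequence from `from` to `to`, or None if unreachable.
fn shortest_path(adj: &HashMap<i64, Vec<i64>>, from: i64, to: i64) -> Option<Vec<i64>> {
    let mut parent: HashMap<i64, i64> = HashMap::new();
    let mut visited = HashSet::from([from]);
    let mut queue = VecDeque::from([from]);
    while let Some(node) = queue.pop_front() {
        if node == to {
            // Walk the parent links back to the start, then reverse.
            let mut path = vec![to];
            let mut cur = to;
            while let Some(&p) = parent.get(&cur) {
                path.push(p);
                cur = p;
            }
            path.reverse();
            return Some(path);
        }
        if let Some(neighbors) = adj.get(&node) {
            for &next in neighbors {
                if visited.insert(next) {
                    parent.insert(next, node);
                    queue.push_back(next);
                }
            }
        }
    }
    None
}
```

Because BFS explores nodes in order of hop distance, the first time the target is dequeued the recorded parent chain is a shortest path in the unweighted sense.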
RAG Integration ✅
- Semantic Search: Vector similarity search
- Context Retrieval: Multi-hop context extraction
- Hybrid Search: Combine keyword and semantic search
SQLite Extension ✅
- Loadable Extension: Use as a SQLite extension (.dylib/.so)
- SQL Functions: Graph algorithms exposed as SQL functions
  - kg_version(): Extension version
  - kg_stats(): Graph statistics
  - kg_pagerank(damping, max_iterations, tolerance): PageRank algorithm
  - kg_louvain(): Community detection
  - kg_bfs(start_id, max_depth): BFS traversal
  - kg_shortest_path(from_id, to_id, max_depth): Shortest path
  - kg_connected_components(): Connected components
- CLI Tool: Command-line interface for common operations
Installation
Note: This crate is not yet published to crates.io. Use a git dependency or a local path for now.
Add this to your Cargo.toml:
[dependencies]
sqlite-knowledge-graph = { git = "https://github.com/hiyenwong/sqlite-knowledge-graph" }
Or for local development:
[dependencies]
sqlite-knowledge-graph = { path = "../sqlite-knowledge-graph" }
Semantic Search Dependencies
Semantic search requires vector embeddings generated by sentence-transformers, a Python package. Install it with:
pip install sentence-transformers
Default model: all-MiniLM-L6-v2 (384 dimensions, fast and accurate).
To generate embeddings for your knowledge graph:
Building SQLite Extension
# Extension will be at:
# target/release/libsqlite_knowledge_graph.dylib (macOS)
# target/release/libsqlite_knowledge_graph.so (Linux)
Quick Start
use ;
// Open or create a knowledge graph
let kg = open?;
// Create an entity with properties
let mut entity = new;
entity.set_property;
entity.set_property;
let paper_id = kg.insert_entity?;
// Create a relation
let relation = new?;
kg.insert_relation?;
// Graph traversal (BFS/DFS)
let neighbors = kg.get_neighbors?;
// Shortest path between entities
let path = kg.kg_shortest_path?;
// PageRank centrality
let pagerank = kg.kg_pagerank?;
// Louvain community detection
let communities = kg.kg_louvain?;
// Connected components
let components = kg.kg_connected_components?;
// Vector search for similar entities
let embedding = vec!;
kg.insert_vector?;
let results = kg.search_vectors?;
API Overview
KnowledgeGraph
The main entry point for the library.
Graph Algorithms
PageRank
use PageRankConfig;
let config = PageRankConfig ;
let rankings = kg.kg_pagerank?;
for in rankings.iter.take
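For readers who want to see what the three PageRank parameters control, the following is a minimal power-iteration sketch using only the standard library. `PageRankParams` is a hypothetical stand-in whose fields mirror the `damping`, `max_iterations`, and `tolerance` parameters of `kg_pagerank`; it is not the crate's `PageRankConfig`, and the handling of dangling nodes (spreading their rank uniformly) is an assumption.

```rust
/// Hypothetical parameter struct; field names mirror the SQL function's
/// parameters, not necessarily the crate's PageRankConfig.
struct PageRankParams {
    damping: f64,
    max_iterations: usize,
    tolerance: f64,
}

/// Power-iteration PageRank over a directed graph with nodes 0..n.
/// Edge weights are ignored in this sketch.
fn pagerank(n: usize, edges: &[(usize, usize)], p: &PageRankParams) -> Vec<f64> {
    if n == 0 {
        return Vec::new();
    }
    let mut out_degree = vec![0usize; n];
    for &(src, _) in edges {
        out_degree[src] += 1;
    }
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..p.max_iterations {
        // Assumption: rank held by nodes without out-edges is spread uniformly.
        let dangling: f64 = (0..n)
            .filter(|&i| out_degree[i] == 0)
            .map(|i| rank[i])
            .sum();
        let base = (1.0 - p.damping) / n as f64 + p.damping * dangling / n as f64;
        let mut next = vec![base; n];
        for &(src, dst) in edges {
            next[dst] += p.damping * rank[src] / out_degree[src] as f64;
        }
        // Stop once the total change (L1 distance) falls below the tolerance.
        let delta: f64 = rank.iter().zip(&next).map(|(a, b)| (a - b).abs()).sum();
        rank = next;
        if delta < p.tolerance {
            break;
        }
    }
    rank
}
```

With `damping: 0.85`, `max_iterations: 100`, and `tolerance: 1e-6` (the values used in the SQL examples in this README), the returned scores sum to 1, and a higher score indicates a more central node.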
Louvain Community Detection
let result = kg.kg_louvain?;
println!;
println!;
for in result.memberships
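Louvain works by greedily moving nodes between communities to increase modularity. The sketch below does not run Louvain itself; it only computes the modularity score of a given assignment, so the quantity being optimized is visible. It assumes an undirected weighted edge list, and the function name `modularity` is hypothetical rather than part of the crate's API.

```rust
/// Modularity Q of a community assignment on an undirected weighted graph.
/// `edges` lists each undirected edge once as (a, b, weight);
/// `community[i]` is the community index of node i.
fn modularity(edges: &[(usize, usize, f64)], community: &[usize]) -> f64 {
    let m: f64 = edges.iter().map(|e| e.2).sum(); // total edge weight
    if m == 0.0 {
        return 0.0;
    }
    let num_comms = community.iter().max().map_or(0, |c| c + 1);
    let mut internal = vec![0.0; num_comms]; // weight of edges inside each community
    let mut degree = vec![0.0; num_comms]; // summed node degrees per community
    for &(a, b, w) in edges {
        degree[community[a]] += w;
        degree[community[b]] += w;
        if community[a] == community[b] {
            internal[community[a]] += w;
        }
    }
    // Q = sum over communities of: internal/m - (degree / 2m)^2
    (0..num_comms)
        .map(|c| internal[c] / m - (degree[c] / (2.0 * m)).powi(2))
        .sum()
}
```

For two triangles joined by a single bridge edge, splitting the graph into the two triangles scores higher than placing every node in one community, which is the kind of partition Louvain is designed to find.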
Connected Components
let components = kg.kg_connected_components?;
println!;
println!;
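As an illustration of weak connectivity, here is a small union-find sketch that groups nodes into weakly connected components by treating every edge as undirected. It operates on an in-memory edge list with nodes numbered 0..n, and the names `find` and `weak_components` are hypothetical; strong components need a different algorithm (for example Tarjan's) and are not covered here.

```rust
use std::collections::BTreeMap;

/// Find the representative of x's set, shortening the path as it goes.
fn find(parent: &mut [usize], mut x: usize) -> usize {
    while parent[x] != x {
        parent[x] = parent[parent[x]]; // path halving
        x = parent[x];
    }
    x
}

/// Weakly connected components: edge direction is ignored.
/// Returns components sorted by size (largest first), then by smallest node.
fn weak_components(n: usize, edges: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut parent: Vec<usize> = (0..n).collect();
    for &(a, b) in edges {
        let (ra, rb) = (find(&mut parent, a), find(&mut parent, b));
        if ra != rb {
            parent[ra] = rb; // merge the two sets
        }
    }
    // Group nodes by their set representative.
    let mut groups: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for node in 0..n {
        let root = find(&mut parent, node);
        groups.entry(root).or_default().push(node);
    }
    let mut components: Vec<Vec<usize>> = groups.into_values().collect();
    components.sort_by(|a, b| b.len().cmp(&a.len()).then(a[0].cmp(&b[0])));
    components
}
```

Isolated nodes come back as single-element components, so the number of components and the size of the largest one can be read directly from the result.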
CLI Tool
# Show statistics
# Search entities
# Get entity context
# Migrate data
SQLite Extension Usage
-- Load extension
SELECT load_extension('./libsqlite_knowledge_graph', 'sqlite3_sqlite_knowledge_graph_init');
-- Get version
SELECT kg_version();
-- Returns: "0.7.0"
-- Get stats
SELECT kg_stats();
-- Returns: JSON with graph statistics
-- PageRank (optional parameters: damping, max_iterations, tolerance)
SELECT kg_pagerank();
SELECT kg_pagerank(0.85); -- with custom damping
SELECT kg_pagerank(0.85, 100); -- with custom damping and iterations
SELECT kg_pagerank(0.85, 100, 1e-6); -- full parameters
-- Returns: JSON with algorithm info and note to use Rust API for full results
-- Louvain community detection
SELECT kg_louvain();
-- Returns: JSON with algorithm info
-- BFS traversal (required: start_id, optional: max_depth)
SELECT kg_bfs(1);
SELECT kg_bfs(1, 3);
-- Returns: JSON with algorithm parameters
-- Shortest path (required: from_id, to_id, optional: max_depth)
SELECT kg_shortest_path(1, 5);
SELECT kg_shortest_path(1, 5, 10);
-- Returns: JSON with path parameters
-- Connected components
SELECT kg_connected_components();
-- Returns: JSON with algorithm info
-- Graph search example
WITH neural_papers AS (
SELECT id, name FROM kg_entities
WHERE entity_type = 'paper'
AND name LIKE '%neural network%'
)
SELECT e.name, r.rel_type
FROM neural_papers np
JOIN kg_relations r ON r.source_id = np.id
JOIN kg_entities e ON r.target_id = e.id
WHERE e.entity_type = 'skill'
LIMIT 10;
Database Schema
kg_entities
CREATE TABLE kg_entities (
id INTEGER PRIMARY KEY AUTOINCREMENT,
entity_type TEXT NOT NULL,
name TEXT NOT NULL,
properties TEXT, -- JSON
created_at INTEGER,
updated_at INTEGER
);
(entity_type);
(name);
kg_relations
CREATE TABLE kg_relations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
source_id INTEGER NOT NULL,
target_id INTEGER NOT NULL,
rel_type TEXT NOT NULL,
weight REAL DEFAULT 1.0,
properties TEXT, -- JSON
created_at INTEGER,
FOREIGN KEY (source_id) REFERENCES kg_entities(id) ON DELETE CASCADE,
FOREIGN KEY (target_id) REFERENCES kg_entities(id) ON DELETE CASCADE
);
(source_id);
(target_id);
(rel_type);
kg_vectors
CREATE TABLE kg_vectors (
entity_id INTEGER NOT NULL PRIMARY KEY,
vector BLOB NOT NULL,
dimension INTEGER NOT NULL,
created_at INTEGER,
FOREIGN KEY (entity_id) REFERENCES kg_entities(id) ON DELETE CASCADE
);
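The `vector BLOB` and `dimension` columns imply that each embedding is serialized to bytes before storage. The sketch below shows one plausible encoding, assuming consecutive little-endian `f32` values; the byte layout the crate actually uses is not stated in this README, and `encode_vector` and `decode_vector` are hypothetical names.

```rust
/// Pack a vector into bytes as consecutive little-endian f32 values.
/// Assumption: this layout is illustrative, not the crate's documented format.
fn encode_vector(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Unpack a blob produced by `encode_vector`.
/// Returns None when the blob length does not match `dimension`.
fn decode_vector(blob: &[u8], dimension: usize) -> Option<Vec<f32>> {
    if blob.len() != dimension * 4 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Storing the dimension next to the blob lets a reader reject corrupted or mismatched rows before comparing vectors; a 384-dimensional embedding occupies 1,536 bytes under this layout.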
Performance
Benchmarks on a knowledge graph with 2,619 entities and 1.48M relations:
| Operation | Time |
|---|---|
| Entity insert | < 1ms |
| Relation insert | < 1ms |
| BFS (depth 3) | ~50ms |
| PageRank | ~200ms |
| Louvain | ~500ms |
| Vector search (k=10) | ~10ms |
Implementation Status
| Feature | Status |
|---|---|
| Entity/Relation CRUD | ✅ Complete |
| Graph Traversal (BFS/DFS) | ✅ Complete |
| Shortest Path | ✅ Complete |
| PageRank | ✅ Complete |
| Louvain Community Detection | ✅ Complete |
| Connected Components | ✅ Complete |
| Vector Storage | ✅ Complete |
| Semantic Search | ✅ Complete |
| RAG Integration | ✅ Complete |
| SQLite Extension | ✅ Complete |
| CLI Tool | ✅ Complete |
| GitHub Actions CI | ✅ Complete |
| More Extension Functions | ✅ Complete (v0.7.0) |
| Vector Indexing (TurboQuant) | ✅ Complete (v0.8.0) |
| Higher-order Relations | ⏳ Planned |
| Graph Visualization Export | ⏳ Planned |
| Async API | ⏳ Planned |
Testing
# Run all tests
# Run with verbose output
# Run specific test
Current test status: 38 tests passing
Projects Using This Library
- OpenClaw Knowledge Base: 2,497 papers, 122 skills, 1.48M relations
- Research Paper Analysis: Graph-based paper discovery
License
MIT License
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
Acknowledgments
Built with:
- rusqlite - SQLite bindings
- sqlite-loadable - SQLite extension support
- serde - Serialization framework
- thiserror - Error handling
Changelog
See CHANGELOG.md for version history.