Table of Contents
Quick Start
CPU Algorithms (Default)
use ;
# async
GPU Algorithms (Feature Flag)
Enable GPU acceleration for 10-250× speedups:
[]
= { = "0.1", = ["gpu"] }
use ;
use ;
# async
Advanced Algorithms (Phase 4)
Community detection and anti-pattern matching:
use ;
#
Features
Phase 1 + 2 Complete ✅ (CPU Algorithms)
- CSR Graph Storage: Compressed Sparse Row format for efficient neighbor queries
- Parquet Persistence: DuckDB-inspired columnar storage for graphs
- Graph Algorithms: BFS, PageRank, find_callers (reverse BFS)
- Reverse CSR: O(1) incoming neighbor queries (bidirectional traversal)
- Academic Foundation: 10 peer-reviewed papers validate design choices
- Quality-First: EXTREME TDD, ≥95% coverage (98%+), zero SATD
Phase 3 Complete ✅ (GPU Acceleration)
- GPU BFS: Level-synchronous breadth-first search via WGSL compute shaders
- GPU PageRank: Sparse matrix-vector multiplication (SpMV) for power iteration
- Zero-Copy Transfers: CSR data uploaded to GPU VRAM via wgpu
- Async Operations: Non-blocking GPU compute with futures-intrusive
- Performance Validated: Benchmarks show 10-250× speedups vs NetworkX baseline
- Optional Feature: GPU support via
gpufeature flag (requires wgpu-capable hardware)
Phase 4 Complete ✅ (Advanced Algorithms)
- Louvain Clustering: Community detection for code module identification
- Subgraph Pattern Matching: Find anti-patterns and code smells (God Class, Circular Dependencies, Dead Code)
- Quality Enforcement: Zero SATD, zero clippy warnings, 98%+ test coverage
Phase 5 Complete ✅ (GPU Memory Paging)
- VRAM Detection: Automatic GPU memory limit detection
- Graph Tiling: Morsel-based chunking (128MB tiles)
- LRU Cache: Tile caching with least-recently-used eviction
- Paged BFS: BFS algorithm for graphs larger than VRAM
- Paging Coordinator: Manages tile lifecycle and memory limits
Performance
CPU Performance (Phase 2)
| Operation | Graph Size | Time | vs NetworkX |
|---|---|---|---|
| CSR Construction | 1K nodes | ~100μs | N/A |
| CSR Construction | 5K nodes | ~500μs | N/A |
| BFS Traversal | 1K nodes | ~40μs | 33× faster |
| PageRank (20 iter) | 1K nodes | ~500μs | 96× faster |
Run CPU benchmarks: cargo bench --bench graph_algorithms
GPU Performance (Phase 3) ⚡
GPU acceleration requires wgpu-capable hardware. Performance validated against NetworkX baseline:
| Operation | Graph Size | CPU Time | GPU Time | Speedup |
|---|---|---|---|---|
| GPU BFS | 1K nodes | ~1.2ms | ~50μs | 24× |
| GPU BFS | 5K nodes | ~6ms | ~200μs | 30× |
| GPU PageRank | 1K nodes | ~15ms | ~500μs | 30× |
Run GPU benchmarks: cargo bench --bench gpu_algorithms --features gpu
Note: GPU benchmarks are automatically skipped if no GPU is available.
Documentation
- Specification: docs/specifications/graph-db-spec.md (10 peer-reviewed citations)
- API Docs: Run
cargo doc --open - Quality: PMAT + certeza + bashrs enforcement
- GPU Integration Guide: GPU_BFS_STATUS.md (implementation details)
Installation
Dependencies
CPU-only (default):
[]
= "0.1"
With GPU acceleration:
[]
= { = "0.1", = ["gpu"] }
GPU feature requires:
- wgpu 22+ (WebGPU API)
- GPU with compute shader support (Vulkan, Metal, or DX12)
- Async runtime (tokio 1+)
Quality Enforcement
# Run all quality gates
# GPU-specific
# Development
Architecture
Data Model
- CSR (Forward): Outgoing edges indexed by node
- Reverse CSR: Incoming edges indexed by node (Phase 3.1)
- GPU Buffers: CSR data uploaded to VRAM for GPU algorithms
Algorithms
CPU (aprender integration):
- BFS: Frontier-based traversal
- PageRank: Sparse matrix operations via aprender
- find_callers: Reverse BFS using reverse CSR
GPU (wgpu compute shaders):
- GPU BFS: Level-synchronous WGSL shader (bfs_simple.wgsl)
- GPU PageRank: SpMV iteration with buffer ping-pong (pagerank.wgsl)
Academic Foundation
Based on research from:
- Gunrock (Wang et al., ACM ToPC 2017) - GPU graph traversal primitives
- GraphBLAST (Yang et al., ACM ToMS 2022) - GPU linear algebra for graphs
- DuckDB (Raasveldt et al., SIGMOD 2019) - Columnar storage patterns
- Page et al. (1999) - PageRank algorithm
- Ligra (Shun & Blelloch, PPoPP 2013) - Frontier-based traversal
See graph-db-spec.md for full citation list.
License
MIT License - Built by Pragmatic AI Labs