# GraphRAG Core

The core library for GraphRAG-rs, providing portable functionality for both native and WASM deployments.
## Overview

graphrag-core is the foundational library that powers GraphRAG-rs. It provides:

- **Embedding Generation**: 8 provider backends (HuggingFace, OpenAI, Voyage AI, Cohere, Jina, Mistral, Together AI, Ollama)
- **Entity Extraction**: TRUE LLM-based gleaning extraction with multi-round refinement (Microsoft GraphRAG-style)
- **Graph Construction**: incremental updates, PageRank, community detection
- **Retrieval Strategies**: vector, BM25, PageRank, hybrid, adaptive
- **Configuration System**: comprehensive TOML-based configuration
- **Cross-Platform**: runs on native targets (Linux, macOS, Windows) and WASM
## Quick Start

### Installation

Add to your Cargo.toml:

```toml
[dependencies]
graphrag-core = { path = "../graphrag-core", features = ["huggingface-hub", "ureq"] }
```

### Basic Usage

The snippet below sketches the intended flow; module paths and constructor signatures are illustrative and may differ from the crate's current layout:

```rust
use graphrag_core::embeddings::{EmbeddingProvider, HuggingFaceEmbeddings};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Downloads the model from the Hugging Face Hub on first use
    let provider = HuggingFaceEmbeddings::new("sentence-transformers/all-MiniLM-L6-v2")?;
    let embedding = provider.embed("GraphRAG combines graphs with retrieval").await?;
    println!("{} dimensions", embedding.len());
    Ok(())
}
```
## Features

### Embedding Providers

GraphRAG Core supports 8 embedding backends via feature flags:

```toml
[features]
# Free, offline embedding models
huggingface-hub = ["hf-hub", "dirs"]
# API-based embedding providers
ureq = ["dep:ureq"] # Enables OpenAI, Voyage, Cohere, Jina, Mistral, Together
# Local inference (coming soon)
candle = ["candle-core"]
```
Supported Providers:

| Provider | Cost | Quality | Feature Flag | Use Case |
|---|---|---|---|---|
| HuggingFace | Free | ★★★★ | huggingface-hub | Offline, 100+ models |
| OpenAI | $0.13/1M | ★★★★★ | ureq | Best quality |
| Voyage AI | Medium | ★★★★★ | ureq | Anthropic recommended |
| Cohere | $0.10/1M | ★★★★ | ureq | Multilingual (100+ langs) |
| Jina AI | $0.02/1M | ★★★★ | ureq | Cost-optimized |
| Mistral | $0.10/1M | ★★★★ | ureq | RAG-optimized |
| Together AI | $0.008/1M | ★★★★ | ureq | Cheapest |
| Ollama | Free | ★★★★ | (via config) | Local GPU |
See EMBEDDINGS_CONFIG.md for detailed configuration.
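Under the hood, every backend implements a common `EmbeddingProvider` trait (see src/embeddings/mod.rs). The std-only sketch below shows the general shape with a deterministic mock provider plus cosine similarity; the real trait's signature (async methods, batching, error types) will differ, so treat all names here as illustrative:

```rust
// Sketch only: the real trait lives in src/embeddings/mod.rs and its exact
// signature differs. This shows the general shape of a pluggable provider.
trait EmbeddingProvider {
    fn embed(&self, text: &str) -> Vec<f32>;
    fn dimensions(&self) -> usize;
}

/// A deterministic mock provider, useful for tests that must not hit the network.
struct MockProvider;

impl EmbeddingProvider for MockProvider {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Hash bytes into a fixed-size vector (NOT a real embedding).
        let n = self.dimensions();
        let mut v = vec![0.0f32; n];
        for (i, b) in text.bytes().enumerate() {
            v[i % n] += b as f32;
        }
        v
    }
    fn dimensions(&self) -> usize {
        8
    }
}

/// Cosine similarity between two equal-length vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let p = MockProvider;
    let a = p.embed("knowledge graph");
    let b = p.embed("knowledge graph");
    println!("same text: {:.3}", cosine(&a, &b)); // identical inputs score 1.000
}
```

Swapping providers then means swapping the concrete type behind the trait, which is what the feature flags above toggle at compile time.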
### Configuration System

GraphRAG Core uses TOML for configuration. The embeddings keys follow the example configuration shipped with the crate; the extraction and retrieval key names below are illustrative:

```toml
# config.toml
[embeddings]
provider = "huggingface"
model = "sentence-transformers/all-MiniLM-L6-v2"
dimensions = 384
batch_size = 32
cache_dir = "~/.cache/huggingface"

[extraction]
max_gleaning_rounds = 10
confidence_threshold = 0.8

[retrieval]
top_k = 10
similarity = "cosine"
```

Load the configuration:

```rust
use graphrag_core::config::Config;

let config = Config::from_toml_file("config.toml")?;
```
## Examples

### Embedding Providers

```bash
# Demo all providers
cargo run --example embeddings_demo --features huggingface-hub

# Config-based usage
cargo run --example embeddings_from_config --features huggingface-hub

# With API keys
OPENAI_API_KEY=sk-... cargo run --example embeddings_demo --features ureq
```
### Entity Extraction (Real LLM-Based Gleaning)

GraphRAG Core implements TRUE LLM-based entity extraction with iterative refinement. The snippet below is a sketch; module paths and constructor arguments are illustrative:

```rust
use graphrag_core::entity::{GleaningConfig, GleaningEntityExtractor};
use graphrag_core::llm::OllamaClient;

// Create an Ollama client pointing at a local server
let ollama_client = OllamaClient::new("http://localhost:11434", "llama3.1");

// Configure gleaning (Microsoft GraphRAG-style multi-round refinement)
let gleaning_config = GleaningConfig {
    max_rounds: 3,
    ..Default::default()
};

// Create the extractor
let extractor = GleaningEntityExtractor::new(ollama_client, gleaning_config);

// Extract with real LLM calls (takes 15-30 seconds per chunk per round)
let extraction = extractor.extract_with_gleaning(&chunks).await?;
```
Performance Expectations:
- Small document (5-10 pages): 5-15 minutes
- Medium document (50-100 pages): 30-60 minutes
- Large document (500-1000 pages): 2-4 hours
This is REAL LLM processing with actual API calls, not pattern matching!
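The multi-round control flow behind those numbers can be sketched in plain Rust. Here `extract_round` stands in for the expensive LLM call, and every name is illustrative rather than the crate's API; the essential idea is to re-prompt with what was already found until a round yields nothing new or the round budget runs out:

```rust
use std::collections::HashSet;

/// One gleaning pass: given the text and what we already found, return any
/// additional entity names. In the real pipeline this is an LLM call taking
/// 15-30 seconds; here it is a deterministic stand-in that "notices"
/// capitalized words, at most two new ones per round.
fn extract_round(text: &str, known: &HashSet<String>) -> Vec<String> {
    text.split_whitespace()
        .filter(|w| w.chars().next().map_or(false, |c| c.is_uppercase()))
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
        .filter(|w| !w.is_empty() && !known.contains(w))
        .take(2)
        .collect()
}

/// Microsoft GraphRAG-style gleaning: keep re-prompting until a round yields
/// nothing new, or the round budget is exhausted.
fn glean(text: &str, max_rounds: usize) -> HashSet<String> {
    let mut entities = HashSet::new();
    for _round in 0..max_rounds {
        let new = extract_round(text, &entities);
        if new.is_empty() {
            break; // converged: the model found nothing it had missed
        }
        entities.extend(new);
    }
    entities
}

fn main() {
    let text = "Alice met Bob in Paris while visiting Carol";
    let found = glean(text, 5);
    println!("{} entities after gleaning", found.len()); // 4: found over 2 rounds
}
```

This also makes the cost model obvious: total latency is roughly chunks × rounds × per-call latency, which is why large documents take hours.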
### Graph Construction

A sketch of incremental construction (method names illustrative):

```rust
use graphrag_core::graph::IncrementalGraph;

let mut graph = IncrementalGraph::new();
graph.add_document(&doc_a)?;
graph.add_document(&doc_b)?; // Automatic deduplication!
```
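The deduplication idea can be illustrated with a minimal std-only sketch. The crate's semantic_merging module merges entities on meaning; this toy version keys only on a normalized name, which is the simplest possible dedup policy:

```rust
use std::collections::HashMap;

/// Toy incremental graph: entities are keyed by a normalized name, so
/// re-adding a known entity merges mention counts instead of creating a
/// duplicate node. (The real semantic_merging module is far richer.)
#[derive(Default)]
struct TinyGraph {
    entities: HashMap<String, u32>, // normalized name -> mention count
}

impl TinyGraph {
    fn add_document(&mut self, entity_mentions: &[&str]) {
        for m in entity_mentions {
            let key = m.trim().to_lowercase(); // normalization = dedup key
            *self.entities.entry(key).or_insert(0) += 1;
        }
    }

    fn len(&self) -> usize {
        self.entities.len()
    }
}

fn main() {
    let mut g = TinyGraph::default();
    g.add_document(&["Marie Curie", "Radium"]);
    g.add_document(&["marie curie", "Polonium"]); // "Marie Curie" is merged
    println!("{} unique entities", g.len()); // 3, not 4
}
```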
## Module Structure

```
graphrag-core/
├── src/
│   ├── embeddings/              # Embedding generation
│   │   ├── mod.rs               # EmbeddingProvider trait
│   │   ├── huggingface.rs       # HuggingFace Hub integration
│   │   ├── api_providers.rs     # 6 API providers
│   │   ├── config.rs            # TOML configuration
│   │   └── README.md            # Technical documentation
│   ├── entity/                  # Entity extraction
│   │   ├── mod.rs               # Entity types and base traits
│   │   ├── prompts.rs           # Microsoft GraphRAG-style LLM prompts
│   │   ├── llm_extractor.rs     # Real LLM entity extraction with Ollama
│   │   ├── gleaning_extractor.rs # Multi-round gleaning orchestrator
│   │   └── semantic_merging.rs  # Entity deduplication
│   ├── graph/                   # Knowledge graph
│   │   ├── mod.rs               # Graph construction
│   │   ├── incremental.rs       # Incremental updates
│   │   └── pagerank.rs          # Fast-GraphRAG retrieval
│   ├── retrieval/               # Search strategies
│   │   ├── mod.rs               # Vector similarity
│   │   ├── bm25.rs              # Keyword search
│   │   ├── hybrid.rs            # Multi-strategy fusion
│   │   └── pagerank_retrieval.rs # Graph-based search
│   ├── config/                  # Configuration system
│   │   ├── mod.rs               # Main Config struct
│   │   └── toml_config.rs       # TOML parsing
│   └── core/                    # Core traits and types
│       ├── error.rs             # Error types
│       ├── traits.rs            # Pluggable abstractions
│       └── registry.rs          # Component registry
└── examples/
    ├── embeddings_demo.rs       # Test all providers
    ├── embeddings_from_config.rs # Config-based usage
    └── embeddings.toml          # Example configuration
```
## Performance

### Embedding Speed
| Provider | Latency | Throughput | GPU Support |
|---|---|---|---|
| HuggingFace (local) | 50-100ms | 10-20 chunks/s | ❌ |
| ONNX Runtime Web | 3-8ms | 125-333 chunks/s | ✅ WebGPU |
| Ollama | 100-200ms | 5-10 chunks/s | ✅ CUDA/Metal |
| API Providers | 50-200ms | Varies | ☁️ Cloud |
### Memory Usage
- HuggingFace models: ~100-500MB (cached on disk)
- Graph (10K entities): ~50MB
- Embeddings cache: Configurable (default: 10K vectors)
## Advanced Features
### LightRAG (Dual-Level Retrieval)

Key names illustrative:

```toml
[lightrag]
mode = "hybrid"
dual_level = true # 6000x token reduction!
```
### PageRank (Fast-GraphRAG)

Key names illustrative:

```toml
[pagerank]
enabled = true # 27x performance boost
```
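The algorithm itself can be illustrated with a std-only power-iteration sketch; pagerank.rs implements the optimized Fast-GraphRAG variant, so this is only the textbook form of the same computation:

```rust
/// Power-iteration PageRank over a small adjacency list.
/// `links[i]` holds the indices that node i links to.
fn pagerank(links: &[Vec<usize>], damping: f64, iters: usize) -> Vec<f64> {
    let n = links.len();
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iters {
        // Teleportation term, shared equally by every node.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        for (i, outs) in links.iter().enumerate() {
            if outs.is_empty() {
                // Dangling node: spread its rank uniformly.
                for r in next.iter_mut() {
                    *r += damping * rank[i] / n as f64;
                }
            } else {
                for &j in outs {
                    next[j] += damping * rank[i] / outs.len() as f64;
                }
            }
        }
        rank = next;
    }
    rank
}

fn main() {
    // Edges: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
    let links = vec![vec![1, 2], vec![2], vec![0]];
    let r = pagerank(&links, 0.85, 50);
    println!("{r:?}"); // node 2 accumulates the most rank
}
```

In retrieval, ranks like these bias the search toward structurally central entities instead of treating all graph nodes equally.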
### Intelligent Caching

Key names illustrative:

```toml
[caching]
enabled = true # 80%+ hit rate, 6x cost reduction
```
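The idea behind the cache, and the hit-rate bookkeeping those numbers refer to, in a std-only sketch (the actual feature is backed by the moka crate; all names here are illustrative):

```rust
use std::collections::HashMap;

/// Toy LLM response cache keyed by prompt, with hit-rate accounting.
struct LlmCache {
    map: HashMap<String, String>,
    hits: u64,
    misses: u64,
}

impl LlmCache {
    fn new() -> Self {
        Self { map: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Return the cached response, or compute it (i.e. call the LLM) and store it.
    fn get_or_compute(&mut self, prompt: &str, llm: impl FnOnce(&str) -> String) -> String {
        if let Some(resp) = self.map.get(prompt) {
            self.hits += 1;
            return resp.clone();
        }
        self.misses += 1;
        let resp = llm(prompt);
        self.map.insert(prompt.to_string(), resp.clone());
        resp
    }

    fn hit_rate(&self) -> f64 {
        self.hits as f64 / (self.hits + self.misses).max(1) as f64
    }
}

fn main() {
    let mut cache = LlmCache::new();
    let fake_llm = |p: &str| format!("answer to: {p}");
    for _ in 0..5 {
        cache.get_or_compute("extract entities from chunk 1", fake_llm);
    }
    println!("hit rate: {:.0}%", cache.hit_rate() * 100.0); // 4 hits out of 5 calls
}
```

Because gleaning re-sends overlapping prompts across rounds and re-runs, every cache hit saves one full LLM round trip, which is where the cost reduction comes from.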
## API Providers Setup

### Environment Variables

```bash
# API keys (recommended), plus the matching *_API_KEY for any other provider you enable
export OPENAI_API_KEY=sk-...

# HuggingFace cache location
export HF_HOME=~/.cache/huggingface
```

### Configuration File

```toml
[embeddings]
provider = "openai"
model = "text-embedding-3-small"
# api_key = "sk-..." # Or use OPENAI_API_KEY env var
dimensions = 1536
batch_size = 100
```
## Testing

```bash
# Run all tests
cargo test

# Test embeddings
cargo test --features huggingface-hub embeddings

# Test with model downloads (slow)
ENABLE_DOWNLOAD_TESTS=1 cargo test --features huggingface-hub
```
## Documentation
- ENTITY_EXTRACTION.md - Complete guide to TRUE LLM-based gleaning extraction
- EMBEDDINGS_CONFIG.md - Complete embedding configuration guide
- src/embeddings/README.md - Technical embedding documentation
- ../REAL_LLM_GLEANING_IMPLEMENTATION.md - Implementation details and architecture
- ../HOW_IT_WORKS.md - Pipeline overview
- ../README.md - Main project documentation
## Feature Flags

Feature names below are indicative; check Cargo.toml for the authoritative list:

```toml
[features]
# Embedding providers
huggingface-hub = ["hf-hub", "dirs"] # Free, offline models
ureq = ["dep:ureq"]                  # API providers

# Graph backends
memory = []                          # In-memory (default)
lancedb = ["lancedb"]                # LanceDB embedded

# Processing
parallel = ["rayon"]                 # Multi-threading
caching = ["moka"]                   # LLM response cache

# Advanced features
incremental = []                     # Zero-downtime updates
pagerank = []                        # Fast-GraphRAG
lightrag = []                        # Dual-level retrieval
query-decomposition = []             # Query decomposition

# LLM integrations
ollama = []                          # Ollama local models
function-calling = []                # Function calling support
```
## Cross-Platform Support
GraphRAG Core is designed to work across platforms:
- ✅ Linux - Full support with all features
- ✅ macOS - Full support with Metal GPU acceleration
- ✅ Windows - Full support with CUDA GPU acceleration
- ✅ WASM - Core functionality (see graphrag-wasm crate)
## Contributing

See ../CONTRIBUTING.md for contribution guidelines.

## License

MIT License - see ../LICENSE for details.
Part of the GraphRAG-rs project | Main README | Architecture | How It Works