ContRAG Rust - ICP Edition
Bring Retrieval-Augmented Generation (RAG) to Web3 with ICP Canisters
ContRAG Rust is a powerful library for building RAG systems directly on the Internet Computer Protocol (ICP). Store entity data on-chain, generate embeddings via HTTP outcalls, and build intelligent context for AI applications, all within your canister.
Key Features
- On-Chain RAG: Build RAG systems entirely on ICP with stable memory storage
- External AI Integration: Use OpenAI, Gemini, or custom embedders via HTTP outcalls
- Flexible Data Sources: Read from canister state, stable memory, or inter-canister calls
- Web3-Native: Designed specifically for blockchain data and Web3 applications
- Zero Database Dependencies: No PostgreSQL, MongoDB, or external vector DBs required
- Developer-Friendly: Simple configuration with .env for secrets, JSON for schemas
Installation
Add to your canister's Cargo.toml:
[]
= { = "https://github.com/dhaniverse/contrag", = "main" }
= "0.13"
= "0.13"
= "0.10"
= { = "1.0", = ["derive"] }
= "1.0"
Quick Start
1. Define Your Entities
use *;
use ;
use Serialize;
2. Create Configuration
Create contrag.config.json:
Create .env (DO NOT commit):
OPENAI_API_KEY=sk-your-key-here
3. Build RAG Context
use *;
use *;
use OpenAIEmbedder;
use StableMemoryVectorStore;
async
4. Search Context
async
Architecture
┌─────────────────────────────────────┐
│      Your ICP Canister (Rust)       │
│                                     │
│  ┌──────────────────────────────┐   │
│  │     contrag-core Library     │   │
│  │                              │   │
│  │  • RagEntity Trait           │   │
│  │  • ContextBuilder            │   │
│  │  • StableMemoryVectorStore   │   │
│  │  • Entity Relationships      │   │
│  └──────────────────────────────┘   │
└─────────────────────────────────────┘
        │                     │
        ▼                     ▼
┌───────────────┐  ┌────────────────────┐
│ HTTP Outcalls │  │   Stable Memory    │
│               │  │                    │
│ • OpenAI      │  │ • On-chain vectors │
│ • Gemini      │  │ • Persistent state │
│ • Custom APIs │  │ • Upgrade-safe     │
└───────────────┘  └────────────────────┘
Core Concepts
RagEntity Trait
Implement this trait on your canister data structures to enable RAG:
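As a rough illustration only, a trait of this kind could look like the sketch below. `RagEntitySketch`, its method names, and the `User` fields are hypothetical stand-ins chosen for this example; they are not taken from contrag-core, whose actual `RagEntity` signature may differ.

```rust
// Hypothetical stand-in for a RagEntity-style trait; NOT the contrag-core API.
pub trait RagEntitySketch {
    /// Logical entity type, e.g. "User".
    fn entity_type() -> &'static str;
    /// Unique identifier of this instance.
    fn entity_id(&self) -> String;
    /// Flatten the entity into plain text that can be embedded.
    fn to_context_text(&self) -> String;
    /// (entity_type, id) pairs of related entities; none by default.
    fn related(&self) -> Vec<(String, String)> {
        Vec::new()
    }
}

// Hypothetical canister data structure used only for this example.
#[derive(Clone, Debug)]
pub struct User {
    pub id: u64,
    pub name: String,
    pub order_ids: Vec<u64>,
}

impl RagEntitySketch for User {
    fn entity_type() -> &'static str {
        "User"
    }

    fn entity_id(&self) -> String {
        self.id.to_string()
    }

    fn to_context_text(&self) -> String {
        format!(
            "User {} is named {} and has {} orders",
            self.id,
            self.name,
            self.order_ids.len()
        )
    }

    fn related(&self) -> Vec<(String, String)> {
        // Each order id becomes a reference to an "Order" entity.
        self.order_ids
            .iter()
            .map(|o| ("Order".to_string(), o.to_string()))
            .collect()
    }
}
```

The point of the design is that the entity itself decides how it is rendered as text and which other entities it points to, so a context builder can stay generic.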
Context Building
let builder = new;
// Single entity
let context = builder.build_entity_context;
// Entity with relationships
let related = vec!;
let graph_context = builder.build_graph_context;
// Chunk long text
let chunks = builder.chunk_text;
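To make the chunking step concrete, here is a minimal sketch of overlapping text chunking. It assumes word-based splitting and a fixed overlap; `chunk_words` and its parameters are hypothetical and do not describe how the library's `chunk_text` is implemented.

```rust
/// Split `text` into chunks of at most `chunk_size` words, where consecutive
/// chunks share `overlap` words. Word-based splitting is an assumption of
/// this sketch, not a statement about contrag-core.
pub fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last window already reached the end of the text
        }
        start += step;
    }
    chunks
}
```

For example, `chunk_words("a b c d e f g", 3, 1)` yields three chunks in which each chunk starts with the last word of the previous one. The overlap keeps a sentence that straddles a chunk boundary retrievable from either side.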
Vector Storage
let mut store = new;
// Store vector
store.store.await?;
// Search
let results = store.search.await?;
// Manage
store.count.await?;
store.delete_namespace.await?;
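The operations above (store, search, count, delete by namespace) can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions: `InMemoryVectorStore` and `StoredVector` are hypothetical names, the methods are synchronous, and a `HashMap` replaces stable memory, so it shows the shape of the operations rather than the library's implementation.

```rust
use std::collections::HashMap;

/// One stored embedding together with the text it was computed from.
#[derive(Clone, Debug)]
pub struct StoredVector {
    pub id: String,
    pub text: String,
    pub embedding: Vec<f32>,
}

/// Hypothetical in-memory stand-in for a namespaced vector store.
#[derive(Default)]
pub struct InMemoryVectorStore {
    namespaces: HashMap<String, Vec<StoredVector>>,
}

impl InMemoryVectorStore {
    /// Insert a vector, replacing any existing entry with the same id.
    pub fn store(&mut self, namespace: &str, item: StoredVector) {
        let items = self.namespaces.entry(namespace.to_string()).or_default();
        items.retain(|v| v.id != item.id);
        items.push(item);
    }

    /// Brute-force search: score every vector in the namespace and return
    /// the ids and scores of the `k` most similar ones, best first.
    pub fn search(&self, namespace: &str, query: &[f32], k: usize) -> Vec<(String, f32)> {
        let mut scored: Vec<(String, f32)> = self
            .namespaces
            .get(namespace)
            .map(|items| {
                items
                    .iter()
                    .map(|v| (v.id.clone(), cosine(query, &v.embedding)))
                    .collect()
            })
            .unwrap_or_default();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }

    /// Number of vectors stored in a namespace.
    pub fn count(&self, namespace: &str) -> usize {
        self.namespaces.get(namespace).map_or(0, Vec::len)
    }

    /// Drop a whole namespace and report how many vectors were removed.
    pub fn delete_namespace(&mut self, namespace: &str) -> usize {
        self.namespaces.remove(namespace).map_or(0, |v| v.len())
    }
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

Because every query scans the whole namespace, search cost grows linearly with the number of stored vectors, which is the trade-off that an index such as HNSW is meant to address.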
Configuration
Entity Configuration
Embedder Configuration
OpenAI:
Gemini:
Chunking Configuration
Advanced Features
Cached Embeddings
use ;
let embedder = new;
let mut cached = new; // Cache 1000 entries
let embeddings = cached.embed_with_cache.await?;
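The idea behind a bounded embedding cache can be sketched as follows. This is an assumption-laden illustration: `EmbeddingCache` and `get_or_embed` are hypothetical names, eviction is simple first-in-first-out, and a synchronous closure stands in for the HTTP outcall that a real embedder would make.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical bounded cache from input text to embedding.
pub struct EmbeddingCache {
    capacity: usize,
    entries: HashMap<String, Vec<f32>>,
    order: VecDeque<String>, // insertion order, oldest first
    pub hits: u64,
    pub misses: u64,
}

impl EmbeddingCache {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
            hits: 0,
            misses: 0,
        }
    }

    /// Return the cached embedding for `text`, or compute it with `embed`
    /// (a stand-in for the remote embedding call) and remember the result.
    pub fn get_or_embed<F>(&mut self, text: &str, embed: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        if let Some(cached) = self.entries.get(text).cloned() {
            self.hits += 1;
            return cached;
        }
        self.misses += 1;
        let embedding = embed(text);
        if self.capacity == 0 {
            return embedding; // caching disabled
        }
        if self.entries.len() >= self.capacity {
            // Evict the oldest entry to stay within capacity.
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(text.to_string(), embedding.clone());
        self.order.push_back(text.to_string());
        embedding
    }
}
```

Since each cache miss corresponds to an outcall that consumes cycles, repeated texts benefit most; the hit and miss counters make that saving observable.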
Inter-Canister Data Sources
use CanisterStateSource;
let source = new;
let user = source..await?;
Custom Similarity Metrics
use ;
let similarity = cosine_similarity;
let distance = euclidean_distance;
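For reference, the two metrics named above have standard definitions, sketched here in plain Rust. The function names match the ones in this section, but the signatures (slices in, `Option<f32>` out) are assumptions of this sketch and may not match the library's.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns None for mismatched lengths or when either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        None
    } else {
        Some(dot / (norm_a * norm_b))
    }
}

/// Euclidean distance: sqrt(sum((a_i - b_i)^2)).
/// Returns None for mismatched lengths.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let sum_sq: f32 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
    Some(sum_sq.sqrt())
}
```

Cosine similarity ignores vector length and ranks by direction only (higher is more similar), while Euclidean distance is sensitive to magnitude (lower is more similar).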
Comparison: TypeScript vs Rust
| Feature | TypeScript ContRAG | Rust ContRAG (ICP) |
|---|---|---|
| Data Sources | PostgreSQL, MongoDB | Canister state, Stable memory |
| Vector Storage | Weaviate, pgvector | On-chain (stable memory) |
| Embedders | OpenAI, Gemini | OpenAI, Gemini (HTTP outcalls) |
| Configuration | Config file + .env | Config file + .env |
| Schema Introspection | ✅ Automatic | ❌ Manual (trait impl) |
| Runtime | Node.js | WASM (ICP canister) |
| Storage Cost | External DB fees | ~$5/GB/year on-chain |
| Query Speed | Fast (indexed DB) | Fast (in-memory) |
| Web3 Native | ❌ | ✅ |
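To put the storage cost row in perspective, the following back-of-the-envelope sketch estimates the yearly cost of raw vector data at the ~$5/GB/year figure from the table. It assumes `f32` components (4 bytes each), decimal gigabytes, and no overhead for ids, source text, or data structures; `yearly_storage_cost_usd` is a hypothetical helper.

```rust
/// Estimated yearly cost in USD of storing `num_vectors` embeddings of
/// `dims` f32 components each, counting raw vector bytes only.
pub fn yearly_storage_cost_usd(num_vectors: u64, dims: u64, usd_per_gb_year: f64) -> f64 {
    let bytes = num_vectors * dims * 4; // 4 bytes per f32 component
    let gigabytes = bytes as f64 / 1e9; // decimal GB (assumption)
    gigabytes * usd_per_gb_year
}
```

With 1,000,000 vectors of 1,536 dimensions, the raw data is 6.144 GB, which comes to about $30.72 per year at $5/GB/year. Real usage would be somewhat higher once ids and chunk text are stored alongside the vectors.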
Use Cases
- DeFi Personalization: Build user profiles from transaction history
- NFT Recommendations: Suggest NFTs based on user preferences and activity
- DAO Governance: Context-aware proposal summaries
- Web3 Social: Personalized content feeds from on-chain data
- Gaming: Player profiles and achievement context
- Identity Systems: Rich user context from verifiable credentials
Limitations & Roadmap
Current Limitations
- ❌ No automatic schema introspection (requires manual RagEntity impl)
- ❌ Simple cosine similarity (no advanced indexing like HNSW)
- ❌ Single-canister vector storage (no distributed sharding yet)
- ⚠️ HTTP outcall costs (cycles consumed per API call)
Roadmap
- Derive macro for RagEntity (auto-implementation)
- HNSW indexing for faster similarity search
- Multi-canister vector sharding
- IPFS/Arweave vector storage adapter
- Preference tracking (port from TypeScript)
- Cross-chain data sources (Ethereum, Bitcoin)
- Built-in embedding caching in stable memory
Examples
See /examples/user-canister for a complete working example with:
- User and Order entities
- Relationship mapping
- Context building
- Vector storage and search
- Demo data seeding
Contributing
Contributions welcome! This is an experimental project bringing RAG to Web3.
License
MIT License - see LICENSE file
Acknowledgments
- Based on ContRAG TypeScript
- Built for the Internet Computer Protocol
- Inspired by the vision of AI-powered Web3 applications
Ready to unlock RAG in Web3?
Start building intelligent, context-aware canisters today!