brk_indexer
High-performance Bitcoin blockchain indexer with dual storage architecture
brk_indexer processes raw Bitcoin Core block data and creates efficient storage structures using both vectors (time-series) and key-value stores (lookups). It serves as the foundation of BRK's data pipeline, organizing all blockchain data into optimized formats for fast retrieval and analysis.
What it provides
- Dual Storage Architecture: Vectors for time-series data, key-value stores for lookups
- Memory Efficiency: ~5-6GB peak RAM usage during full blockchain indexing
- Incremental Processing: Resume from last indexed height with rollback protection
- Data Integrity: Collision detection and validation during indexing
- All Bitcoin Data Types: Complete support for blocks, transactions, inputs, outputs, and addresses
Key Features
Storage Strategy
Vector Storage (time-series data):
- Block metadata (height, timestamp, hash, difficulty, size)
- Transaction data (version, locktime, RBF flag, indices)
- Input/Output mappings and values
- Address bytes for all output types
- Efficient for range queries and analytics
Key-Value Storage (lookups):
- Block hash prefixes → heights
- Transaction ID prefixes → transaction indices
- Address byte hashes → type indices
- Fast point queries by hash or address
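The split between the two storage kinds, and the collision check that prefix keys make necessary, can be sketched with standard-library containers. This is an illustration only, not brk_indexer's implementation: `DualIndex`, `prefix_of`, and the 8-byte prefix width are hypothetical, and a `Vec` and `HashMap` stand in for the real vector and key-value backends.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical 8-byte key derived from a 32-byte hash (width is an assumption).
type Prefix = [u8; 8];

fn prefix_of(hash: &[u8; 32]) -> Prefix {
    let mut p = [0u8; 8];
    p.copy_from_slice(&hash[..8]);
    p
}

#[derive(Default)]
struct DualIndex {
    /// Vector side: the position in the Vec is the block height.
    height_to_hash: Vec<[u8; 32]>,
    /// Key-value side: hash prefix -> height.
    prefix_to_height: HashMap<Prefix, usize>,
}

impl DualIndex {
    /// Appends a hash at the next height.
    /// Returns Err(existing_height) if the prefix is already taken (a collision).
    fn push(&mut self, hash: [u8; 32]) -> Result<usize, usize> {
        let prefix = prefix_of(&hash);
        if let Some(&existing) = self.prefix_to_height.get(&prefix) {
            return Err(existing);
        }
        let height = self.height_to_hash.len();
        self.height_to_hash.push(hash);
        self.prefix_to_height.insert(prefix, height);
        Ok(height)
    }

    /// Point query: prefix lookup, then confirmation against the full hash.
    fn height_of(&self, hash: &[u8; 32]) -> Option<usize> {
        let height = *self.prefix_to_height.get(&prefix_of(hash))?;
        (self.height_to_hash[height] == *hash).then_some(height)
    }

    /// Range query over contiguous heights (panics if the range is out of bounds).
    fn hashes_in(&self, range: Range<usize>) -> &[[u8; 32]] {
        &self.height_to_hash[range]
    }
}
```

Keeping only a prefix in the key-value side keeps keys small, at the cost of having to confirm each match against the full hash stored in the vector.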
Performance Features
- Parallel Processing: Concurrent transaction and output processing using Rayon
- Batch Operations: Periodic commits every 1,000 blocks for optimal I/O
- Memory Efficiency: Optimized data structures minimize RAM usage
- Incremental Updates: Handles blockchain reorganizations automatically
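The periodic-commit idea can be sketched as follows. The `Sink` trait, `CountingSink`, and `index_range` are hypothetical stand-ins for a storage backend and the indexing loop; only the 1,000-block interval comes from the list above.

```rust
/// Commit interval taken from the figure quoted above.
const COMMIT_INTERVAL: u32 = 1_000;

/// Hypothetical storage backend: writes are staged, then made durable in batches.
trait Sink {
    fn stage(&mut self, height: u32);
    fn commit(&mut self);
}

/// Indexes heights in `start..end`, committing after every block whose
/// successor height is a multiple of COMMIT_INTERVAL, plus a final flush.
/// Returns the number of commits performed.
fn index_range<S: Sink>(sink: &mut S, start: u32, end: u32) -> u32 {
    let mut commits = 0;
    for height in start..end {
        sink.stage(height);
        if (height + 1) % COMMIT_INTERVAL == 0 {
            sink.commit();
            commits += 1;
        }
    }
    // Flush trailing blocks that did not land on an interval boundary.
    if end > start && end % COMMIT_INTERVAL != 0 {
        sink.commit();
        commits += 1;
    }
    commits
}

/// Minimal in-memory sink that only counts what it has seen.
#[derive(Default)]
struct CountingSink {
    staged: usize,
    committed: usize,
}

impl Sink for CountingSink {
    fn stage(&mut self, _height: u32) {
        self.staged += 1;
    }
    fn commit(&mut self) {
        self.committed = self.staged;
    }
}
```

Batching this way trades a bounded amount of rework after a crash (at most one interval of blocks) for far fewer flushes to disk.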
Address Type Support
Supports the following Bitcoin output types, including non-address outputs such as OP_RETURN:
- P2PK (65-byte and 33-byte), P2PKH, P2SH
- P2WPKH, P2WSH, P2TR, P2A
- P2MS (multisig), OpReturn, Empty, Unknown
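To make the categories concrete, here is a rough classifier over raw output scripts using the standard byte templates for each type. It is a sketch, not the crate's own logic: `OutputType` and `classify` are hypothetical names, and the P2MS check is deliberately simplified to "ends with OP_CHECKMULTISIG" rather than validating the full m-of-n layout.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputType {
    P2PK65,
    P2PK33,
    P2PKH,
    P2SH,
    P2WPKH,
    P2WSH,
    P2TR,
    P2A,
    P2MS,
    OpReturn,
    Empty,
    Unknown,
}

/// Classifies a scriptPubKey by its well-known byte template.
fn classify(script: &[u8]) -> OutputType {
    use OutputType::*;
    match script {
        [] => Empty,
        // OP_RETURN ...
        [0x6a, ..] => OpReturn,
        // PUSH65 <uncompressed pubkey> OP_CHECKSIG
        [0x41, .., 0xac] if script.len() == 67 => P2PK65,
        // PUSH33 <compressed pubkey> OP_CHECKSIG
        [0x21, .., 0xac] if script.len() == 35 => P2PK33,
        // OP_DUP OP_HASH160 PUSH20 <hash> OP_EQUALVERIFY OP_CHECKSIG
        [0x76, 0xa9, 0x14, .., 0x88, 0xac] if script.len() == 25 => P2PKH,
        // OP_HASH160 PUSH20 <hash> OP_EQUAL
        [0xa9, 0x14, .., 0x87] if script.len() == 23 => P2SH,
        // OP_0 PUSH20 <hash>
        [0x00, 0x14, ..] if script.len() == 22 => P2WPKH,
        // OP_0 PUSH32 <hash>
        [0x00, 0x20, ..] if script.len() == 34 => P2WSH,
        // OP_1 PUSH32 <x-only key>
        [0x51, 0x20, ..] if script.len() == 34 => P2TR,
        // OP_1 PUSH2 0x4e73 (pay-to-anchor)
        [0x51, 0x02, 0x4e, 0x73] => P2A,
        // Simplified: anything else ending in OP_CHECKMULTISIG
        [.., 0xae] => P2MS,
        _ => Unknown,
    }
}
```

Arm order matters: the fixed-length templates are tested before the loose P2MS fallback so that a witness program whose last byte happens to be `0xae` is not misclassified.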
Usage
Basic Indexing
use Indexer;
use Parser;
use ;
use Exit;
// Setup Bitcoin Core RPC connection
let rpc = Boxleak;
// Create parser for Bitcoin Core block files
let parser = new;
// Create indexer with forced import (resets if needed)
let mut indexer = forced_import?;
// Setup graceful shutdown handler
let exit = new;
exit.set_ctrlc_handler;
// Index the blockchain
let indexes = indexer.index?;
println!;
Continuous Indexing
use ;
use sleep;
// Continuous indexing loop for real-time updates
loop
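The general shape of such a polling loop can be sketched with the standard library alone. Everything here is an assumption for illustration: `run_until_exit` is a hypothetical helper, the closure stands in for one indexing pass, and an `AtomicBool` stands in for the shutdown signal.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread::sleep;
use std::time::Duration;

/// Repeatedly runs one indexing pass, pausing `interval` between passes,
/// until `exit` is set or a pass fails. Returns the last height reported.
fn run_until_exit<F>(
    mut index_once: F,
    exit: &AtomicBool,
    interval: Duration,
) -> Result<Option<u32>, String>
where
    F: FnMut() -> Result<u32, String>,
{
    let mut last_height = None;
    while !exit.load(Ordering::SeqCst) {
        // One pass: index whatever blocks arrived since the previous pass.
        last_height = Some(index_once()?);
        // Wait before polling for new blocks again.
        sleep(interval);
    }
    Ok(last_height)
}
```

Checking the exit flag only between passes means a pass is never abandoned halfway, which keeps the on-disk state consistent at the cost of a slightly delayed shutdown.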
Accessing Indexed Data
// Access the underlying storage structures
let vecs = &indexer.vecs;
let stores = &indexer.stores;
// Get block hash at specific height
let block_hash = vecs.height_to_blockhash.get?;
// Look up transaction by prefix
let tx_prefix = from;
let tx_index = stores.txidprefix_to_txindex.get?;
// Get address data
let address_hash = from;
let type_index = stores.addressbyteshash_to_anyaddressindex.get?;
Performance Characteristics
Benchmarked on MacBook Pro M3 Pro (36GB RAM):
- Full blockchain sync (to ~892k blocks): 7-8 hours
- Peak memory usage: 5-6GB
- Storage overhead: ~27% of Bitcoin Core block size
- Incremental updates: resume from the last indexed height instead of reprocessing the chain
Data Organization
The indexer creates this storage structure:
brk_data/
├── indexed/
│ ├── vecs/ # Vector storage
│ │ ├── height_to_* # Height-indexed data
│ │ ├── txindex_to_* # Transaction-indexed data
│ │ └── outputindex_to_* # Output-indexed data
│ └── stores/ # Key-value stores
│ ├── hash_lookups/ # Block/TX hash mappings
│ └── address_maps/ # Address type mappings
└── metadata/ # Versioning and state
Indexes Tracking
The indexer maintains current indices during processing:
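One way to picture such running indices is a plain struct of counters that advances once per block. The field names below are hypothetical, loosely following the `height_to_*`, `txindex_to_*`, and `outputindex_to_*` naming used in the storage layout; the real type may track more or different indices.

```rust
/// Hypothetical set of "next free index" counters kept while indexing.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct Indexes {
    /// Next block height to be indexed.
    height: u32,
    /// Index the next transaction will receive.
    txindex: u64,
    /// Index the next input will receive.
    inputindex: u64,
    /// Index the next output will receive.
    outputindex: u64,
}

impl Indexes {
    /// Advances all counters after one block has been processed.
    fn advance(&mut self, txs: u64, inputs: u64, outputs: u64) {
        self.height += 1;
        self.txindex += txs;
        self.inputindex += inputs;
        self.outputindex += outputs;
    }
}
```

Because each counter is simply the length of the corresponding vector, a saved copy of this struct is enough to know where to resume.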
Requirements
- Bitcoin Core node with RPC enabled
- Block file access to ~/.bitcoin/blocks/
- Storage space: Minimum 500GB (scales with blockchain growth)
- Memory: 8GB+ RAM recommended
- CPU: Multi-core recommended for parallel processing
Rollback and Recovery
- Automatic rollback on interruption or blockchain reorgs
- State persistence for efficient restart
- Version management for storage format compatibility
- Graceful shutdown with Ctrl+C handling
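With append-only vectors, rolling back to a height amounts to truncating each vector at the right position. The sketch below assumes that layout and uses hypothetical names (`Vecs`, `push_block`, `rollback_to`); it is not the crate's recovery code.

```rust
#[derive(Default)]
struct Vecs {
    /// height -> index of that block's first transaction.
    height_to_first_txindex: Vec<usize>,
    /// txindex -> a per-transaction value (locktime, as an example).
    txindex_to_locktime: Vec<u32>,
}

impl Vecs {
    /// Appends one block's transactions.
    fn push_block(&mut self, locktimes: &[u32]) {
        self.height_to_first_txindex
            .push(self.txindex_to_locktime.len());
        self.txindex_to_locktime.extend_from_slice(locktimes);
    }

    /// Discards every block at or above `height`, e.g. after a reorg
    /// or an interrupted write. Does nothing if `height` is past the tip.
    fn rollback_to(&mut self, height: usize) {
        if let Some(&first_tx) = self.height_to_first_txindex.get(height) {
            // Drop the transactions of the discarded blocks first,
            // then the blocks themselves.
            self.txindex_to_locktime.truncate(first_tx);
            self.height_to_first_txindex.truncate(height);
        }
    }
}
```

After a rollback, indexing simply resumes by pushing the replacement block at the same height.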
Dependencies
- brk_parser - Bitcoin block parsing and sequential access
- brk_store - Key-value storage wrapper (fjall-based)
- vecdb - Vector database for time-series storage
- bitcoin - Bitcoin protocol types and parsing
- rayon - Parallel processing framework
- bitcoincore_rpc - Bitcoin Core RPC client
This README was generated by Claude Code