blockchain-zc-parser
A zero-copy, allocation-free Rust parser for Bitcoin blockchain binary data, designed for high-throughput indexers, analytics engines, and embedded environments.
Features
| Zero-copy | All parsed structures borrow &'a [u8] directly from the input — no memcpy, no String, no Vec. |
| No alloc | Compatible with #![no_std] targets. Use in embedded devices, WASM, kernel modules. |
| Streaming | BlockTxIter and TransactionParser process transactions lazily via closures — never load an entire block into structured memory. |
| Fast | Parsing an 80-byte block header requires only ~10 integer reads from a contiguous buffer. Block file iteration is a tight loop over magic bytes and size fields. |
| Safe | unsafe is used only inside cursor.rs for pointer arithmetic after explicit bounds checks. Every unsafe block is annotated. |
Supported formats
- Block headers (80 bytes, Bitcoin protocol)
- Legacy and SegWit (BIP 141) transactions
- Bitcoin script pattern matching: P2PKH, P2SH, P2WPKH, P2WSH, P2TR, P2PK, OP_RETURN, bare multisig
- blkNNNNN.dat raw block files written by Bitcoin Core
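To give a feel for what script pattern matching involves, here is a minimal sketch that classifies a scriptPubKey by its standard byte template. `ScriptKind` and `classify` are hypothetical names used only for illustration; they are not the crate's `ScriptType` API, and P2PK and bare multisig are left out to keep the sketch short.

```rust
/// Hypothetical stand-in for a script type enum (not the crate's `ScriptType`).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ScriptKind {
    P2pkh,
    P2sh,
    P2wpkh,
    P2wsh,
    P2tr,
    OpReturn,
    Unknown,
}

/// Classify a scriptPubKey by matching its fixed byte template.
/// The input is only inspected, never copied.
pub fn classify(script: &[u8]) -> ScriptKind {
    match script {
        // OP_DUP OP_HASH160 <push 20> ... OP_EQUALVERIFY OP_CHECKSIG
        [0x76, 0xa9, 0x14, .., 0x88, 0xac] if script.len() == 25 => ScriptKind::P2pkh,
        // OP_HASH160 <push 20> ... OP_EQUAL
        [0xa9, 0x14, .., 0x87] if script.len() == 23 => ScriptKind::P2sh,
        // OP_0 <push 20> (witness v0, 20-byte program)
        [0x00, 0x14, ..] if script.len() == 22 => ScriptKind::P2wpkh,
        // OP_0 <push 32> (witness v0, 32-byte program)
        [0x00, 0x20, ..] if script.len() == 34 => ScriptKind::P2wsh,
        // OP_1 <push 32> (witness v1, 32-byte program)
        [0x51, 0x20, ..] if script.len() == 34 => ScriptKind::P2tr,
        // OP_RETURN followed by arbitrary data
        [0x6a, ..] => ScriptKind::OpReturn,
        _ => ScriptKind::Unknown,
    }
}
```

Because every template is a fixed prefix, suffix, and length, classification needs no allocation and no script interpretation.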
Quick start
[dependencies]
blockchain-zc-parser = "0.1"
Parse a block header
use ;
Example: parse a raw block file
Download a raw block (example: Bitcoin genesis block):
Run the example parser:
Summary-only mode (no per-transaction printing):
Limit printed transactions:
Print a specific transaction index:
Raw block vs blkNNNNN.dat
There are two different binary formats you may encounter:
Raw block (.bin, RPC, mempool API)
This is the pure Bitcoin block payload:
[80-byte header]
[varint tx_count]
[transactions...]
It contains no magic bytes and no size prefix.
You typically obtain it via:
This format can be parsed directly with:
let = new?;
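To make the raw block layout concrete, the sketch below splits a buffer into the 80-byte header fields, the CompactSize transaction count, and the remaining transaction bytes, using only the standard library. `RawHeader`, `read_compact_size`, and `split_raw_block` are hypothetical names for illustration and do not reflect the crate's `BlockHeader` or `BlockTxIter` API.

```rust
/// Fields of an 80-byte block header. The two hashes borrow from the input.
/// Hypothetical type for illustration (not the crate's `BlockHeader`).
pub struct RawHeader<'a> {
    pub version: i32,
    pub prev_block: &'a [u8],  // 32 bytes
    pub merkle_root: &'a [u8], // 32 bytes
    pub time: u32,
    pub bits: u32,
    pub nonce: u32,
}

/// Copy `N` bytes starting at `at` into a fixed array, with bounds checking.
fn le_bytes<const N: usize>(buf: &[u8], at: usize) -> Option<[u8; N]> {
    let s = buf.get(at..at.checked_add(N)?)?;
    let mut out = [0u8; N];
    out.copy_from_slice(s);
    Some(out)
}

/// Decode a Bitcoin CompactSize integer. Returns (value, bytes consumed).
pub fn read_compact_size(buf: &[u8]) -> Option<(u64, usize)> {
    match *buf.first()? {
        n @ 0..=0xfc => Some((n as u64, 1)),
        0xfd => Some((u16::from_le_bytes(le_bytes(buf, 1)?) as u64, 3)),
        0xfe => Some((u32::from_le_bytes(le_bytes(buf, 1)?) as u64, 5)),
        0xff => Some((u64::from_le_bytes(le_bytes(buf, 1)?), 9)),
    }
}

/// Split a raw block into (header, tx_count, transaction bytes).
pub fn split_raw_block(block: &[u8]) -> Option<(RawHeader<'_>, u64, &[u8])> {
    let header = block.get(..80)?;
    let (tx_count, used) = read_compact_size(block.get(80..)?)?;
    let hdr = RawHeader {
        version: i32::from_le_bytes(le_bytes(header, 0)?),
        prev_block: &header[4..36],
        merkle_root: &header[36..68],
        time: u32::from_le_bytes(le_bytes(header, 68)?),
        bits: u32::from_le_bytes(le_bytes(header, 72)?),
        nonce: u32::from_le_bytes(le_bytes(header, 76)?),
    };
    Some((hdr, tx_count, &block[80 + used..]))
}
```

All multi-byte integers are little-endian, and the only bytes copied are the handful that make up each integer.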
Bitcoin Core blkNNNNN.dat
Files in your local Bitcoin Core data directory:
~/.bitcoin/blocks/blk00000.dat
Each file contains multiple blocks, each prefixed by:
[4-byte magic][4-byte little-endian size][raw block]
To parse these files, use BlkFileIter:
use ;
let mut it = new;
while let Some = it.next_block?
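For intuition, the `[magic][size][raw block]` framing can also be walked by hand. The sketch below is a standard-library-only iterator that yields each raw block as a borrowed slice. `BlkRecords` is a hypothetical name and not the crate's `BlkFileIter`; the sketch assumes mainnet magic bytes and simply stops at the first position that does not start with them, which also covers zero padding at the end of a file.

```rust
/// Mainnet network magic as stored on disk.
pub const MAINNET_MAGIC: [u8; 4] = [0xf9, 0xbe, 0xb4, 0xd9];

/// Iterator over `[magic][size][raw block]` records.
/// Hypothetical type for illustration (not the crate's `BlkFileIter`).
pub struct BlkRecords<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> BlkRecords<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0 }
    }
}

impl<'a> Iterator for BlkRecords<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        let rest = self.data.get(self.pos..)?;
        // An 8-byte record prefix is required: 4 bytes magic + 4 bytes size.
        let head = rest.get(..8)?;
        if head[..4] != MAINNET_MAGIC[..] {
            return None; // padding or unknown data: stop iterating
        }
        let size = u32::from_le_bytes([head[4], head[5], head[6], head[7]]) as usize;
        let end = 8usize.checked_add(size)?;
        // A truncated final record yields `None` instead of panicking.
        let block = rest.get(8..end)?;
        self.pos += end;
        Some(block)
    }
}
```

Each yielded slice is a raw block in the format described above, so it could be handed to a raw-block parser without copying.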
Important
If you pass a blkNNNNN.dat file directly to BlockTxIter::new, parsing will fail
because the file contains magic bytes and size prefixes.
The parse_block example automatically detects and unwraps the first block
from a blkNNNNN.dat file if necessary.
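One plausible way to perform this kind of detection is to check whether the input starts with the network magic. The sketch below is an assumption about how such a check could look, not the example's actual code: `unwrap_first_block` is a hypothetical helper that handles only mainnet magic and treats any other input as an already-raw block.

```rust
/// Mainnet network magic as stored on disk.
const MAINNET_MAGIC: [u8; 4] = [0xf9, 0xbe, 0xb4, 0xd9];

/// Return the first raw block in `data`.
/// If `data` starts with the magic bytes it is treated as blkNNNNN.dat framing;
/// otherwise it is assumed to be a raw block and returned unchanged.
/// Hypothetical helper for illustration only.
pub fn unwrap_first_block(data: &[u8]) -> Option<&[u8]> {
    if data.len() >= 8 && data[..4] == MAINNET_MAGIC[..] {
        let size = u32::from_le_bytes([data[4], data[5], data[6], data[7]]) as usize;
        // Framed input: exactly `size` bytes follow the 8-byte prefix.
        data.get(8..8usize.checked_add(size)?)
    } else {
        // No magic: assume the caller passed a raw block.
        Some(data)
    }
}
```

The check relies on a raw block starting with its version field, which in practice never equals the magic bytes.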
Why zero-copy matters
Bitcoin blocks routinely reach 1–2 MB (the SegWit weight limit caps a serialized block at roughly 4 MB) and may contain thousands of transactions.
A traditional parser typically:
- Allocates Vecs for inputs and outputs
- Copies script bytes into owned buffers
- Builds full in-memory representations
blockchain-zc-parser avoids all of this.
Every parsed structure borrows directly from the original &[u8] buffer.
No heap allocations. No memcpy. No string building.
This has several practical consequences:
- High throughput (hundreds of MB/s on modern CPUs)
- Very low memory usage
- Suitable for streaming, indexers, and embedded environments
- Works naturally with memory-mapped files (mmap)
For indexers and blockchain analytics pipelines, this allows processing entire block files with near-linear memory access patterns.
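The borrowing idea can be illustrated with a single transaction output. In this minimal sketch, `OutputView` and `parse_output` are hypothetical names (not the crate's `TxOutput` API), and script lengths of 0xfd and above are rejected to keep the example short. The script field is a sub-slice of the input, so no script bytes are copied.

```rust
/// A transaction output whose script borrows from the input buffer.
/// Hypothetical type for illustration (not the crate's `TxOutput`).
#[derive(Debug, PartialEq)]
pub struct OutputView<'a> {
    /// Amount in satoshis.
    pub value: u64,
    /// Borrowed scriptPubKey bytes.
    pub script_pubkey: &'a [u8],
}

/// Parse one output at the start of `buf`.
/// Returns the borrowed view and the bytes that follow it.
/// Simplification: only single-byte script lengths (< 0xfd) are supported.
pub fn parse_output(buf: &[u8]) -> Option<(OutputView<'_>, &[u8])> {
    // 8-byte little-endian value.
    let mut value = [0u8; 8];
    value.copy_from_slice(buf.get(..8)?);
    // Script length, then the script itself.
    let len = *buf.get(8)?;
    if len >= 0xfd {
        return None; // multi-byte CompactSize not handled in this sketch
    }
    let end = 9 + len as usize;
    let script = buf.get(9..end)?;
    let view = OutputView {
        value: u64::from_le_bytes(value),
        script_pubkey: script,
    };
    Some((view, &buf[end..]))
}
```

Comparing `script_pubkey.as_ptr()` with the address of byte 9 in the input confirms that the view points into the original buffer rather than into a copy.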
Stream transactions from a block
use ;
Iterate over a Bitcoin Core blkNNNNN.dat file
use ;
Architecture
src/
├── lib.rs — crate root, re-exports
├── cursor.rs — zero-copy Cursor<'a> over &[u8] ← start here
├── error.rs — ParseError enum, no_std compatible
├── hash.rs — Hash32<'a> / Hash20<'a> wrappers
├── script.rs — Script<'a>, ScriptType, instruction iterator
├── transaction.rs — TxInput, TxOutput, OutPoint, TransactionParser
└── block.rs — BlockHeader, BlockTxIter, BlkFileIter
The Cursor type is the single entry point for all parsing.
It advances a usize offset into a &'a [u8] and returns sub-slices with
lifetime 'a, the same lifetime as the original input, so parsed values can
outlive the cursor itself. No unsafe code exists outside cursor.rs.
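The cursor design described above can be sketched in a few lines of safe Rust. `SketchCursor` and `CursorError` are hypothetical names, and this version uses checked slicing instead of the `unsafe` fast path the crate describes. The point it shows is that returned slices carry the input lifetime `'a`, not the lifetime of the `&mut self` borrow.

```rust
/// Error type for the sketch (hypothetical, not the crate's `ParseError`).
#[derive(Debug, PartialEq, Eq)]
pub enum CursorError {
    UnexpectedEof,
}

/// A minimal zero-copy cursor over a byte slice.
pub struct SketchCursor<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> SketchCursor<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0 }
    }

    /// Number of unread bytes.
    pub fn remaining(&self) -> usize {
        self.data.len() - self.pos
    }

    /// Return the next `n` bytes as a sub-slice with the input lifetime `'a`.
    pub fn read_bytes(&mut self, n: usize) -> Result<&'a [u8], CursorError> {
        let end = self.pos.checked_add(n).ok_or(CursorError::UnexpectedEof)?;
        let slice = self.data.get(self.pos..end).ok_or(CursorError::UnexpectedEof)?;
        self.pos = end; // advance only after the bounds check succeeded
        Ok(slice)
    }

    /// Read a little-endian u32.
    pub fn read_u32_le(&mut self) -> Result<u32, CursorError> {
        let b = self.read_bytes(4)?;
        Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    }
}
```

Because `read_bytes` returns `&'a [u8]`, a caller can drop the cursor and keep using the slices it handed out, as long as the original buffer is still alive.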
Benchmarks
Run on an Apple M2 Pro (single-core, Rust stable 1.88 at time of measurement, --release):
| Benchmark | Throughput |
|---|---|
| block_header/parse_80_bytes | ~1.1 GB/s |
| transaction/parse/coinbase | ~860 MB/s |
| transaction/parse/p2pkh_2out | ~740 MB/s |
| block/streaming_iter/tx_count=1000 | ~695 MB/s |
Run the benchmarks yourself with cargo bench. The HTML report is written to target/criterion/report/index.html.
no_std usage
Disable the default feature set (which enables std):
[dependencies]
blockchain-zc-parser = { version = "0.1", default-features = false }
With default-features = false:
- All std::error::Error impls are removed.
- BlockHeader::block_hash() (requires SHA-256) is removed; call sha2::Sha256 directly on header.raw (twice, since a Bitcoin block hash is a double SHA-256).
- Everything else works identically.
Minimum supported Rust version (MSRV)
Rust 1.88+ (edition 2021). The crate uses only stable Rust features.
Safety
The only unsafe code lives in src/cursor.rs:
// SAFETY: `end` was checked to be ≤ data.len() on the line above.
let slice = unsafe ;
All other code is safe Rust. The crate passes cargo miri test (run it yourself with cargo +nightly miri test).
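The general pattern of an unchecked access that follows an explicit bounds check looks roughly like the sketch below. It is not the crate's actual code: `take` is a hypothetical free function, and the real implementation in cursor.rs may differ in detail.

```rust
/// Return `len` bytes of `data` starting at `start`, or `None` if out of range.
/// Hypothetical helper showing "bounds check first, then unchecked access".
pub fn take(data: &[u8], start: usize, len: usize) -> Option<&[u8]> {
    // Reject arithmetic overflow and out-of-range ends before touching memory.
    let end = start.checked_add(len)?;
    if end > data.len() {
        return None;
    }
    // SAFETY: `start <= end` because `checked_add` did not overflow, and
    // `end <= data.len()` was checked just above, so the range is in bounds.
    Some(unsafe { data.get_unchecked(start..end) })
}
```

Keeping the check and the `unsafe` block adjacent, with the invariant spelled out in a SAFETY comment, makes the block easy to audit and to exercise under Miri.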
Contributing
Pull requests are welcome. Please:
- Run:
cargo test
cargo clippy --all-targets --all-features -- -D warnings
- Add a unit test for any new parsing logic.
- Keep unsafe blocks minimal and documented.
License
Licensed under either of:
- Apache-2.0 (LICENSE-APACHE)