Crate blocks_iterator
source ·Expand description
Blocks iterator
Iterates over Bitcoin blocks, decoding data inside Bitcoin Core’s blocks directory.
Features:
- Blocks are returned in height order, it avoids following reorgs (see
Config::max_reorg
parameter) - Blocks come with extra data
BlockExtra
like all block’s previous outputs, it allows computing transactions fee or verifying scripts in the blockchain.
Iteration modes
In rust programs
Used as a library blocks could be iterated via the iter()
method like:
// "blocks" dir contains first 400 testnet blocks
let conf = blocks_iterator::Config::new("blocks", bitcoin::Network::Testnet);
let mut total_fee = 0u64;
for b in blocks_iterator::iter(conf) {
total_fee += b.fee().expect("fee available cause we are keeping prevouts");
}
// Only a bunch of tx with fee exists on testnet with height < 400
// in blocks: 385, 387, 389, 390, 392, 394
assert_eq!(total_fee, 450_000u64);
When the task to be performed is computational costly, like verifying spending conditions, it is
suggested to parallelize the execution like it’s done with rayon (or similar) in the
verify example
(note par_bridge()
call).
Through Pipes
Other than inside Rust programs, ordered blocks with previous outputs could be iterated using Unix pipes.
$ cargo build --release
$ cargo build --release --examples
$ ./target/release/blocks_iterator --blocks-dir ~/.bitcoin/testnet3/blocks --network testnet --max-reorg 40 --stop-at-height 200000 | ./target/release/examples/with_pipe
...
[2023-03-31T15:01:23Z INFO with_pipe] Max number of txs: 6287 block: 0000000000bc915505318327aa0f18568ce024702a024d7c4a3ecfe80a893d6c
[2023-03-31T15:01:23Z INFO with_pipe] total missing reward: 50065529986 in 100 blocks
[2023-03-31T15:01:23Z INFO with_pipe] most_output tx is 640e22b5ddee1f6d2d701e37877027221ba5b36027634a2e3c3ee1569b4aa179 with #outputs: 10001
If you have more consumer process you can concatenate pipes by passing stdout to PipeIterator::new
or using tee
utility to split the stdout of blocks_iterator. The latter is better because it doesn’t require re-serialization of the data.
Memory requirements and performance
Running (cargo run --release -- --network X --blocks-dir Y >/dev/null
) on threadripper 1950X,
Testnet @ 2130k, Mainnet @ 705k. Spinning disk. Take following benchmarks with a grain of salt
since they refer to older versions.
Network | --skip--prevout | --max-reorg | utxo-db | Memory | Time |
---|---|---|---|---|---|
Mainnet | true | 6 | no | 33MB | 1h:00m |
Mainnet | false | 6 | no | 5.3GB | 1h:29m |
Mainnet | false | 6 | 1 run | 201MB | 9h:42m |
Mainnet | false | 6 | 2 run | 113MB | 1h:05m |
Testnet | true | 40 | no | 123MB | 3m:03s |
Testnet | false | 40 | no | 1.4GB | 8m:02s |
Testnet | false | 40 | 1 run | 247MB | 16m:12s |
Testnet | false | 40 | 2 run | 221MB | 8m:32s |
Doc
To build docs:
RUSTDOCFLAGS="--cfg docsrs" cargo +nightly doc --all-features --open
Examples
Run examples with:
cargo run --release --example verify
- heaviest find the transaction with greatest weight
- most_output find the transaction with most output
- outputs_versions Count outputs witness version
- signatures_in_witness Count signatures in witness
- verify verify transactions in blocks using libbitcoin-consensus. Consumers are run in parallel fashion.
Version 1.0 meaning
The 1.0
is not to be intended as battle-tested production-ready library, the binary format of
BlockExtra
changed and I wanted to highlight it with major version rollout.
Similar projects
- bitcoin-iterate this project inspired blocks_iterator, the differences are:
- It is C-based
- It’s more suited for shell piping, while blocks_iterator can be used with piping but also as a rust library
- It doesn’t give previous outputs information
- It is making two passage over
blocks*.dat
serially, while blocks_iterator is doing two passes in parallel.
- rust-bitcoin-indexer this project requires longer setup times (about 9 hours indexing) and a postgre database, once the indexing is finished it allows fast queries on relational database.
- electrs specifically intended for the Electrum protocol
MSRV
Check minimum rust version run in CI, as of Aug 2023 is:
1.60.0
without features (needs some pinning, check CI)1.66.0
with features.
Re-exports
Structs
- The bitcoin block and additional metadata returned by the crate::iter() method
- Configuration parameters, most important the bitcoin blocks directory
- Before reorder we keep only the position of the block in the file system and data relative to the block hash, the previous hash and the following hash (populated during reorder phase) We will need to read the block from disk again, but by doing so we will avoid using too much memory in the
OutOfOrderBlocks
map. - Contains counter and instants to provide per period stats over transaction and blocks processed
- Utility used to return true after
period
- Iterator to use un Unix-style pipe composition when receiving BlockExtra from stdin and optionally propogating those to stdout
Enums
Functions
- Return an Iterator of
BlockExtra
read fromblocks*.dat
contained in theconfig.blocks_dir
Blocks returned are iterated in order, starting from the genesis to the highest block (minusconfig.max_reorg
) in the directory, unlessconfig.stop_at_height
is specified. - par_iterDeprecated
par_iter
is used when the task to be performed on the blockchain is more costly than iterating the blocks. For example verifying the spending conditions in the blockchain. likecrate::iter()
accepts configuration parameters via theConfig
struct. APREPROC
closure has to be provided, this process a single block and produce a Vec of user defined structDATA
. ATASK
closure accepts theDATA
struct and a sharedSTATE
and it is executed in a concurrent way. TheTASK
closure returns a bool which indicates if the execution should be terminated. Note that access toSTATE
inTASK
should be done carefully otherwise the contention would reduce the speed of execution. - periodic_log_levelDeprecatedUtility method usually returning log::Level::Debug but when
i
is divisible byevery
returns log::Level::Info