Crate jetstreamer


Jetstreamer is a high-throughput Solana backfilling and research toolkit designed to stream historical chain data live over the network from Project Yellowstone’s Old Faithful archive, a comprehensive open-source archive of all Solana blocks and transactions from genesis to the current tip of the chain. Given the right hardware and network connection, Jetstreamer can stream data at over 2.7M TPS to a local Jetstreamer plugin or Geyser plugin. That 2.7M TPS record was set on a 64-core CPU with a 30 Gbps+ network connection; higher speeds are possible with better hardware.

§Components

  • firehose exposes the underlying streaming primitives and async helpers for downloading, compacting, and replaying Old Faithful CAR archives at scale.
  • plugin provides a trait-driven framework for building structured firehose data observers with ClickHouse-friendly batching and runtime metrics.
  • utils hosts shared helpers used across the Jetstreamer ecosystem.

All of these crates are re-exported from this facade, so most applications need only a single dependency.

§Quick Start

Install the CLI by cloning the repository and running the bundled demo runner:

# Replay all transactions in epoch 800 using eight HTTP multiplexing workers.
JETSTREAMER_THREADS=8 cargo run --release -- 800

# Or replay an explicit slot range (slot ranges may cross epoch boundaries).
JETSTREAMER_THREADS=8 cargo run --release -- 358560000:367631999

The CLI accepts either <start>:<end> slot ranges or a single epoch. See JetstreamerRunner::parse_cli_args for the precise argument grammar.
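
The two argument shapes described above can be sketched as a small parser. This is an illustration only, not the crate's implementation: `ReplayTarget`, `parse_target`, and `slot_bounds` are hypothetical names, rejecting a range whose start exceeds its end is a choice made for this sketch, and the epoch-to-slot conversion assumes the mainnet epoch length of 432,000 slots. The authoritative grammar remains JetstreamerRunner::parse_cli_args.

```rust
/// Slots per epoch on Solana mainnet (an assumption of this sketch).
const SLOTS_PER_EPOCH: u64 = 432_000;

/// Hypothetical representation of what the CLI argument selects.
#[derive(Debug, PartialEq, Eq)]
enum ReplayTarget {
    /// A single epoch, e.g. `800`.
    Epoch(u64),
    /// An inclusive slot range, e.g. `358560000:367631999`.
    Slots { start: u64, end: u64 },
}

/// Parses either `<start>:<end>` or a single epoch number.
fn parse_target(arg: &str) -> Result<ReplayTarget, String> {
    match arg.split_once(':') {
        Some((start, end)) => {
            let start = start
                .trim()
                .parse::<u64>()
                .map_err(|e| format!("bad start slot: {e}"))?;
            let end = end
                .trim()
                .parse::<u64>()
                .map_err(|e| format!("bad end slot: {e}"))?;
            // Sketch-only validation: the real CLI may behave differently.
            if start > end {
                return Err(format!("start slot {start} is after end slot {end}"));
            }
            Ok(ReplayTarget::Slots { start, end })
        }
        None => arg
            .trim()
            .parse::<u64>()
            .map(ReplayTarget::Epoch)
            .map_err(|e| format!("bad epoch: {e}")),
    }
}

/// Returns the inclusive (first, last) slots covered by a target.
fn slot_bounds(target: &ReplayTarget) -> (u64, u64) {
    match *target {
        ReplayTarget::Epoch(epoch) => {
            let first = epoch * SLOTS_PER_EPOCH;
            (first, first + SLOTS_PER_EPOCH - 1)
        }
        ReplayTarget::Slots { start, end } => (start, end),
    }
}
```

Under these assumptions, the argument `800` maps to slots 345600000 through 346031999, and the range `358560000:367631999` from the example above is passed through unchanged.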

When JETSTREAMER_CLICKHOUSE_MODE is auto (the default) the runner inspects the DSN to decide whether to launch the bundled ClickHouse helper or connect to an external cluster. You can also manage that helper manually via the crate-level Cargo aliases:

cargo clickhouse-server
cargo clickhouse-client

cargo clickhouse-server launches the bundled binary in bin/, while cargo clickhouse-client opens a client session against the locally spawned helper. You can connect with the client while Jetstreamer is running, or re-launch the helper later to inspect the data persisted in bin/. Copying the bin/ directory between systems is a lightweight way to migrate ClickHouse state generated by the runner.

§Environment Variables

JetstreamerRunner honors several environment variables for runtime tuning:

  • JETSTREAMER_THREADS (default hardware auto-detect via jetstreamer_firehose::system::optimal_firehose_thread_count): number of firehose ingestion threads. Increase this to multiplex Old Faithful HTTP requests across more cores, or leave it unset to size the pool automatically using CPU and network heuristics.
  • JETSTREAMER_CLICKHOUSE_DSN (default http://localhost:8123): DSN passed to plugin instances that emit ClickHouse writes.
  • JETSTREAMER_CLICKHOUSE_MODE (default auto): controls ClickHouse integration. Accepted values are auto, remote, local, and off.

Additional firehose-specific knobs such as JETSTREAMER_COMPACT_INDEX_BASE_URL and JETSTREAMER_NETWORK live in jetstreamer_firehose.
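
As a rough sketch of how these variables and their documented defaults fit together, the snippet below reads them with the standard library only. `RunnerEnv` and both functions are hypothetical names, not part of the Jetstreamer API. Treating an unparsable or zero thread count as "auto" and lower-casing the mode are assumptions of this sketch rather than documented behavior.

```rust
use std::env;

/// Hypothetical snapshot of the runner-related environment variables.
#[derive(Debug, PartialEq, Eq)]
struct RunnerEnv {
    /// `None` means "size the pool automatically".
    threads: Option<usize>,
    clickhouse_dsn: String,
    clickhouse_mode: String,
}

/// Builds a `RunnerEnv` from any key lookup, which keeps the logic testable
/// without mutating the real process environment.
fn read_runner_env_with<F>(lookup: F) -> RunnerEnv
where
    F: Fn(&str) -> Option<String>,
{
    RunnerEnv {
        // Assumption: invalid or zero values fall back to auto-sizing.
        threads: lookup("JETSTREAMER_THREADS")
            .and_then(|v| v.trim().parse::<usize>().ok())
            .filter(|&n| n > 0),
        // Documented default: http://localhost:8123
        clickhouse_dsn: lookup("JETSTREAMER_CLICKHOUSE_DSN")
            .unwrap_or_else(|| "http://localhost:8123".to_string()),
        // Documented default: auto
        clickhouse_mode: lookup("JETSTREAMER_CLICKHOUSE_MODE")
            .map(|v| v.trim().to_ascii_lowercase())
            .unwrap_or_else(|| "auto".to_string()),
    }
}

/// Reads the same settings from the real process environment.
fn read_runner_env() -> RunnerEnv {
    read_runner_env_with(|key| env::var(key).ok())
}
```

Passing the lookup as a closure is a design choice for this sketch: it lets the defaults be checked in tests without touching global process state.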

§Limitations

While Jetstreamer is able to play back all blocks, transactions, epochs, and rewards in the history of Solana mainnet, it is limited by what is in Old Faithful. Old Faithful does not contain account updates, so Jetstreamer currently provides neither account updates nor transaction logs. We plan to eventually ship a separate project that provides them, so stay tuned!

Note that Old Faithful, and therefore Jetstreamer, stores transactions in their “already-executed” state, exactly as they appeared to Geyser when they were first executed. Jetstreamer can therefore replay ledger data, but it does not execute transactions itself. When we say 2.7M TPS, we mean “2.7M transactions processed by a Jetstreamer or Geyser plugin locally, streamed over the internet from the Old Faithful archive.”

§Configuration

The following environment variables are available across the Jetstreamer ecosystem:

§JetstreamerRunner Config

| Variable | Default | Effect |
|---|---|---|
| JETSTREAMER_CLICKHOUSE_DSN | http://localhost:8123 | HTTP(S) DSN passed to the embedded plugin runner for ClickHouse writes. Override to target a remote ClickHouse deployment. |
| JETSTREAMER_CLICKHOUSE_MODE | auto | Controls ClickHouse integration. auto enables output and spawns the helper only for local DSNs, remote enables output without spawning the helper, local always requests the helper, and off disables ClickHouse entirely. |
| JETSTREAMER_THREADS | auto | Number of firehose ingestion threads. Leave unset to rely on hardware-based sizing or override with an explicit value when you know the ideal concurrency. |

Helper spawning only occurs when both the mode allows it (auto/local) and the DSN points to localhost or 127.0.0.1.
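
The rule above can be written down as a small decision function. This is a minimal sketch under stated assumptions, not the runner's actual code: `ClickhouseMode`, `dsn_host`, `is_local_dsn`, `output_enabled`, and `should_spawn_helper` are hypothetical names, and the DSN host extraction is deliberately simplified (no IPv6 literals, no validation).

```rust
/// Hypothetical mirror of the four documented mode values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClickhouseMode {
    Auto,
    Remote,
    Local,
    Off,
}

/// Extracts the host from a DSN such as `http://user:pw@localhost:8123/db`.
/// Simplified for illustration: IPv6 literals are not handled.
fn dsn_host(dsn: &str) -> &str {
    // Drop the scheme, if any.
    let rest = dsn.split_once("://").map_or(dsn, |(_, r)| r);
    // Keep only the authority part (before any path or query).
    let authority = rest
        .split(|c: char| c == '/' || c == '?')
        .next()
        .unwrap_or("");
    // Drop optional `user:password@` credentials.
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, h)| h);
    // Drop the optional port.
    host_port.split(':').next().unwrap_or("")
}

/// True when the DSN points to localhost or 127.0.0.1.
fn is_local_dsn(dsn: &str) -> bool {
    matches!(dsn_host(dsn), "localhost" | "127.0.0.1")
}

/// ClickHouse output is enabled in every mode except `off`.
fn output_enabled(mode: ClickhouseMode) -> bool {
    mode != ClickhouseMode::Off
}

/// The helper is spawned only when the mode allows it (auto/local)
/// and the DSN is local.
fn should_spawn_helper(mode: ClickhouseMode, dsn: &str) -> bool {
    matches!(mode, ClickhouseMode::Auto | ClickhouseMode::Local) && is_local_dsn(dsn)
}
```

With the default DSN http://localhost:8123 and the default mode auto, this sketch reports that the helper should be spawned; switching the mode to remote, or pointing the DSN at another host, turns spawning off while leaving output enabled.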

§Firehose Config (also used by JetstreamerRunner)

| Variable | Default | Effect |
|---|---|---|
| JETSTREAMER_COMPACT_INDEX_BASE_URL | https://files.old-faithful.net | Base URL for compact CAR index artifacts. Point this at your own mirror to reduce load on the public archive. |
| JETSTREAMER_NETWORK | mainnet | Network suffix appended to cache namespaces and index filenames (e.g., testnet). |
| JETSTREAMER_NETWORK_CAPACITY_MB | 1000 | Assumed network throughput in megabytes per second used when auto-sizing firehose thread counts. |

Changing the network automatically segregates cache entries, allowing you to toggle between clusters without purging state.

§Epoch Feature Availability

Old Faithful snapshots expose different metadata across the network’s history. Use the table below to choose replay windows that match your requirements:

| Epoch range | Slot range | Comment |
|---|---|---|
| 0–156 | 0–? | Incompatible with modern Geyser plugins |
| 157+ | ? | Compatible with modern Geyser plugins |
| 0–449 | 0–194184610 | CU tracking not available (reported as 0) |
| 450+ | 194184611+ | CU tracking fully available |

Epochs at or above 157 work with the bundled Geyser plugin interface, while compute unit accounting first appears at epoch 450.
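
When choosing a replay window programmatically, the two thresholds can be encoded directly. The following is a sketch only: the constants come from the table in this section, while `EpochFeatures`, `epoch_features`, and `earliest_epoch` are hypothetical helpers that do not exist in the crate.

```rust
/// First epoch compatible with modern Geyser plugins (from the table above).
const FIRST_GEYSER_COMPATIBLE_EPOCH: u64 = 157;
/// First epoch with compute unit (CU) tracking (from the table above).
const FIRST_CU_TRACKING_EPOCH: u64 = 450;

/// Hypothetical summary of what a given epoch supports.
#[derive(Debug, PartialEq, Eq)]
struct EpochFeatures {
    geyser_compatible: bool,
    cu_tracking: bool,
}

/// Looks up the feature set for an epoch.
fn epoch_features(epoch: u64) -> EpochFeatures {
    EpochFeatures {
        geyser_compatible: epoch >= FIRST_GEYSER_COMPATIBLE_EPOCH,
        cu_tracking: epoch >= FIRST_CU_TRACKING_EPOCH,
    }
}

/// Returns the earliest epoch that satisfies the requested requirements.
fn earliest_epoch(need_geyser: bool, need_cu_tracking: bool) -> u64 {
    let geyser_floor = if need_geyser { FIRST_GEYSER_COMPATIBLE_EPOCH } else { 0 };
    let cu_floor = if need_cu_tracking { FIRST_CU_TRACKING_EPOCH } else { 0 };
    geyser_floor.max(cu_floor)
}
```

For instance, a job that needs compute unit figures should start no earlier than epoch 450, whereas a job that only needs Geyser compatibility can start at epoch 157.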

§Re-exports

pub use jetstreamer_firehose as firehose;
pub use jetstreamer_plugin as plugin;
pub use jetstreamer_utils as utils;

§Structs

  • Config: Runtime configuration for JetstreamerRunner.
  • JetstreamerRunner: Coordinates plugin execution against the firehose.

§Functions

  • parse_cli_args: Parses command-line arguments and environment variables into a Config.