# atelier-data

Market data infrastructure for the atelier-rs trading engine.
This crate provides everything needed to connect to cryptocurrency exchanges, normalise their heterogeneous WebSocket feeds into a common data model, synchronise events onto a uniform time grid, and persist the result to Apache Parquet files.
## Core Data Types

Off-chain activity (market microstructure):

| Type | Description |
|---|---|
| `Orderbook` | Full-depth limit order book snapshot (bid/ask levels) |
| `OrderbookDelta` | Incremental order book maintained via `NormalizedDelta` updates |
| `Trade` | Public trade execution (price, size, side, timestamp) |
| `Liquidation` | Forced liquidation event |
| `FundingRate` | Perpetual futures funding rate observation |
| `OpenInterest` | Aggregate open interest snapshot |
Composed types:

| Type | Description |
|---|---|
| `MarketSnapshot` | Time-aligned bundle of all market data for one grid period |
| `MarketAggregate` | 15-scalar feature vector derived from a `MarketSnapshot` |
## Exchange Sources
| Source | Kind | API | Order Books | Public Trades | Liquidations | Funding Rates | Open Interest |
|---|---|---|---|---|---|---|---|
| Bybit | CEX | WSS | YES / YES | YES / YES | YES / YES | YES / YES | YES / YES |
| Coinbase | CEX | WSS | YES / YES | YES / YES | — | — | — |
| Kraken | CEX | WSS | YES / YES | YES / YES | — | — | — |
Format: Implemented / Tested. Dashes indicate the exchange does not expose the data type on its spot/linear WebSocket API.
## Workers

Two worker types handle end-to-end data collection:

- **`DataWorker`** — raw event ingestion without synchronisation. Connects to a live exchange WebSocket feed, decodes events, and delivers them through a pluggable `OutputSink` pipeline. Configuration is driven by a TOML manifest (`DataWorkerManifest`). Reconnection, backoff, health monitoring, and gap detection are handled automatically.
- **`MarketWorker`** — synchronised market snapshots. Extends `DataWorker`'s ingestion with a `MarketSynchronizer` that bins heterogeneous events onto a uniform nanosecond grid, producing `MarketSnapshot` objects at each tick. Multiple `ClockMode` strategies are supported: `OrderbookDriven`, `TradeDriven`, `LiquidationDriven`, and `ExternalClock`. Snapshots are delivered through the same `OutputSink` pipeline and can be flushed to Parquet automatically.
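A `DataWorkerManifest` is a TOML file. The fragment below is purely illustrative of what such a manifest might contain; every key and section name here is an assumption, not the crate's actual schema — consult the crate docs and example configs for the real fields:

```toml
# Hypothetical DataWorkerManifest sketch — keys are illustrative only.
[worker]
exchange = "bybit"        # which exchange source to connect to
symbols  = ["BTCUSDT"]    # instruments to subscribe to

[reconnect]
max_retries = 5           # give up after this many failed reconnects
backoff_ms  = 1000        # initial backoff, typically grows per retry
```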
## Output Sinks

The `OutputSink` trait defines where worker output goes. Multiple sinks can run simultaneously via `OutputSinkSet` (fan-out):

| Sink | Status | Description |
|---|---|---|
| `ChannelSink` | Working | Wraps `TopicRegistry` broadcast channels for pub/sub |
| `TerminalSink` | Working | Debug/tracing terminal output |
| `ParquetSink` | Working | Buffers `MarketSnapshot`s, decomposes and flushes to per-datatype Parquet files |
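Conceptually, the fan-out works like the minimal sketch below. This is a simplified, synchronous illustration with hypothetical signatures — the crate's actual trait is typed over its event model and richer than this:

```rust
// Illustrative sketch of a fan-out sink set; names and signatures
// are assumptions, not atelier-data's real API.
trait OutputSink {
    fn deliver(&mut self, event: &str);
}

/// Prints each event to the terminal (cf. TerminalSink).
struct TerminalSink;
impl OutputSink for TerminalSink {
    fn deliver(&mut self, event: &str) {
        println!("[terminal] {event}");
    }
}

/// Buffers events in memory (cf. ParquetSink's buffering stage).
struct BufferSink {
    buf: Vec<String>,
}
impl OutputSink for BufferSink {
    fn deliver(&mut self, event: &str) {
        self.buf.push(event.to_string());
    }
}

/// Fan-out: every registered sink receives every event.
struct OutputSinkSet {
    sinks: Vec<Box<dyn OutputSink>>,
}
impl OutputSinkSet {
    fn deliver(&mut self, event: &str) {
        for sink in &mut self.sinks {
            sink.deliver(event);
        }
    }
}

fn main() {
    let mut set = OutputSinkSet {
        sinks: vec![
            Box::new(TerminalSink),
            Box::new(BufferSink { buf: Vec::new() }),
        ],
    };
    set.deliver("trade: BTCUSDT 42000.5 x 0.01");
}
```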
## Parquet Persistence

Requires `--features parquet`. All five data types support read and write:

| Data Type | Writer | Reader |
|---|---|---|
| Orderbooks | `write_ob_parquet` | `read_ob_parquet` |
| Trades | `write_trades_parquet_timestamped` | `read_trades_parquet` |
| Liquidations | `write_liquidations_parquet_timestamped` | `read_liquidations_parquet` |
| Funding Rates | `write_funding_parquet_timestamped` | `read_funding_parquet` |
| Open Interest | `write_oi_parquet_timestamped` | `read_oi_parquet` |
## Filename Convention

All timestamped writers produce files following this pattern:

`{SYMBOL}_{DATATYPE}_{MODE}_{TIMESTAMP}.parquet`

where `MODE` is `"sync"` for grid-aligned data or `"raw"` for unprocessed captures. Symbols containing `/` (e.g. Kraken's `BTC/USDT`) are sanitised to `-` in the filename (`BTC-USDT`) while the Parquet data retains the original symbol string. Examples:

- `BTCUSDT_ob_sync_20260226_153000.123.parquet`
- `ETHUSDT_trades_raw_20260226_160000.456.parquet`
- `BTC-USDT_ob_sync_20260226_153000.123.parquet`

Files are organised into subdirectories per data type: `orderbooks/`, `trades/`, `liquidations/`, `fundings/`, `open_interests/`.
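The convention above can be sketched as a small helper. This is a standalone illustration derived from the pattern described in this section; `parquet_filename` is a hypothetical name, not a function exported by the crate:

```rust
/// Build a timestamped Parquet filename following the
/// {SYMBOL}_{DATATYPE}_{MODE}_{TIMESTAMP}.parquet convention.
/// Symbols containing '/' (e.g. Kraken's "BTC/USDT") are sanitised
/// to '-' so the filename stays filesystem-safe; the Parquet data
/// itself keeps the original symbol string.
fn parquet_filename(symbol: &str, datatype: &str, mode: &str, timestamp: &str) -> String {
    let safe_symbol = symbol.replace('/', "-");
    format!("{safe_symbol}_{datatype}_{mode}_{timestamp}.parquet")
}

fn main() {
    // Kraken-style symbol: '/' becomes '-' in the filename only.
    assert_eq!(
        parquet_filename("BTC/USDT", "ob", "sync", "20260226_153000.123"),
        "BTC-USDT_ob_sync_20260226_153000.123.parquet"
    );
    // Bybit-style symbol passes through unchanged.
    assert_eq!(
        parquet_filename("ETHUSDT", "trades", "raw", "20260226_160000.456"),
        "ETHUSDT_trades_raw_20260226_160000.456.parquet"
    );
}
```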
## Feature Flags

| Flag | Effect |
|---|---|
| `parquet` | Enables Apache Parquet I/O (adds `arrow` + `parquet` deps) |
| `torch` | Enables `tch`-based tensor conversion in the `datasets` module |
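A dependent crate opts into a flag through its own `Cargo.toml` in the usual Cargo way; the version below is a placeholder, not a pinned release:

```toml
[dependencies]
# Enable Parquet I/O; add "torch" as well for tensor conversion.
atelier_data = { version = "*", features = ["parquet"] }
```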
## Examples

| Example | Description | Command |
|---|---|---|
| `run_data_worker` | Raw event ingestion via `DataWorker` | `cargo run -p atelier_data --example run_data_worker -- --config <path>` |
| `run_market_worker` | Synchronised snapshots to Parquet via `MarketWorker` | `cargo run -p atelier_data --example run_market_worker --features parquet -- --config <path>` |
| `read_market_worker` | Read Parquet files and print per-symbol stats | `cargo run -p atelier_data --example read_market_worker --features parquet -- --dir <path>` |
| `bybit_markets` | Bybit market snapshot collection (standalone) | `cargo run -p atelier_data --example bybit_markets --features parquet -- --config <path>` |
| `coinbase_markets` | Coinbase market snapshot collection | `cargo run -p atelier_data --example coinbase_markets --features parquet -- --config <path>` |
| `kraken_markets` | Kraken market snapshot collection | `cargo run -p atelier_data --example kraken_markets --features parquet -- --config <path>` |
| `market_load` | Load and verify the most recent Parquet files | `cargo run -p atelier_data --example market_load --features parquet -- --config <path>` |
| `market_fetch` | Multi-exchange raw stream collector (Bybit/Coinbase/Kraken) | `cargo run -p atelier_data --example market_fetch --features parquet` |
| `multi_sync_workers` | Multi-worker manifest parser (stub) | `cargo run -p atelier_data --example multi_sync_workers -- --config <path>` |
atelier-data is a member of the atelier-rs workspace.