skade

Skaði, the winter queen — she keeps the icebergs in order.

This repo is Skade: a fast, pure-Rust thin layer over iceberg-rust, backed by a ridiculously fast skade-katalog.

skade — the data plane: Arrow RecordBatch in, SQL out; gatling multi-core ingest, zero-copy recast, identity-partitioned + compressed writes, V3 tables. (skade/)
skade-katalog — the catalog: pure-Rust, redb-backed, single-file ACID iceberg::Catalog. No SQL, no C deps.

skade-katalog

The three Norns — Urðr (past), Verðandi (present), Skuld (future) — recording the world's icebergs in Catalogus Icebergorum / Glacius Type Index

A pure-Rust iceberg::Catalog with cross-table transactions. A lock-free static search tree (Ragnar STree64) out front, continuously regenerated from a redb ACID backend. In-process — no network hop, no JVM, no JSON round-trip.

table_exists — 2.0M ops/s, ~492× Nessie · ~677× Polaris (0.31 µs p50). Every core reads the front lock-free.
Ingest saturates the box — 100% CPU, fully multicore. All cores encode Parquet in parallel; a gatling no-barrier pool keeps the writes overlapping, even over S3.
ACID. Every mutation is one redb WriteTransaction. Atomic multi-table commits via RedbCatalog::atomic_release.
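As a rough illustration of what an all-or-nothing multi-table commit means, here is a minimal sketch using only the standard library. It is not the `RedbCatalog::atomic_release` API, whose signature is not shown here: `PointerSwap`, `CommitError` and `commit_all` are hypothetical names, and a plain `HashMap` stands in for the catalog's table-pointer map inside one write transaction.

```rust
use std::collections::HashMap;

/// Hypothetical: one table's part of a multi-table commit.
/// Move the table pointer from `expected` to `new`.
struct PointerSwap<'a> {
    table: &'a str,
    expected: &'a str,
    new: &'a str,
}

#[derive(Debug, PartialEq)]
enum CommitError {
    /// The table's current pointer no longer matches what the writer read.
    Conflict { table: String },
}

/// Apply every swap or none of them. `tables` is a stand-in for the
/// table_key -> metadata_location map seen inside a single write transaction.
fn commit_all(
    tables: &mut HashMap<String, String>,
    swaps: &[PointerSwap<'_>],
) -> Result<(), CommitError> {
    // Phase 1: validate every compare-and-swap before touching anything.
    for s in swaps {
        if tables.get(s.table).map(String::as_str) != Some(s.expected) {
            return Err(CommitError::Conflict { table: s.table.to_string() });
        }
    }
    // Phase 2: all checks passed, so apply the whole batch.
    for s in swaps {
        tables.insert(s.table.to_string(), s.new.to_string());
    }
    Ok(())
}

fn main() {
    let mut tables = HashMap::new();
    tables.insert("db.a".to_string(), "a1".to_string());
    tables.insert("db.b".to_string(), "b1".to_string());

    // The second swap carries a stale expectation, so neither table moves.
    let result = commit_all(
        &mut tables,
        &[
            PointerSwap { table: "db.a", expected: "a1", new: "a2" },
            PointerSwap { table: "db.b", expected: "b0", new: "b2" },
        ],
    );
    println!("{:?} -> {:?}", result, tables.get("db.a"));
}
```

In a real store the atomicity comes from the transaction itself; the two-phase loop above only mimics that behaviour in memory.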

Full leaderboard (skade-katalog vs Nessie vs Polaris, every storage variant, all four capabilities) → Benchmarks.

table_exists — catalog reads (ops/sec, log scale · higher is better)

Status

The Iceberg Catalog trait is fully implemented and tested against iceberg = "0.9.1". One gap: schema evolution (the upstream Transaction actions aren't public until iceberg-rust 0.10). See Known shortcomings.

skade — the data-plane companion crate

This repo also ships skade ("winter queen" — Skaði): fast Iceberg table writing/reading + ergonomic DataFusion SQL on top of RedbCatalog. One directory = catalog (catalog.redb) + warehouse; Arrow RecordBatch in, SQL out (wh.sql("SELECT … FROM a JOIN b …")). It packages the writer stack, read-to-Arrow primitive, schema bridges (incl. the unsigned widen/unwiden reinterprets znippy needs), and the windowed-Parquet bulk-ingest helpers that were proven in bench/. Like bench/, it is a detached cargo workspace (skade/), so this crate's manifest and lockfile stay untouched. See skade/README.md.

Read & write paths

Every call resolves left-to-right through these layers, falling through only on a miss. Latest-state reads ride the L1 pointer mirror; time-travel (snapshot-id) reads ride the Ragnar STree64 index. redb is always the source of truth behind them — the in-memory layers are keyed by immutable, content-addressed identifiers, so they can only be evicted, never stale.
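The fall-through and the "evicted, never stale" property can be sketched with standard-library types only. Everything in this block is a hypothetical stand-in, not skade-katalog code: `Catalog`, `Backend` and `Metadata` are illustrative names, plain `HashMap`s replace the `ArcSwap` mirror, the `moka` cache and redb, and the metadata locations are made up. The point it shows: because a commit writes a new file under a new location and only moves the pointer, an entry cached under the old location is still correct for that location.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a parsed metadata file. Files are written once
/// under a unique location and never modified afterwards.
#[derive(Debug, PartialEq)]
struct Metadata {
    schema_version: u32,
}

/// Stand-in for the durable side: pointer table plus immutable metadata files.
#[derive(Default)]
struct Backend {
    pointers: HashMap<String, String>, // table_key -> metadata_location
    files: HashMap<String, u32>,       // metadata_location -> file contents
}

#[derive(Default)]
struct Catalog {
    backend: Backend,
    mirror: HashMap<String, Arc<str>>,       // pointer mirror
    cache: HashMap<Arc<str>, Arc<Metadata>>, // metadata cache, keyed by location
    parses: usize,                           // counts cache misses, for demonstration
}

impl Catalog {
    /// Commit: write a new immutable file, move the pointer, write through to the mirror.
    fn commit(&mut self, key: &str, location: &str, schema_version: u32) {
        self.backend.files.insert(location.to_string(), schema_version);
        self.backend.pointers.insert(key.to_string(), location.to_string());
        self.mirror.insert(key.to_string(), Arc::from(location));
    }

    /// Latest-state pointer lookup: mirror first, backend only on a miss.
    fn location(&self, key: &str) -> Option<Arc<str>> {
        if let Some(loc) = self.mirror.get(key) {
            return Some(Arc::clone(loc));
        }
        self.backend.pointers.get(key).map(|loc| Arc::<str>::from(loc.as_str()))
    }

    /// Resolve metadata: pointer lookup, then cache, then read and "parse" the file.
    fn resolve_metadata(&mut self, key: &str) -> Option<Arc<Metadata>> {
        let loc = self.location(key)?;
        if let Some(hit) = self.cache.get(&loc) {
            return Some(Arc::clone(hit));
        }
        let schema_version = *self.backend.files.get(&*loc)?;
        self.parses += 1;
        let parsed = Arc::new(Metadata { schema_version });
        self.cache.insert(loc, Arc::clone(&parsed));
        Some(parsed)
    }
}

fn main() {
    let mut cat = Catalog::default();
    cat.commit("db.events", "meta/a.metadata.json", 1);
    let v1 = cat.resolve_metadata("db.events").unwrap();
    cat.resolve_metadata("db.events").unwrap(); // cache hit, no second parse
    cat.commit("db.events", "meta/b.metadata.json", 2);
    let v2 = cat.resolve_metadata("db.events").unwrap();
    println!("v1={:?} v2={:?} parses={}", v1, v2, cat.parses);
}
```

The old entry for `meta/a.metadata.json` stays in the cache after the second commit. It is never served for the latest state because the pointer no longer leads to it, so the only maintenance such a cache needs is eviction.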

Read paths as a circuit board: each catalog function enters from the left and is wired through the in-memory cache layers — the L1 pointer mirror, L0 metadata cache, L1.5 handle cache, and the Ragnar STree64 snapshot index — falling through to the redb source-of-truth rail only on a miss.

| layer | key → value | type | populated |
|---|---|---|---|
| L1 pointer mirror | `table_key → metadata_location` | `ArcSwap<imbl::HashMap<String, Arc<str>, foldhash>>` | full scan at open, then write-through |
| L1.5 handle cache | `metadata_location → Table` | `moka::Cache<String, Table>` (capacity-bounded) | on `load_table` miss |
| L0 metadata cache | `metadata_location → TableMetadata` | `moka::Cache<String, Arc<TableMetadata>>` (byte-bounded, single-flight) | on metadata miss |
| Ragnar static index | `snapshot_id → (table_key, metadata_location)` | `ArcSwapOption<STree64>` | warm-built at open, rebuilt by bg compactor (≥1024 commits) |
| redb source of truth | `tables` · `commits` · `namespaces` · `namespace_props` · `meta` | `Mutex<redb::Database>` (ACID) | every committed write |

The same routing in full — including the time-travel, write, and redb-direct paths the diagram leaves out:

LATEST READS
  table_exists(id) ───────► L1 mirror ──hit──► true
                                └─miss─► redb `tables` ─────► bool

  resolve_metadata(id) ───► L1 mirror ──► loc ──► L0 cache ──hit──► Arc<TableMetadata>
       (skips Table::build)     └─miss─► redb         └─miss─► FileIO read + parse JSON

  load_table(id) ─────────► L1 mirror ──► loc ──► L1.5 cache ──hit──► Table  (~100 ns clone)
                                └─miss─► redb          └─miss─► L0 cache ──► Table::build() + insert

TIME-TRAVEL READS  (by snapshot_id: i64)
  load_table_at / resolve_metadata_at(id, sid)
        ──► Ragnar STree64 ──hit──► loc ──► L1.5 / L0 ──► Table / Arc<TableMetadata>
                 └─miss (sid above cutoff)─► redb `commits` live tail ──► loc ──► …

  resolve_many(id, [sid; N])
        ──► Ragnar batch probe (1 pipelined pass) ──► redb `commits` (1 txn, misses only) ──► L0 per loc

WRITES  (create · register · update · drop · rename)
  …──► FileIO write …/<uuid>.metadata.json ──► redb WriteTransaction { tables CAS · commits log · meta++ }
        ──commit──► L1 mirror write-through (insert / remove) ──► maybe rebuild Ragnar (background)

  update_table ──► group-commit: N concurrent commits coalesce into 1 redb txn / 1 fsync   ← commit-burst lever

REDB-DIRECT  (no cache layer)
  list_namespaces · get_namespace · namespace_exists · list_tables ──► redb read txn (range scan)
  create / update / drop_namespace ──────────────────────────────────► redb WriteTransaction
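The time-travel path above (static index first, commit log on a miss, batch lookups that visit the log only for the misses) can be sketched as follows. This is an illustration under stated assumptions, not the Ragnar STree64 implementation: `StaticIndex` and `Snapshots` are hypothetical names, a sorted `Vec` with binary search stands in for the static search tree, a `BTreeMap` stands in for the redb `commits` table, and the rebuild runs inline rather than in a background compactor. The rebuild threshold of 1024 commits is the only value taken from the layer table.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the static snapshot index: a sorted array probed
/// by binary search. The real STree64 layout is not reproduced here.
struct StaticIndex {
    entries: Vec<(i64, String)>, // (snapshot_id, metadata_location), sorted by id
}

impl StaticIndex {
    fn build(mut entries: Vec<(i64, String)>) -> Self {
        entries.sort_by_key(|(sid, _)| *sid);
        StaticIndex { entries }
    }

    fn get(&self, sid: i64) -> Option<&str> {
        self.entries
            .binary_search_by_key(&sid, |(id, _)| *id)
            .ok()
            .map(|i| self.entries[i].1.as_str())
    }
}

const REBUILD_THRESHOLD: usize = 1024;

/// Snapshot resolver: immutable index in front, full commit log behind it.
struct Snapshots {
    index: StaticIndex,
    log: BTreeMap<i64, String>, // stand-in for the durable commit log
    unindexed: usize,           // commits since the last rebuild
}

impl Snapshots {
    fn new() -> Self {
        Snapshots { index: StaticIndex::build(Vec::new()), log: BTreeMap::new(), unindexed: 0 }
    }

    fn commit(&mut self, sid: i64, location: &str) {
        self.log.insert(sid, location.to_string());
        self.unindexed += 1;
        if self.unindexed >= REBUILD_THRESHOLD {
            self.rebuild(); // inline here; a background task in a real system
        }
    }

    /// Regenerate the static index from the log and swap it in.
    fn rebuild(&mut self) {
        self.index = StaticIndex::build(self.log.iter().map(|(k, v)| (*k, v.clone())).collect());
        self.unindexed = 0;
    }

    /// Single lookup: index first, log only on a miss.
    fn resolve(&self, sid: i64) -> Option<&str> {
        self.index.get(sid).or_else(|| self.log.get(&sid).map(String::as_str))
    }

    /// Batch lookup: probe the index for every id, then visit the log for misses only.
    fn resolve_many(&self, sids: &[i64]) -> Vec<Option<&str>> {
        let mut out: Vec<Option<&str>> = sids.iter().map(|&s| self.index.get(s)).collect();
        for (slot, &sid) in out.iter_mut().zip(sids) {
            if slot.is_none() {
                *slot = self.log.get(&sid).map(String::as_str);
            }
        }
        out
    }
}

fn main() {
    let mut s = Snapshots::new();
    s.commit(7_001, "meta/a.metadata.json");
    s.commit(-42, "meta/b.metadata.json");
    s.rebuild();
    s.commit(9, "meta/c.metadata.json"); // committed after the rebuild, so not indexed yet
    println!("{:?}", s.resolve_many(&[9, -42, 123]));
}
```

Because the index is rebuilt as a whole and never edited in place, readers can keep probing the old copy until the new one is swapped in, which is the property that makes a lock-free front possible.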

Benchmarks

All numbers below are auto-filled by `nornir docs render` from the latest `nornir bench run` (machine/cores/version in each header). Don't hand-edit inside the generated regions; re-run the bench instead. To reproduce, use the harness in `bench/`: run `cargo run --bin bench-containers -- up` in `bench/`, then `nornir bench run skade-katalog` from `workspace_skade/`.

Catalog read RPC — `table_exists` (storage-free, runs on every backend)

table_exists is a pure catalog RPC (no object storage), which makes it the apples-to-apples comparison across backends; Nessie and Polaris are called through the same iceberg::Catalog REST client.