timeseries-table-core
Core engine for a log-structured time-series table format. It owns table metadata,
append rules, coverage math, storage IO, and the TimeSeriesTable API that
higher-level integrations build on.
This crate does not implement a query engine. It exposes metadata and scan streams that other layers (DataFusion, Polars, custom code) can consume.
Layers (module layout)
- metadata: pure metadata model + validation (logical schema, table metadata, segment types). No IO.
- transaction_log: append-only metadata log APIs (OCC) + table state materialization.
- table: user-facing TimeSeriesTable API (create/open/append/scan).
- storage: local backend + table-root IO helpers.
- coverage: coverage math and gap analysis.
- formats: format-specific helpers (currently formats::parquet).
During the refactor, older module paths remain available as compatibility
re-exports (for example, transaction_log, time_series_table, helpers).
Responsibilities
- Transaction log + metadata: versioned commits, optimistic concurrency, table schema.
- Segment metadata: min/max timestamps, row counts, file format, coverage sidecars.
- Coverage math: RoaringBitmap overlap checks and gap analysis in bucket space.
- Storage access: local filesystem backend and atomic IO helpers.
- User API: create/open/append/scan plus coverage/gap queries.
On-disk layout (local backend)
<table_root>/
  _timeseries_log/
    CURRENT                 # latest committed version
    0000000001.json         # commit files (LogAction list)
    ...
  _coverage/
    segments/<id>.roar      # per-segment RoaringBitmap coverage
    table/<ver>-<id>.roar   # table snapshot coverage
  data/...                  # Parquet segments (convention)
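The layout above can be turned into path helpers. The following is a minimal sketch, not the crate's storage API: the function names are hypothetical, and the 10-digit zero padding is inferred only from the 0000000001.json example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: path of the commit file for `version`.
/// Assumes commit files are named with a 10-digit, zero-padded version,
/// as suggested by the `0000000001.json` example.
fn commit_path(table_root: &Path, version: u64) -> PathBuf {
    table_root
        .join("_timeseries_log")
        .join(format!("{version:010}.json"))
}

/// Hypothetical helper: path of the per-segment coverage sidecar.
fn segment_coverage_path(table_root: &Path, segment_id: &str) -> PathBuf {
    table_root
        .join("_coverage")
        .join("segments")
        .join(format!("{segment_id}.roar"))
}
```

Keeping path construction in one place means the layout can change without touching callers.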
Transaction log and OCC
- Commits are append-only JSON files under _timeseries_log/.
- CURRENT stores the latest committed version.
- Commit flow is optimistic:
  - Read current version N.
  - Build a commit with expected_version = N.
  - Write version N+1 only if CURRENT is still N.
- On conflict, the caller reloads and retries.
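The optimistic commit flow can be illustrated with an in-memory stand-in for the log directory. This is a sketch under assumptions: InMemoryLog, CommitError, and the choice of 0 as the version of an empty log are hypothetical and are not the crate's types. Actions are plain strings here to keep the example short.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type for a lost race; not the crate's error enum.
#[derive(Debug, PartialEq)]
enum CommitError {
    Conflict { expected: u64, found: u64 },
}

/// In-memory stand-in for `_timeseries_log/`: `current` plays the role of
/// the CURRENT file and `commits` the role of the numbered JSON files.
#[derive(Default)]
struct InMemoryLog {
    current: u64,
    commits: BTreeMap<u64, Vec<String>>,
}

impl InMemoryLog {
    /// Writes version `expected_version + 1` only if `current` still equals
    /// `expected_version`; otherwise reports a conflict and writes nothing.
    fn commit(
        &mut self,
        expected_version: u64,
        actions: Vec<String>,
    ) -> Result<u64, CommitError> {
        if self.current != expected_version {
            return Err(CommitError::Conflict {
                expected: expected_version,
                found: self.current,
            });
        }
        let next = expected_version + 1;
        self.commits.insert(next, actions);
        self.current = next;
        Ok(next)
    }
}
```

A caller that receives Conflict would re-read the current version, rebuild its commit against the new state, and try again.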
Log actions:
- AddSegment: adds a new segment descriptor (and coverage_path if enabled).
- UpdateTableMeta: updates table-level metadata (schema adoption, entity pinning).
- UpdateTableCoverage: points to the latest table coverage snapshot.
Table lifecycle
Create
TimeSeriesTable::create validates TableMeta, writes the initial
UpdateTableMeta commit, then rebuilds TableState.
Open
TimeSeriesTable::open rebuilds TableState from the log and extracts
TimeIndexSpec. Empty logs return TableError::EmptyTable.
Append (Parquet)
- Read Parquet bytes from storage.
- Extract metadata and derive a LogicalSchema.
- If this is the first segment, adopt its schema; otherwise enforce exact match.
- Validate entity identity (if TimeIndexSpec.entity_columns is set).
- Compute coverage, reject overlaps, and write coverage sidecars.
- Commit AddSegment + optional UpdateTableMeta + UpdateTableCoverage.
Coverage and gaps
- Bucket IDs: timestamps are mapped to discrete bucket IDs using TimeBucket.
- Overlap checks: a new segment is rejected if any bucket ID already exists.
- Snapshots: table coverage snapshots are the union of all segment coverages.
- Recovery: if the snapshot sidecar is missing/corrupt, it is rebuilt from segment coverage sidecars when possible.
- Read-side metrics: coverage ratio, missing runs, max gap length, and "last fully covered window" are computed against the current snapshot.
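The bucket mapping, overlap check, and read-side metrics can be sketched with the standard library alone. This example swaps in BTreeSet<u64> for RoaringBitmap and assumes fixed-width buckets over non-negative second timestamps with a non-zero width. All function names are hypothetical, and the crate's TimeBucket may map timestamps differently.

```rust
use std::collections::BTreeSet;

/// Assumed mapping: bucket id = ts / width (width must be non-zero).
fn bucket_id(ts_secs: u64, bucket_width_secs: u64) -> u64 {
    ts_secs / bucket_width_secs
}

/// Set of bucket ids touched by a segment's timestamps.
fn segment_coverage(timestamps: &[u64], bucket_width_secs: u64) -> BTreeSet<u64> {
    timestamps
        .iter()
        .map(|&ts| bucket_id(ts, bucket_width_secs))
        .collect()
}

/// A segment overlaps if it shares at least one bucket with the table.
fn overlaps(table: &BTreeSet<u64>, segment: &BTreeSet<u64>) -> bool {
    !table.is_disjoint(segment)
}

/// Uncovered runs inside [start, end), as half-open (run_start, run_end) pairs.
fn missing_runs(coverage: &BTreeSet<u64>, start: u64, end: u64) -> Vec<(u64, u64)> {
    let mut runs = Vec::new();
    if end <= start {
        return runs;
    }
    let mut cursor = start;
    for &bucket in coverage.range(start..end) {
        if bucket > cursor {
            runs.push((cursor, bucket));
        }
        cursor = bucket + 1;
    }
    if cursor < end {
        runs.push((cursor, end));
    }
    runs
}

/// Fraction of buckets in [start, end) that are covered.
fn coverage_ratio(coverage: &BTreeSet<u64>, start: u64, end: u64) -> f64 {
    if end <= start {
        return 0.0;
    }
    coverage.range(start..end).count() as f64 / (end - start) as f64
}

/// Length, in buckets, of the longest uncovered run in [start, end).
fn max_gap_len(coverage: &BTreeSet<u64>, start: u64, end: u64) -> u64 {
    missing_runs(coverage, start, end)
        .iter()
        .map(|(s, e)| e - s)
        .max()
        .unwrap_or(0)
}
```

With 60-second buckets, timestamps 0, 30, 60, and 200 cover buckets 0, 1, and 3, so a segment containing timestamp 180 (bucket 3) would be rejected while one containing 120 (bucket 2) would be accepted.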
Coverage sidecar lifecycle (v0.1)
- Write per-segment coverage to _coverage/segments/<id>.roar.
- Union with the current snapshot to build the next snapshot.
- Write snapshot to _coverage/table/<ver>-<id>.roar.
- Commit both references in a single log entry.
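The lifecycle steps above can be sketched as one planning function that produces the next snapshot and the actions for a single commit. The LogAction variants below are simplified stand-ins whose field names are assumptions, BTreeSet<u64> again replaces RoaringBitmap, and rendering <ver> as a plain decimal number is a guess. No files are written in this sketch.

```rust
use std::collections::BTreeSet;

/// Simplified stand-ins for the log actions; field names are hypothetical.
#[derive(Debug, PartialEq)]
enum LogAction {
    AddSegment { segment_id: String, coverage_path: String },
    UpdateTableCoverage { coverage_path: String },
}

/// Builds the next table snapshot (union of current snapshot and segment
/// coverage) plus the two actions that go into one log entry.
fn plan_coverage_commit(
    current_snapshot: &BTreeSet<u64>,
    segment_id: &str,
    segment_coverage: &BTreeSet<u64>,
    next_version: u64,
    snapshot_id: &str,
) -> (BTreeSet<u64>, Vec<LogAction>) {
    let next_snapshot: BTreeSet<u64> = current_snapshot
        .union(segment_coverage)
        .copied()
        .collect();
    let actions = vec![
        LogAction::AddSegment {
            segment_id: segment_id.to_string(),
            coverage_path: format!("_coverage/segments/{segment_id}.roar"),
        },
        LogAction::UpdateTableCoverage {
            // Assumption: <ver> is written without padding.
            coverage_path: format!("_coverage/table/{next_version}-{snapshot_id}.roar"),
        },
    ];
    (next_snapshot, actions)
}
```

Putting both references in one commit means readers never observe a segment without a matching table snapshot.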
Range scans
- Select segments whose [ts_min, ts_max] intersects [ts_start, ts_end).
- Read each segment, build a Parquet RecordBatch reader, and filter by time.
- Stream filtered batches in chronological order as TimeSeriesScan.
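Segment selection comes down to intersecting a closed interval with a half-open one. The sketch below shows that test and the per-row filter. SegmentRange, InvalidRange, and the rule that ts_start >= ts_end is invalid are assumptions for illustration, not the crate's definitions, and timestamps are plain i64 values.

```rust
/// Hypothetical view of a segment's time bounds (both ends inclusive).
#[derive(Debug, Clone, PartialEq, Eq)]
struct SegmentRange {
    id: String,
    ts_min: i64,
    ts_max: i64,
}

/// Stand-in for TableError::InvalidRange.
#[derive(Debug, PartialEq, Eq)]
struct InvalidRange;

/// Returns segments whose [ts_min, ts_max] intersects [ts_start, ts_end),
/// ordered by ts_min so batches can be streamed chronologically.
fn select_segments(
    segments: &[SegmentRange],
    ts_start: i64,
    ts_end: i64,
) -> Result<Vec<SegmentRange>, InvalidRange> {
    // Assumption: an empty or reversed query range is rejected.
    if ts_start >= ts_end {
        return Err(InvalidRange);
    }
    let mut hits: Vec<SegmentRange> = segments
        .iter()
        .filter(|s| s.ts_min < ts_end && s.ts_max >= ts_start)
        .cloned()
        .collect();
    hits.sort_by_key(|s| s.ts_min);
    Ok(hits)
}

/// Row-level filter applied after a segment is read: start inclusive,
/// end exclusive.
fn row_in_range(ts: i64, ts_start: i64, ts_end: i64) -> bool {
    ts_start <= ts && ts < ts_end
}
```

Because segments are non-overlapping in bucket space, sorting by ts_min is enough here to approximate chronological order across segments.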
Schema rules (v0.1)
- No schema evolution: all segments must match the canonical schema exactly.
- Time column must exist and have a supported timestamp type.
- If entity columns are configured, each segment must be single-entity and match the pinned table identity.
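The adopt-or-enforce rule can be sketched as a single function over a simplified schema. Column, SchemaCheckError, and adopt_or_enforce are hypothetical names, and "exact match" is assumed here to mean the same columns, types, and order. The timestamp type check and the entity check are omitted because the supported types and identity rules are not spelled out in this document.

```rust
/// Simplified column descriptor; the crate's LogicalSchema is richer.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Column {
    name: String,
    data_type: String,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum SchemaCheckError {
    MissingTimeColumn,
    Mismatch,
}

/// Returns the canonical schema after the segment is accepted:
/// the first segment's schema is adopted, later segments must equal it.
fn adopt_or_enforce(
    canonical: Option<Vec<Column>>,
    segment: Vec<Column>,
    time_column: &str,
) -> Result<Vec<Column>, SchemaCheckError> {
    if !segment.iter().any(|c| c.name == time_column) {
        return Err(SchemaCheckError::MissingTimeColumn);
    }
    match canonical {
        None => Ok(segment),
        Some(existing) if existing == segment => Ok(existing),
        Some(_) => Err(SchemaCheckError::Mismatch),
    }
}
```

Running this check before any sidecar or log write matches the rule that schema mismatches fail before a commit is attempted.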
Error behavior (high level)
- Missing coverage snapshot when segments exist yields a clear error.
- Overlaps surface a dedicated overlap error with context.
- Invalid time ranges return TableError::InvalidRange.
- Schema mismatches fail before any commit is attempted.
Extension points
- Storage: extend TableLocation and storage for new backends.
- Segment formats: extend FileFormat and SegmentMeta::new_validated.
- Schema evolution: rules live in helpers::schema.