# tsink
tsink is a lightweight, in-process time-series database engine for Rust applications.
It stores time-series data in compressed chunks, persists immutable segment files to disk, and can replay a write-ahead log (WAL) after crashes — all without requiring an external server.
## Features
- Embedded API — no external server, network protocol, or daemon required.
- Thread-safe — the storage handle is an `Arc<dyn Storage>`, safe to share across threads.
- Multi-series model — series identity is metric name + exact label set.
- Typed values — `f64`, `i64`, `u64`, `bool`, `bytes`, and `string`.
- Rich queries — downsampling, aggregation (12 built-in functions), pagination, and custom bytes aggregation via the `Codec`/`Aggregator` traits.
- Disk persistence — immutable segment files with a crash-safe commit protocol.
- WAL durability — selectable sync mode (`Periodic` or `PerAppend`) with idempotent replay on recovery.
- Out-of-order writes — data is returned sorted by timestamp regardless of insertion order.
- Concurrent writers — multiple threads can insert simultaneously with sharded internal locking.
- Optional PromQL engine — instant and range queries with 20+ built-in functions; enable with the `promql` Cargo feature.
- LSM-style compaction — tiered L0 → L1 → L2 segment compaction reduces read amplification.
- Gorilla compression — Gorilla XOR encoding for floats, delta-of-delta for timestamps, and per-type codecs for other value types.
- cgroup-aware defaults — worker thread counts and memory limits respect container CPU/memory quotas.
- Resource limits — configurable memory budget, series cardinality cap, and WAL size limit with admission backpressure.
## Table of Contents
- Installation
- Quick Start
- Async Usage
- Server Mode
- Query APIs
- Series Discovery
- Value Model
- Label Constraints
- PromQL Engine
- Persistence and WAL
- On-Disk Layout
- Compression and Encoding
- Performance
- Architecture
- StorageBuilder Options
- Resource Limits and Backpressure
- Container Support
- Error Handling
- Advanced Usage
- Examples
- Benchmarks and Tests
- Development Scripts
- Project Structure
- Contributing
- Minimum Supported Rust Version
- License
## Installation

```toml
[dependencies]
tsink = "0.8.0"
```
Enable PromQL support:
```toml
[dependencies]
tsink = { version = "0.8.0", features = ["promql"] }
```
Enable async storage facade (dedicated worker threads, runtime-agnostic futures):
```toml
[dependencies]
tsink = { version = "0.8.0", features = ["async-storage"] }
```
## Quick Start

```rust
use tsink::{DataPoint, Row, StorageBuilder};

fn main() -> Result<(), tsink::Error> {
    // In-memory storage; add .with_data_path(...) for persistence.
    let storage = StorageBuilder::new().build()?;

    // Insert a point, then read it back over a half-open range [start, end).
    storage.insert_rows(&[Row::new("metric1", DataPoint::new(1_600_000_000, 0.1))])?;
    let points = storage.select("metric1", &[], 1_600_000_000, 1_600_000_001)?;
    for p in &points {
        println!("{} -> {:?}", p.timestamp, p.value);
    }

    storage.close()?;
    Ok(())
}
```
## Async Usage

The `async-storage` feature exposes `AsyncStorage` and `AsyncStorageBuilder`.
The async API routes requests through bounded queues to dedicated worker threads, while reusing the existing synchronous engine implementation. It is runtime-agnostic — no dependency on tokio, async-std, or any specific executor.
```rust
use tsink::{AsyncStorageBuilder, DataPoint, Row};

async fn run() -> Result<(), tsink::Error> {
    // The builder mirrors StorageBuilder; requests are queued to worker threads.
    let storage = AsyncStorageBuilder::new().build()?;
    storage
        .insert_rows(vec![Row::new("metric1", DataPoint::new(1_600_000_000, 0.1))])
        .await?;
    let points = storage.select("metric1", &[], 0, i64::MAX).await?;
    println!("{} points", points.len());
    Ok(())
}
```
AsyncStorage also provides synchronous accessors for introspection:
| Method | Description |
|---|---|
| `memory_used()` | Current in-memory usage in bytes. |
| `memory_budget()` | Configured memory budget. |
| `inner()` | Access the underlying `Arc<dyn Storage>`. |
| `into_inner(self)` | Unwrap the underlying storage handle. |
## Server Mode (Prometheus Wire Compatible)
Experimental: tsink-server is still experimental and under development.
This workspace includes a binary crate at crates/tsink-server that runs tsink as an async network service (tokio-based) with Prometheus remote storage wire format, PromQL HTTP API, TLS, and Bearer token authentication.
Run the server:

```sh
cargo run --release -p tsink-server -- --listen 127.0.0.1:9201
```
### CLI Options

| Flag | Default | Description |
|---|---|---|
| `--listen <ADDR>` | `127.0.0.1:9201` | Bind address. |
| `--data-path <PATH>` | None (in-memory) | Persist tsink data under PATH. |
| `--wal-enabled <BOOL>` | `true` | Enable WAL. |
| `--no-wal` | — | Disable WAL (shorthand). |
| `--timestamp-precision <s\|ms\|us\|ns>` | `ms` | Timestamp precision (server defaults to milliseconds). |
| `--retention <DURATION>` | `14d` | Data retention period (e.g. `14d`, `720h`). |
| `--memory-limit <BYTES>` | Unlimited | Memory budget (e.g. `1G`, `1073741824`). |
| `--cardinality-limit <N>` | Unlimited | Max unique series. |
| `--chunk-points <N>` | `2048` | Target points per chunk. |
| `--max-writers <N>` | Available CPUs | Concurrent writer threads. |
| `--wal-sync-mode <MODE>` | `periodic` | WAL fsync policy (`per-append` or `periodic`). |
| `--tls-cert <PATH>` | — | TLS certificate file (PEM). Requires `--tls-key`. |
| `--tls-key <PATH>` | — | TLS private key file (PEM). Requires `--tls-cert`. |
| `--auth-token <TOKEN>` | — | Require Bearer token on all endpoints except health probes. |
### Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/healthz` | Health check (returns `ok`). |
| GET | `/ready` | Readiness probe (returns `ready`). |
| GET | `/metrics` | Self-monitoring metrics (Prometheus exposition format). |
| GET/POST | `/api/v1/query` | PromQL instant query. |
| GET/POST | `/api/v1/query_range` | PromQL range query. |
| GET | `/api/v1/series` | Series metadata (accepts `match[]` selectors). |
| GET | `/api/v1/labels` | All label names. |
| GET | `/api/v1/label/<name>/values` | Values for a given label. |
| POST | `/api/v1/write` | Prometheus remote write (protobuf + snappy). |
| POST | `/api/v1/read` | Prometheus remote read (protobuf + snappy). |
| POST | `/api/v1/import/prometheus` | Prometheus text exposition format ingestion. |
| GET | `/api/v1/status/tsdb` | TSDB stats (JSON). |
| POST | `/api/v1/admin/delete_series` | Delete series (stub, returns 501). |
### TLS

Provide both `--tls-cert` and `--tls-key` to enable TLS:

```sh
cargo run --release -p tsink-server -- \
  --tls-cert server.pem --tls-key server-key.pem
```
### Authentication
When --auth-token is set, all requests except GET /healthz and GET /ready must include the header Authorization: Bearer <TOKEN>. Unauthenticated requests receive a 401 Unauthorized response.
### Graceful Shutdown
The server handles SIGTERM and SIGINT signals. On receipt it stops accepting new connections, waits up to 10 seconds for in-flight requests to complete, then closes storage cleanly.
### PromQL HTTP API

The query endpoints follow the Prometheus HTTP API response format:

```sh
# Instant query
curl 'http://127.0.0.1:9201/api/v1/query?query=up'

# Range query
curl 'http://127.0.0.1:9201/api/v1/query_range?query=up&start=1700000000&end=1700003600&step=60'
```
### Prometheus Integration

```yaml
remote_write:
  - url: http://127.0.0.1:9201/api/v1/write
remote_read:
  - url: http://127.0.0.1:9201/api/v1/read
```
### Text Format Ingestion

Post Prometheus exposition format text directly:

```sh
curl -X POST --data-binary 'http_requests_total{method="get"} 1027' \
  http://127.0.0.1:9201/api/v1/import/prometheus
```
## Query APIs

| Method | Description |
|---|---|
| `select(metric, labels, start, end)` | Returns points sorted by timestamp for one series. |
| `select_into(metric, labels, start, end, &mut buf)` | Same as `select`, but writes into a caller-provided buffer for allocation reuse. |
| `select_all(metric, start, end)` | Returns grouped results for all label sets of a metric. |
| `select_with_options(metric, QueryOptions)` | Supports downsampling, aggregation, custom bytes aggregation, and pagination. |
| `list_metrics()` | Lists all known metric + label-set series. |
| `list_metrics_with_wal()` | Like `list_metrics`, but also includes series only present in the WAL. |
| `select_series(SeriesSelection)` | Matcher-based series discovery (`=`, `!=`, `=~`, `!~`) with optional time-window filtering. |
All time ranges are half-open: [start, end).
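The half-open convention means a point exactly at `end` is excluded, so adjacent windows tile a timeline without overlap or gaps. A self-contained sketch of the semantics (not tsink's internals):

```rust
/// Filter timestamps to the half-open window [start, end),
/// mirroring tsink's query-range convention.
fn in_window(timestamps: &[i64], start: i64, end: i64) -> Vec<i64> {
    timestamps
        .iter()
        .copied()
        .filter(|&ts| ts >= start && ts < end)
        .collect()
}

fn main() {
    let ts = [10, 19, 20, 21, 29, 30];
    // Adjacent windows [10, 20) and [20, 30) cover each point exactly once.
    assert_eq!(in_window(&ts, 10, 20), vec![10, 19]);
    assert_eq!(in_window(&ts, 20, 30), vec![20, 21, 29]);
    println!("ok");
}
```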
### Downsampling and Aggregation

```rust
use tsink::{QueryOptions, StorageBuilder};

let storage = StorageBuilder::new().build()?;
storage.insert_rows(&rows)?;

// Bucket the series and paginate the buckets. Argument shapes are
// illustrative here; see the QueryOptions docs for exact signatures.
let opts = QueryOptions::new(start, end)
    .with_downsample(/* bucket width, aggregation function */)
    .with_pagination(/* offset, limit */);

let buckets = storage.select_with_options("metric1", opts)?;
```
Built-in aggregation functions:
`None`, `Sum`, `Min`, `Max`, `Avg`, `First`, `Last`, `Count`, `Median`, `Range`, `Variance`, `StdDev`.
### Custom Bytes Aggregation

For non-numeric data, implement the `Codec` and `Aggregator` traits to define custom aggregation logic over bytes-encoded values:

```rust
use tsink::{Aggregator, Codec, QueryOptions};

// Decode bytes payloads into your own type...
struct MyCodec;
// ...and combine decoded values within each downsample bucket.
struct MyAggregator;

// impl Codec for MyCodec { /* ... */ }
// impl Aggregator for MyAggregator { /* ... */ }

let opts = QueryOptions::new(start, end)
    .with_custom_bytes_aggregation(MyCodec, MyAggregator);
```
## Series Discovery

Use `select_series` with matcher-based filtering to discover series dynamically:

```rust
use tsink::{SeriesMatcher, SeriesSelection};

// Metric and label values below are illustrative.
let selection = SeriesSelection::new()
    .with_metric("http_requests_total")
    .with_matcher(SeriesMatcher::equal("region", "us-east"))
    .with_matcher(SeriesMatcher::regex_match("host", "web-.*"))
    .with_time_range(start, end);

let series = storage.select_series(selection)?;
```
Supported matcher operators:
| Operator | Constructor | Description |
|---|---|---|
| `=` | `SeriesMatcher::equal(name, value)` | Exact label match. |
| `!=` | `SeriesMatcher::not_equal(name, value)` | Negated exact match. |
| `=~` | `SeriesMatcher::regex_match(name, pattern)` | Regex label match. |
| `!~` | `SeriesMatcher::regex_no_match(name, pattern)` | Negated regex match. |
## Value Model

`DataPoint` stores a `timestamp: i64` and a `value: Value`.

| Variant | Rust type |
|---|---|
| `Value::F64(f64)` | `f64` |
| `Value::I64(i64)` | `i64` |
| `Value::U64(u64)` | `u64` |
| `Value::Bool(bool)` | `bool` |
| `Value::Bytes(Vec<u8>)` | raw bytes |
| `Value::String(String)` | UTF-8 string |
Notes:
- A series (same metric + labels) must keep a consistent value type family.
- `bytes` and `string` data uses blob-lane encoding on disk.
- Convenience conversions are provided: `DataPoint::new(ts, 42.5)` auto-converts via `Into<Value>`.
- Accessor methods: `value.as_f64()`, `value.as_i64()`, `value.as_u64()`, `value.as_bool()`, `value.as_bytes()`, `value.as_str()`.
- `value.kind()` returns a `&'static str` tag: `"f64"`, `"i64"`, `"u64"`, `"bool"`, `"bytes"`, or `"string"`.
- `Value::F64(NAN)` compares equal to itself, unlike standard `f64`, for consistent equality semantics in collections.
Automatic `From` conversions are provided for: `f64`, `i64`, `i32`, `u64`, `u32`, `usize`, `bool`, `Vec<u8>`, `&[u8]`, `String`, and `&str`.
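The NaN-equality behavior can be achieved by comparing raw IEEE 754 bit patterns instead of using `f64`'s native comparison; a self-contained sketch of that approach (not tsink's actual implementation):

```rust
/// IEEE 754 `==` makes NaN unequal to itself; comparing the raw bit
/// patterns instead gives the reflexive equality that hash maps and
/// sorted collections need.
fn f64_bits_eq(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

fn main() {
    assert!(f64::NAN != f64::NAN); // standard IEEE comparison
    assert!(f64_bits_eq(f64::NAN, f64::NAN)); // bitwise comparison is reflexive
    assert!(f64_bits_eq(1.5, 1.5));
    // Caveat of the bitwise approach: 0.0 and -0.0 differ in their sign bit.
    assert!(!f64_bits_eq(0.0, -0.0));
    println!("ok");
}
```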
## Label Constraints
Labels are key-value pairs that identify a series alongside the metric name. Labels are automatically sorted for consistent series identity — insertion order does not matter.
| Constraint | Limit |
|---|---|
| Label name length | 256 bytes |
| Label value length | 16,384 bytes (16 KB) |
| Metric name length | 65,535 bytes |
Empty label names or values are rejected. Oversized values are truncated at the marshaling boundary.
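Sorting labels before building the series identity is what makes insertion order irrelevant; a minimal sketch of the idea (the key format is illustrative, not tsink's marshaling):

```rust
/// Build a canonical series key from metric name + labels by sorting
/// label pairs, so insertion order never changes series identity.
fn series_key(metric: &str, labels: &[(&str, &str)]) -> String {
    let mut sorted: Vec<(&str, &str)> = labels.to_vec();
    sorted.sort(); // sort by (name, value)
    let mut key = String::from(metric);
    for (name, value) in sorted {
        key.push('\u{0}'); // separator that cannot appear in valid labels
        key.push_str(name);
        key.push('=');
        key.push_str(value);
    }
    key
}

fn main() {
    let a = series_key("cpu_usage", &[("host", "h1"), ("region", "eu")]);
    let b = series_key("cpu_usage", &[("region", "eu"), ("host", "h1")]);
    assert_eq!(a, b); // same series regardless of label order
    println!("ok");
}
```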
## PromQL Engine

Enable with the `promql` feature. The engine supports instant and range queries over data stored in tsink.

```rust
use std::sync::Arc;
use tsink::StorageBuilder;
use tsink::promql::Engine; // module path shown is illustrative

let storage = StorageBuilder::new().build()?;
storage.insert_rows(&rows)?;

let engine = Engine::new(Arc::clone(&storage));

// Instant query — evaluates at a single point in time.
let result = engine.instant_query("rate(http_requests_total[5m])", eval_ts)?;

// Range query — evaluates at each step across a time window.
let result = engine.range_query("sum(rate(http_requests_total[5m]))", start, end, step)?;
```
Use Engine::with_precision(storage, precision) if your timestamps are not in nanoseconds.
Supported functions:
| Category | Functions |
|---|---|
| Rate/counter | rate, irate, increase |
| Over-time | avg_over_time, sum_over_time, min_over_time, max_over_time, count_over_time |
| Math | abs, ceil, floor, round, clamp, clamp_min, clamp_max |
| Type conversion | scalar, vector |
| Time | time, timestamp |
| Sorting | sort, sort_desc |
| Label manipulation | label_replace, label_join |
Aggregation operators: `sum`, `avg`, `min`, `max`, `count`, `topk`, `bottomk` — with `by`/`without` grouping.
Binary operators: `+`, `-`, `*`, `/`, `%`, `^`, `==`, `!=`, `<`, `>`, `<=`, `>=`, `and`, `or`, `unless` — with `on`/`ignoring` vector matching and the `bool` modifier.
## Persistence and WAL

Set `with_data_path(...)` to enable persistence:

```rust
use std::time::Duration;
use tsink::{StorageBuilder, WalSyncMode};

let storage = StorageBuilder::new()
    .with_data_path("./data")
    .with_chunk_points(2048)
    .with_wal_enabled(true)
    .with_wal_sync_mode(WalSyncMode::Periodic(Duration::from_secs(1)))
    .build()?;
```
Behavior:
- `close()` flushes active chunks and writes immutable segment files.
- With WAL enabled, reopening the same path replays durable WAL frames automatically.
- Recovery is idempotent — a high-water mark prevents double-apply of WAL frames.
### Sync Modes

| Mode | Trade-off |
|---|---|
| `WalSyncMode::Periodic(Duration)` | Lower fsync overhead; small recent-write loss window on crash. |
| `WalSyncMode::PerAppend` | Strongest durability for acknowledged writes; higher fsync cost. |
## On-Disk Layout

When persistence is enabled, tsink writes separate numeric/blob lane segment families:

```
<data_path>/
  lane_numeric/
    segments/
      L0/...
      L1/...
      L2/...
  lane_blob/
    segments/
      L0/...
      L1/...
      L2/...
  wal/
    wal-0000000000000000.log
    wal-0000000000000001.log
    ...
```

Each segment directory contains `manifest.bin`, `chunks.bin`, `chunk_index.bin`, `series.bin`, and `postings.bin`.
The storage format uses CRC32c and XXH64 checksums for corruption detection and a crash-safe commit protocol (write temps, fsync, rename atomically).
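The write-temps/fsync/rename sequence can be sketched with `std::fs` (a simplified illustration of the pattern, not tsink's segment writer; a full implementation would also fsync the parent directory after the rename):

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Crash-safe file commit: write to a temp name, fsync, then rename.
/// A crash before the rename leaves only the temp file; a crash after
/// leaves the complete new file. Readers never observe a partial write.
fn commit_atomically(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut f = File::create(&tmp)?;
    f.write_all(contents)?;
    f.sync_all()?; // fsync: make the bytes durable before publishing
    fs::rename(&tmp, path)?; // atomic publish on POSIX filesystems
    Ok(())
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("tsink_commit_demo");
    fs::create_dir_all(&dir)?;
    let target = dir.join("manifest.bin");
    commit_atomically(&target, b"segment metadata")?;
    assert_eq!(fs::read(&target)?, b"segment metadata");
    println!("ok");
    Ok(())
}
```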
### Compaction
tsink uses tiered LSM-style compaction across three levels:
| Level | Trigger | Description |
|---|---|---|
| L0 | Every flush | Newly flushed segments land here. |
| L1 | 4 L0 segments | L0 segments are merged and re-chunked into L1. |
| L2 | 4 L1 segments | L1 segments are merged into larger L2 segments. |
Compaction runs automatically in the background and is transparent to reads and writes.
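The tiered triggers in the table above can be simulated as a simple counter cascade (illustrative only; real compaction merges segment contents, not just counts):

```rust
/// Simulate tiered compaction counts: every 4 segments at a level
/// merge into 1 segment at the next level.
fn flush_n(levels: &mut [usize; 3], flushes: usize) {
    for _ in 0..flushes {
        levels[0] += 1; // each flush lands in L0
        if levels[0] == 4 {
            levels[0] = 0;
            levels[1] += 1; // 4 L0 segments merge into one L1 segment
        }
        if levels[1] == 4 {
            levels[1] = 0;
            levels[2] += 1; // 4 L1 segments merge into one L2 segment
        }
    }
}

fn main() {
    let mut levels = [0usize; 3];
    // 16 flushes -> four L0->L1 merges -> one L1->L2 merge.
    flush_n(&mut levels, 16);
    assert_eq!(levels, [0, 0, 1]);
    println!("ok");
}
```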
## Compression and Encoding
tsink uses two parallel encoding lanes based on value type:
### Numeric Lane

Timestamps and numeric values (`f64`, `i64`, `u64`, `bool`) are encoded with specialized codecs. The encoder tries all applicable candidates and picks the most compact.
Timestamp codecs:
| Codec | Strategy |
|---|---|
| Fixed-step RLE | Run-length encoding for fixed-interval timestamps. |
| Delta-of-delta bitpack | Delta-of-delta encoding with bit-packing (primary strategy). |
| Delta varint | Varint-encoded deltas for irregular intervals. |
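For fixed-interval series, delta-of-delta reduces every timestamp after the first two to zero, which is why it bit-packs so well; a self-contained sketch of the transform (not tsink's encoder):

```rust
/// Delta-of-delta: store the first timestamp, the first delta, then
/// only the change in delta for each subsequent point.
fn delta_of_delta(timestamps: &[i64]) -> Vec<i64> {
    let deltas: Vec<i64> = timestamps.windows(2).map(|w| w[1] - w[0]).collect();
    deltas.windows(2).map(|w| w[1] - w[0]).collect()
}

fn main() {
    // Fixed 10s scrape interval: every delta-of-delta is 0, so each
    // point costs almost nothing after bit-packing.
    assert_eq!(delta_of_delta(&[100, 110, 120, 130, 140]), vec![0, 0, 0]);
    // One late sample produces small nonzero values around the jitter.
    assert_eq!(delta_of_delta(&[100, 110, 121, 130]), vec![1, -2]);
    println!("ok");
}
```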
Value codecs:
| Codec | Type | Strategy |
|---|---|---|
| Gorilla XOR | `f64` | Gorilla-style XOR of IEEE 754 floats. |
| Zigzag delta bitpack | `i64` | Zigzag encoding + delta + bit-packing. |
| Delta bitpack | `u64` | Delta encoding + bit-packing. |
| Constant RLE | any numeric | Run-length encoding for constant values. |
| Bool bitpack | `bool` | Bit-level packing (1 bit per value). |
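Two of the codecs above are easy to illustrate directly on bit patterns; a sketch of the underlying transforms (not tsink's encoder):

```rust
/// Gorilla-style XOR: identical consecutive floats XOR to 0, and
/// similar floats share sign/exponent bits, leaving long runs of
/// leading zeros that the encoder elides.
fn xor_prev(prev: f64, curr: f64) -> u64 {
    prev.to_bits() ^ curr.to_bits()
}

/// Zigzag: map signed integers to unsigned so small magnitudes
/// (positive or negative) become small, bit-packable numbers.
fn zigzag(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64
}

fn main() {
    // Repeated values compress to nothing.
    assert_eq!(xor_prev(42.5, 42.5), 0);
    // Similar values leave many leading zero bits.
    assert!(xor_prev(42.5, 42.6).leading_zeros() >= 10);
    // Zigzag keeps small magnitudes small: 0, -1, 1, -2 -> 0, 1, 2, 3.
    assert_eq!(zigzag(0), 0);
    assert_eq!(zigzag(-1), 1);
    assert_eq!(zigzag(1), 2);
    assert_eq!(zigzag(-2), 3);
    println!("ok");
}
```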
### Blob Lane

`bytes` and `string` values are encoded with delta block compression in a separate blob lane.
## Performance

### Compression
The adaptive codec selection (Gorilla XOR, delta-of-delta, RLE, bitpacking) achieves ~0.68 bytes per data point for typical f64 time-series workloads — down from 16 bytes uncompressed (8-byte timestamp + 8-byte value), a ~23x compression ratio.
### Throughput
Insert throughput (single-series, in-memory):
| Batch size | Latency | Throughput |
|---|---|---|
| 1 | ~1.7 us | ~577K points/sec |
| 10 | ~5.3 us | ~1.89M points/sec |
| 1,000 | ~155 us | ~6.4M points/sec |
Select throughput (single-series, in-memory):
| Result size | Latency | Throughput |
|---|---|---|
| 1 point | ~114 ns | ~8.8M queries/sec |
| 10 points | ~296 ns | ~33.6M points/sec |
| 1,000 points | ~15.4 us | ~64M points/sec |
| 1,000,000 points | ~20.9 ms | ~48M points/sec |
Numbers above are ballpark figures from a single run (`--quick` mode). Run benchmarks on your hardware:

```sh
./scripts/measure_perf.sh quick   # or: full
```
## Architecture

```
┌─────────────────────────────────────────────────────┐
│                     Public API                      │
│  StorageBuilder / Storage / AsyncStorage / PromQL   │
├────────────┬──────────────┬─────────────────────────┤
│  Writers   │   Readers    │        Compactor        │
│ (N threads)│ (concurrent) │   (background merges)   │
├────────────┴──────────────┴─────────────────────────┤
│           Engine (partitioned by time)              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐           │
│  │  Active  │  │ Immutable│  │ Segments │           │
│  │  Chunks  │→ │  Chunks  │→ │ (L0/L1/  │           │
│  │ (memory) │  │ (memory) │  │  L2 disk)│           │
│  └──────────┘  └──────────┘  └──────────┘           │
├─────────────────────────────────────────────────────┤
│  WAL (write-ahead log)  │  Series Registry + Index  │
└─────────────────────────┴───────────────────────────┘
```
Key internals:
- Time partitions split data by wall-clock intervals (default: 1 hour).
- Chunks group data points (default: 2048 per chunk) with delta-of-delta timestamp encoding and per-lane value encoding (numeric vs. blob).
- Series registry maps metric name + label set → series ID, with inverted postings for label-based lookups.
- Segment files are immutable, CRC32c + XXH64 checksummed, and consist of: `manifest.bin`, `chunks.bin`, `chunk_index.bin`, `series.bin`, `postings.bin`.
- Sharded locking (64 internal shards) reduces write contention under high concurrency.
- Background flush periodically converts active chunks into immutable chunks (default: every 250ms).
- Background compaction merges segments across levels (default: every 5s).
## StorageBuilder Options

| Method | Default | Description |
|---|---|---|
| `with_data_path(path)` | None (in-memory only) | Directory for segment files and WAL. |
| `with_chunk_points(n)` | `2048` | Target data points per chunk before flushing. |
| `with_wal_enabled(bool)` | `true` | Enable/disable write-ahead logging. |
| `with_wal_sync_mode(mode)` | `Periodic(1s)` | WAL fsync policy. |
| `with_wal_size_limit(bytes)` | Unlimited | Hard cap on total WAL bytes across all WAL segments. |
| `with_wal_buffer_size(n)` | `4096` | WAL buffer size in bytes. |
| `with_retention(duration)` | 14 days | Data retention window. |
| `with_retention_enforced(bool)` | `true` | Enforce retention window (`false` keeps data forever). |
| `with_timestamp_precision(p)` | `Nanoseconds` | Timestamp unit (`Seconds`, `Milliseconds`, `Microseconds`, `Nanoseconds`). |
| `with_max_writers(n)` | Available CPUs (cgroup-aware) | Maximum concurrent writer threads. |
| `with_write_timeout(duration)` | 30s | Timeout for write operations. |
| `with_partition_duration(duration)` | 1 hour | Time partition granularity. |
| `with_memory_limit(bytes)` | Unlimited | Hard in-memory budget with admission backpressure before writes. |
| `with_cardinality_limit(series)` | Unlimited | Hard cap on total metric+label series cardinality. |
## Resource Limits and Backpressure
tsink provides three configurable resource limits that protect against unbounded growth:
### Memory Limit

```rust
let storage = StorageBuilder::new()
    .with_memory_limit(512 * 1024 * 1024) // 512 MB
    .build()?;
```
When the memory budget is reached, new writes block until a background flush frees memory. This provides admission backpressure rather than OOM crashes.
### Cardinality Limit

```rust
let storage = StorageBuilder::new()
    .with_cardinality_limit(1_000_000) // illustrative cap
    .build()?;
```
Caps the total number of unique metric + label-set combinations. Writes that would create new series beyond this limit return TsinkError::CardinalityLimitExceeded.
### WAL Size Limit

```rust
let storage = StorageBuilder::new()
    .with_wal_size_limit(1024 * 1024 * 1024) // 1 GB
    .build()?;
```
Caps the total WAL bytes on disk. Writes that would exceed this limit return TsinkError::WalSizeLimitExceeded.
## Container Support
tsink automatically detects cgroup v1/v2 CPU and memory quotas when running inside containers (Docker, Kubernetes, etc.). This affects:
- Writer thread count — defaults to available CPUs within the cgroup quota, not the host CPU count.
- Rayon thread pool — sized to respect container limits.
Override with the `TSINK_MAX_CPUS` environment variable:

```sh
TSINK_MAX_CPUS=4
```
## Error Handling
All fallible operations return tsink::Result<T>, which wraps TsinkError. Key error variants:
| Error | Cause |
|---|---|
| `InvalidTimeRange` | `start >= end` in a query. |
| `WriteTimeout` | Writer could not acquire a slot within the configured timeout. |
| `MemoryBudgetExceeded` | Write blocked and memory budget was not freed in time. |
| `CardinalityLimitExceeded` | Too many unique series. |
| `WalSizeLimitExceeded` | WAL disk usage reached the configured cap. |
| `ValueTypeMismatch` | Inserting a different value type into an existing series. |
| `OutOfRetention` | Data point timestamp is outside the retention window. |
| `DataCorruption` | Checksum mismatch during segment read. |
| `StorageClosed` | Operation attempted after `close()` was called. |
## Advanced Usage

### Concurrent Operations

The storage handle is `Arc`-based and safe to share across threads:

```rust
use std::sync::Arc;
use std::thread;
use tsink::{DataPoint, Row, StorageBuilder};

let storage = StorageBuilder::new()
    .with_max_writers(8)
    .build()?;

let mut handles = vec![];
for worker_id in 0..8 {
    let storage = Arc::clone(&storage);
    handles.push(thread::spawn(move || {
        // Each worker writes to its own series concurrently.
        let metric = format!("metric_{worker_id}");
        storage
            .insert_rows(&[Row::new(&metric, DataPoint::new(1_600_000_000, 1.0))])
            .unwrap();
    }));
}
for handle in handles {
    handle.join().unwrap();
}
```
### Out-of-Order Writes

tsink accepts data points in any order and returns them sorted by timestamp on read:

```rust
use tsink::{DataPoint, Row, StorageBuilder};

let storage = StorageBuilder::new().build()?;

// Insert timestamps out of order: 3, 1, 5, 2.
storage.insert_rows(&[
    Row::new("metric1", DataPoint::new(3, 3.0)),
    Row::new("metric1", DataPoint::new(1, 1.0)),
    Row::new("metric1", DataPoint::new(5, 5.0)),
    Row::new("metric1", DataPoint::new(2, 2.0)),
])?;

let points = storage.select("metric1", &[], 0, 10)?;
// Points are returned in chronological order: 1.0, 2.0, 3.0, 5.0.
assert!(points.windows(2).all(|w| w[0].timestamp <= w[1].timestamp));
```
### WAL Recovery

After a crash, tsink automatically replays the WAL on the next open:

```rust
use tsink::{DataPoint, Row, StorageBuilder};

// First run — data is written and WAL-protected.
let storage = StorageBuilder::new()
    .with_data_path("./data")
    .build()?;
storage.insert_rows(&[Row::new("metric1", DataPoint::new(1_600_000_000, 0.1))])?;
// Crash happens here — close() was never called.

// Next run — recovery is automatic.
let storage = StorageBuilder::new()
    .with_data_path("./data") // Same path
    .build()?; // WAL replay happens here

// Previously inserted data is available.
let points = storage.select("metric1", &[], 0, i64::MAX)?;
```
Recovery is idempotent — a high-water mark ensures WAL frames are never applied twice.
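The high-water-mark idea can be sketched as: each WAL frame carries a monotonically increasing sequence number, and replay skips frames at or below the last sequence already persisted to segments (illustrative, not tsink's actual frame format):

```rust
/// Replay WAL frames idempotently: frames with sequence numbers at or
/// below the persisted high-water mark were already flushed to
/// segments and are skipped.
fn replay(frames: &[(u64, f64)], high_water_mark: u64) -> Vec<f64> {
    frames
        .iter()
        .filter(|&&(seq, _)| seq > high_water_mark)
        .map(|&(_, value)| value)
        .collect()
}

fn main() {
    let frames = [(1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0)];
    // Segments already contain data up to frame 2, so only 3 and 4 apply.
    assert_eq!(replay(&frames, 2), vec![3.0, 4.0]);
    // Replaying again with the updated mark applies nothing — idempotent.
    assert_eq!(replay(&frames, 4), Vec::<f64>::new());
    println!("ok");
}
```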
### Multi-Dimensional Label Querying

```rust
use tsink::StorageBuilder;

let storage = StorageBuilder::new().build()?;
storage.insert_rows(&rows)?; // same metric, several label combinations

// Query all label combinations for a metric.
let all_results = storage.select_all("cpu_usage", 0, i64::MAX)?;
for result in all_results {
    // Each result groups one label set with its sorted points.
    println!("{result:?}");
}

// Discover all known series.
let all_series = storage.list_metrics()?;
for series in all_series {
    println!("{series:?}");
}
```
### Production Configuration

```rust
use std::time::Duration;
use tsink::{StorageBuilder, TimestampPrecision, WalSyncMode};

// Values without comments are illustrative.
let storage = StorageBuilder::new()
    .with_data_path("./data")
    .with_timestamp_precision(TimestampPrecision::Milliseconds)
    .with_retention(Duration::from_secs(30 * 24 * 60 * 60)) // 30 days
    .with_partition_duration(Duration::from_secs(6 * 60 * 60)) // 6-hour partitions
    .with_chunk_points(2048)
    .with_max_writers(8)
    .with_write_timeout(Duration::from_secs(30))
    .with_memory_limit(1024 * 1024 * 1024) // 1 GB
    .with_cardinality_limit(1_000_000)
    .with_wal_sync_mode(WalSyncMode::Periodic(Duration::from_secs(1)))
    .with_wal_buffer_size(16 * 1024) // 16 KB
    .build()?;
```
## Examples

| Example | Description |
|---|---|
| `basic_usage` | Simple insert and select with labels. |
| `persistent_storage` | Disk persistence and WAL recovery. |
| `production_example` | Resource limits, query options, and custom aggregation. |
| `comprehensive` | Multiple features: concurrent ops, retention, downsampling, and aggregation. |
## Benchmarks and Tests

### Benchmark Suites

| Benchmark | Description |
|---|---|
| `storage_benchmarks` | Criterion-based matrix of insert/select operations at various scales. |
| `workload` | Realistic workload simulation with multiple series, out-of-order writes, and bytes-per-point measurement. |
## Development Scripts

| Script | Description |
|---|---|
| `scripts/measure_perf.sh <quick\|full>` | Run Criterion benchmarks with quick or full sample sizes. |
| `scripts/measure_bpp.sh <quick\|full>` | Measure bytes-per-point compression efficiency. |
| `scripts/check_bench_regression.sh [threshold]` | Detect Criterion benchmark regressions beyond a threshold (default: 50%). |
The `measure_bpp.sh` script accepts environment variables for workload tuning:
| Variable | Description |
|---|---|
| `TSINK_ACTIVE_SERIES` | Number of concurrent series. |
| `TSINK_WARMUP_POINTS` / `TSINK_MEASURE_POINTS` | Points ingested during warmup and measurement phases. |
| `TSINK_BATCH_SIZE` | Insert batch size. |
| `TSINK_OOO_MAX_SECONDS` / `TSINK_OOO_PERMILLE` | Out-of-order write tuning. |
| `TSINK_RETENTION_SECONDS` / `TSINK_PARTITION_SECONDS` | Retention and partition windows. |
| `TSINK_FAIL_ON_TARGET` | Fail if compression target is not met. |
## Project Structure

```
tsink/
├── src/
│   ├── lib.rs              # Public API re-exports
│   ├── storage.rs          # Storage trait and StorageBuilder
│   ├── value.rs            # Value types, Codec, Aggregator traits
│   ├── label.rs            # Label handling and marshaling
│   ├── error.rs            # TsinkError and Result type
│   ├── wal.rs              # WalSyncMode
│   ├── async_storage.rs    # Async facade (feature-gated)
│   ├── cgroup.rs           # Container resource detection
│   ├── concurrency.rs      # Concurrency primitives
│   ├── mmap.rs             # Memory-mapped I/O
│   ├── engine/
│   │   ├── engine.rs           # Core storage engine
│   │   ├── compactor.rs        # LSM compaction
│   │   ├── segment.rs          # Segment file I/O
│   │   ├── chunk.rs            # Chunk structures
│   │   ├── encoder.rs          # Compression codecs
│   │   ├── query.rs            # Query execution
│   │   ├── series_registry.rs  # Series tracking and indexing
│   │   ├── index.rs            # Index structures
│   │   └── wal.rs              # WAL implementation
│   └── promql/             # PromQL engine (feature-gated)
│       ├── ast.rs          # Abstract syntax tree
│       ├── lexer.rs        # Tokenizer
│       ├── parser.rs       # Parser
│       └── eval/           # Evaluation engine
├── crates/
│   └── tsink-server/       # Prometheus-compatible network server
├── tests/                  # Integration tests
├── benches/                # Criterion benchmarks
├── examples/               # Usage examples
└── scripts/                # Development and CI scripts
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Minimum Supported Rust Version
Rust 2021 edition. Tested on stable.
## License
MIT