# tsink

A high-performance embedded time-series database for Rust.

## Overview
tsink is a lightweight, high-performance time-series database engine written in Rust. It provides efficient storage and retrieval of time-series data with automatic compression, time-based partitioning, and thread-safe operations.
## Key Features
- 🚀 High Performance: Gorilla compression achieves ~1.37 bytes per data point
- 🔒 Thread-Safe: Lock-free reads and concurrent writes with configurable worker pools
- 💾 Flexible Storage: Choose between in-memory or persistent disk storage
- 📊 Time Partitioning: Automatic data organization by configurable time ranges
- 🏷️ Label Support: Multi-dimensional metrics with key-value labels
- 📝 WAL Support: Write-ahead logging for durability and crash recovery
- 🗑️ Auto-Retention: Configurable automatic data expiration
- 🐳 Container-Aware: cgroup support for optimal resource usage in containers
- ⚡ Zero-Copy Reads: Memory-mapped files for efficient disk operations
## Installation

Add tsink to your `Cargo.toml`:

```toml
[dependencies]
tsink = "0.3.1"
```
## Quick Start

### Basic Usage

```rust
use tsink::{DataPoint, Row, StorageBuilder};

// In-memory storage with default settings
// (constructor/signature details are illustrative; see the API docs)
let storage = StorageBuilder::new().build()?;

// Insert a data point and read it back over a time range
storage.insert_rows(&[Row::new("metric1", DataPoint::new(1600000000, 0.1))])?;
let points = storage.select("metric1", &[], 1600000000, 1600000001)?;
```
### Persistent Storage

```rust
use tsink::StorageBuilder;
use std::time::Duration;

let storage = StorageBuilder::new()
    .with_data_path("./data")                            // Enable disk persistence
    .with_partition_duration(Duration::from_secs(3600))  // 1-hour partitions
    .with_retention(Duration::from_secs(7 * 24 * 3600))  // 7-day retention
    .with_wal_buffer_size(8192)                          // 8KB WAL buffer
    .build()?;
```
### Multi-Dimensional Metrics with Labels

```rust
use tsink::{DataPoint, Label, Row};

// Create metrics with labels for detailed categorization
// (constructor names are illustrative; see the API docs)
let rows = vec![Row::with_labels(
    "http_requests",
    vec![Label::new("method", "GET"), Label::new("status", "200")],
    DataPoint::new(1600000000, 1.0),
)];
storage.insert_rows(&rows)?;

// Query specific label combinations
let labels = vec![Label::new("method", "GET"), Label::new("status", "200")];
let points = storage.select("http_requests", &labels, 1600000000, 1600000010)?;

// Query all label combinations for a metric
let all_results = storage.select_all("http_requests", 1600000000, 1600000010)?;
for (labels, points) in all_results {
    println!("{:?}: {} points", labels, points.len());
}
```
## Architecture
tsink uses a linear-order partition model that divides time-series data into time-bounded chunks:
```
┌─────────────────────────────────────────┐
│              tsink Storage              │
├─────────────────────────────────────────┤
│                                         │
│  ┌───────────────┐   Active Partition   │
│  │ Memory Part.  │◄─ (Writable)         │
│  └───────────────┘                      │
│                                         │
│  ┌───────────────┐   Buffer Partition   │
│  │ Memory Part.  │◄─ (Out-of-order)     │
│  └───────────────┘                      │
│                                         │
│  ┌───────────────┐                      │
│  │ Disk Part. 1  │◄─ Read-only          │
│  └───────────────┘   (Memory-mapped)    │
│                                         │
│  ┌───────────────┐                      │
│  │ Disk Part. 2  │◄─ Read-only          │
│  └───────────────┘                      │
│                 ...                     │
└─────────────────────────────────────────┘
```
### Partition Lifecycle
- Active Partition: Accepts new writes, kept in memory
- Buffer Partition: Handles out-of-order writes within recent time window
- Flushing: When active partition is full, it's flushed to disk
- Disk Partitions: Read-only, memory-mapped for efficient queries
- Expiration: Old partitions are automatically removed based on retention
### Benefits
- Fast Queries: Skip irrelevant partitions based on time range
- Efficient Memory: Only recent data stays in RAM
- Low Write Amplification: Sequential writes, no compaction needed
- SSD-Friendly: Minimal random I/O patterns
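The "fast queries" point above boils down to an overlap test per partition: any partition whose time bounds miss the query range is never touched. A standalone sketch of that pruning step (the `Partition` struct and `overlapping` function are hypothetical, not tsink's API):

```rust
// Each partition covers a bounded time range.
struct Partition {
    min_ts: i64,
    max_ts: i64,
}

// Keep only partitions whose range overlaps the half-open query [start, end).
fn overlapping<'a>(parts: &'a [Partition], start: i64, end: i64) -> Vec<&'a Partition> {
    parts
        .iter()
        .filter(|p| p.max_ts >= start && p.min_ts < end)
        .collect()
}

fn main() {
    // Three one-hour partitions; a query over the middle hour touches one.
    let parts = vec![
        Partition { min_ts: 0, max_ts: 3599 },
        Partition { min_ts: 3600, max_ts: 7199 },
        Partition { min_ts: 7200, max_ts: 10799 },
    ];
    assert_eq!(overlapping(&parts, 3600, 7200).len(), 1);
    println!("partitions scanned: {}", overlapping(&parts, 3600, 7200).len());
}
```

Because disk partitions are read-only and time-ordered, this check is all the query planner needs before touching memory-mapped data.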
## Configuration

### StorageBuilder Options

| Option | Description | Default |
|---|---|---|
| `with_data_path` | Directory for persistent storage | None (in-memory) |
| `with_retention` | How long to keep data | 14 days |
| `with_timestamp_precision` | Timestamp precision (ns/μs/ms/s) | Nanoseconds |
| `with_max_writers` | Maximum concurrent write workers | CPU count |
| `with_write_timeout` | Timeout for write operations | 30 seconds |
| `with_partition_duration` | Time range per partition | 1 hour |
| `with_wal_enabled` | Enable write-ahead logging | `true` |
| `with_wal_buffer_size` | WAL buffer size in bytes | 4096 |
### Example Configuration

```rust
use tsink::{StorageBuilder, TimestampPrecision};
use std::time::Duration;

// Values below are illustrative
let storage = StorageBuilder::new()
    .with_data_path("./tsink-data")
    .with_retention(Duration::from_secs(30 * 24 * 3600))     // 30 days
    .with_timestamp_precision(TimestampPrecision::Milliseconds)
    .with_max_writers(8)
    .with_write_timeout(Duration::from_secs(30))
    .with_partition_duration(Duration::from_secs(6 * 3600))  // 6 hours
    .with_wal_buffer_size(16384)                             // 16KB
    .build()?;
```
## Compression
tsink uses the Gorilla compression algorithm, which is specifically designed for time-series data:
- Delta-of-delta encoding for timestamps
- XOR compression for floating-point values
- Typical compression ratio: ~1.37 bytes per data point
This means a data point that would normally take 16 bytes (8 bytes timestamp + 8 bytes value) is compressed to less than 2 bytes on average.
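Both ideas can be sketched in a few lines of plain Rust (illustrative only, not tsink's `encoding` module). Regularly spaced timestamps produce delta-of-deltas of zero, and repeated or slowly changing values XOR to words that are mostly zero bits, which is what makes the bit-level encoding so compact:

```rust
// Delta-of-delta: store the first timestamp raw, then the change in the delta.
fn delta_of_deltas(timestamps: &[i64]) -> Vec<i64> {
    let mut out = Vec::new();
    let mut prev = 0i64;
    let mut prev_delta = 0i64;
    for (i, &t) in timestamps.iter().enumerate() {
        if i == 0 {
            out.push(t); // first timestamp stored as-is
        } else {
            let delta = t - prev;
            out.push(delta - prev_delta); // 0 for perfectly regular intervals
            prev_delta = delta;
        }
        prev = t;
    }
    out
}

// XOR: each value is XORed with the previous one; repeats become 0.
fn xor_values(values: &[f64]) -> Vec<u64> {
    let mut out = Vec::new();
    let mut prev_bits = 0u64;
    for &v in values {
        let bits = v.to_bits();
        out.push(bits ^ prev_bits);
        prev_bits = bits;
    }
    out
}

fn main() {
    // Regularly spaced timestamps collapse to zeros after the header.
    let dods = delta_of_deltas(&[1600000000, 1600000010, 1600000020, 1600000030]);
    assert_eq!(dods, vec![1600000000, 10, 0, 0]);

    // Repeated values XOR to zero.
    let xs = xor_values(&[0.5, 0.5, 0.5]);
    assert_eq!(&xs[1..], &[0, 0]);
}
```

The real encoder then writes these residuals with variable-width bit codes, which is where the ~1.37 bytes-per-point average comes from.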
## Performance
Benchmarks on AMD Ryzen 7940HS (single core):
| Operation | Throughput | Latency |
|---|---|---|
| Insert single point | 10M ops/sec | ~100ns |
| Batch insert (1000) | 15M points/sec | ~67μs/batch |
| Select 1K points | 4.5M queries/sec | ~220ns |
| Select 1M points | 3.4M queries/sec | ~290ns |
Run benchmarks yourself:

```shell
cargo bench
```
## Module Overview

### Core Modules

| Module | Description |
|---|---|
| `storage` | Main storage engine with builder pattern configuration |
| `partition` | Time-based data partitioning (memory and disk implementations) |
| `encoding` | Gorilla compression for efficient time-series storage |
| `wal` | Write-ahead logging for durability and crash recovery |
| `label` | Multi-dimensional metric labeling and marshaling |
### Infrastructure Modules

| Module | Description |
|---|---|
| `cgroup` | Container-aware CPU and memory limit detection |
| `mmap` | Platform-optimized memory-mapped file operations |
| `concurrency` | Worker pools, semaphores, and rate limiters |
| `bstream` | Bit-level streaming for compression algorithms |
| `list` | Thread-safe partition list management |
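The role of a bit-level stream like `bstream` can be illustrated with a minimal bit writer (a sketch, not tsink's actual `bstream` API): compression codes are rarely byte-aligned, so the writer packs variable-width fields into bytes:

```rust
// Minimal MSB-first bit writer: the primitive Gorilla-style codecs build on.
struct BitWriter {
    bytes: Vec<u8>,
    nbits: usize, // bits already used in the last byte (0..8)
}

impl BitWriter {
    fn new() -> Self {
        BitWriter { bytes: Vec::new(), nbits: 0 }
    }

    fn write_bit(&mut self, bit: bool) {
        if self.nbits == 0 {
            self.bytes.push(0); // start a fresh byte
        }
        if bit {
            let last = self.bytes.last_mut().unwrap();
            *last |= 1 << (7 - self.nbits);
        }
        self.nbits = (self.nbits + 1) % 8;
    }

    // Write the low `width` bits of `value`, most significant first.
    fn write_bits(&mut self, value: u64, width: usize) {
        for i in (0..width).rev() {
            self.write_bit((value >> i) & 1 == 1);
        }
    }
}

fn main() {
    let mut w = BitWriter::new();
    w.write_bits(0b1010, 4);
    w.write_bits(0b0101, 4);
    assert_eq!(w.bytes, vec![0b1010_0101]); // two 4-bit fields share one byte
}
```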
### Utility Modules

| Module | Description |
|---|---|
| `error` | Comprehensive error types with context |
## Advanced Usage

### Label Querying
tsink provides querying capabilities for metrics with labels:
```rust
use tsink::{DataPoint, Label, Row};

// Insert metrics with various label combinations
// (constructor names and result shapes are illustrative; see the API docs)
let rows = vec![
    Row::with_labels("api_requests",
        vec![Label::new("method", "GET")], DataPoint::new(1600000000, 1.0)),
    Row::with_labels("api_requests",
        vec![Label::new("method", "POST")], DataPoint::new(1600000000, 2.0)),
];
storage.insert_rows(&rows)?;

// Method 1: Query with exact label match
let labels = vec![Label::new("method", "GET")];
let points = storage.select("api_requests", &labels, 1600000000, 1600000010)?;

// Method 2: Query all label combinations (discovers all variations)
let all_results = storage.select_all("api_requests", 1600000000, 1600000010)?;

// Process results grouped by labels
for (labels, points) in all_results {
    println!("{:?}: {} points", labels, points.len());
}
```
### Concurrent Operations
tsink is designed for high-concurrency scenarios:
```rust
use std::sync::Arc;
use std::thread;
use tsink::{DataPoint, Row, StorageBuilder};

let storage = Arc::new(StorageBuilder::new().build()?);

// Spawn multiple writer threads
let mut handles = vec![];
for worker_id in 0..10 {
    let storage = Arc::clone(&storage);
    handles.push(thread::spawn(move || {
        let point = DataPoint::new(1600000000 + worker_id, worker_id as f64);
        storage.insert_rows(&[Row::new("concurrent_metric", point)]).unwrap();
    }));
}

// Wait for all threads
for handle in handles {
    handle.join().unwrap();
}
```
### Out-of-Order Insertion
tsink handles out-of-order data points automatically:
```rust
// Insert data points in random order
let rows = vec![
    Row::new("metric", DataPoint::new(1600000300, 3.0)),
    Row::new("metric", DataPoint::new(1600000100, 1.0)),
    Row::new("metric", DataPoint::new(1600000200, 2.0)),
];
storage.insert_rows(&rows)?;

// Query returns points in correct chronological order
let points = storage.select("metric", &[], 1600000000, 1600000400)?;
assert!(points.windows(2).all(|w| w[0].timestamp <= w[1].timestamp));
```
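Conceptually, the buffer partition described in the architecture section is what makes this work: it accepts recent points in any arrival order and sorts them when read or flushed. A toy sketch of that idea (hypothetical types, not tsink's internals):

```rust
#[derive(Debug, Clone, PartialEq)]
struct DataPoint {
    timestamp: i64,
    value: f64,
}

// A write buffer that tolerates out-of-order arrivals.
struct Buffer {
    points: Vec<DataPoint>,
}

impl Buffer {
    fn insert(&mut self, p: DataPoint) {
        self.points.push(p); // accept in any order
    }

    // On flush (or read), points are handed out chronologically.
    fn drain_sorted(&mut self) -> Vec<DataPoint> {
        self.points.sort_by_key(|p| p.timestamp);
        std::mem::take(&mut self.points)
    }
}

fn main() {
    let mut buf = Buffer { points: Vec::new() };
    for ts in [30, 10, 20] {
        buf.insert(DataPoint { timestamp: ts, value: ts as f64 });
    }
    let ts: Vec<i64> = buf.drain_sorted().iter().map(|p| p.timestamp).collect();
    assert_eq!(ts, vec![10, 20, 30]); // chronological despite arrival order
}
```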
### Container Deployment

tsink automatically detects container resource limits:

```rust
// tsink reads cgroup limits automatically
let storage = StorageBuilder::new()
    .with_max_writers(0) // 0 = auto-detect from cgroup
    .build()?;

// In a container with a 2-CPU limit this uses 2 workers,
// even if the host has 16 CPUs.
```
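Under cgroup v2, the CPU limit a detector like tsink's `cgroup` module would read lives in `cpu.max`, whose contents are `<quota> <period>` in microseconds, or `max <period>` when unlimited. A standalone parsing sketch (not tsink's actual implementation):

```rust
// Parse cgroup v2 `cpu.max` contents into an effective CPU count.
// Returns None when there is no limit (caller falls back to host CPU count).
fn cpus_from_cpu_max(contents: &str) -> Option<usize> {
    let mut parts = contents.split_whitespace();
    let quota = parts.next()?;
    let period: f64 = parts.next()?.parse().ok()?;
    if quota == "max" {
        return None; // unlimited
    }
    let quota: f64 = quota.parse().ok()?;
    Some((quota / period).ceil() as usize)
}

fn main() {
    // A container limited to 2 CPUs typically exposes "200000 100000".
    assert_eq!(cpus_from_cpu_max("200000 100000"), Some(2));
    // No limit set: fall back to the host CPU count.
    assert_eq!(cpus_from_cpu_max("max 100000"), None);
}
```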
### WAL Recovery

After a crash, tsink automatically recovers from the WAL:

```rust
// First run: data is written to the WAL
let storage = StorageBuilder::new()
    .with_data_path("./data")
    .build()?;
storage.insert_rows(&rows)?;
// Crash happens here...

// Next run: data is recovered from the WAL automatically
let storage = StorageBuilder::new()
    .with_data_path("./data") // Same path
    .build()?; // Recovery happens here

// Previously inserted data is available
let points = storage.select("metric", &[], 1600000000, 1600000001)?;
```
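The recovery idea can be sketched independently of tsink: append fixed-size records to a log, then replay whatever survived the crash. The record format below is made up for illustration and is not tsink's WAL layout:

```rust
use std::io::{Cursor, Read, Write};

// Append one (timestamp, value) record as 16 little-endian bytes.
fn append(log: &mut Vec<u8>, ts: i64, value: f64) {
    log.write_all(&ts.to_le_bytes()).unwrap();
    log.write_all(&value.to_le_bytes()).unwrap();
}

// Replay every complete record found in the log.
fn replay(log: &[u8]) -> Vec<(i64, f64)> {
    let mut cur = Cursor::new(log);
    let mut out = Vec::new();
    let mut buf = [0u8; 8];
    while cur.read_exact(&mut buf).is_ok() {
        let ts = i64::from_le_bytes(buf);
        cur.read_exact(&mut buf).unwrap();
        out.push((ts, f64::from_le_bytes(buf)));
    }
    out
}

fn main() {
    let mut log = Vec::new();
    append(&mut log, 1600000000, 0.1);
    append(&mut log, 1600000060, 0.2);
    // Pretend the process crashed here; `log` is what survived on disk.
    let recovered = replay(&log);
    assert_eq!(recovered.len(), 2);
    assert_eq!(recovered[0].0, 1600000000);
}
```

In the real engine the buffered WAL (see `with_wal_buffer_size`) trades a small durability window for write throughput.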
## Examples

Run the comprehensive example showcasing all features, or try one of the focused examples with `cargo run --example <name>`:

- `basic_usage` - Simple insert and query operations
- `persistent_storage` - Disk-based storage with WAL
- `production_example` - Production-ready configuration
## Testing

Run the test suite:

```shell
# Run all tests
cargo test

# Run with verbose output
cargo test -- --nocapture

# Run specific test module
cargo test storage
```
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

### Development Setup

```shell
# Clone the repository
git clone <repository-url>
cd tsink

# Run tests
cargo test

# Run benchmarks
cargo bench

# Check formatting
cargo fmt -- --check

# Run clippy
cargo clippy
```
## License
- MIT License