manifold-timeseries
Time-series storage optimizations for the Manifold embedded database.
Overview
manifold-timeseries provides ergonomic, type-safe wrappers around Manifold's core primitives for storing and querying time-series data, with multi-granularity downsampling and retention policies. It does not implement time-series analytics (forecasting, anomaly detection). Instead, it focuses on efficient persistent storage and provides integration traits for external analytics libraries.
Features
- Dual encoding strategies - Absolute (default) or delta encoding for timestamps
- Multi-granularity tables - Raw, minute, hour, and day aggregates
- Manual downsampling - Compute aggregates (min, max, avg, sum, count, last)
- Retention policies - Time-based cleanup of old data
- High performance - Leverages Manifold's WAL group commit and ordered key-value storage
- Integration ready - TimeSeriesSource trait for external analytics libraries
Quick Start
Basic Usage
use ColumnFamilyDatabase;
use ;
// Open database and column family
let db = open?;
let cf = db.column_family_or_create?;
// Write time series data
// Read time series data
let read_txn = cf.begin_read?;
let ts = open?;
// Query a specific point
if let Some = ts.get?
// Range query
let start = 1609459200000;
let end = 1609459260000;
for point in ts.range?
Batch Operations
For high-throughput metric ingestion:
let points = vec!;
let write_txn = cf.begin_write?;
let mut ts = open?;
// Batch write for better performance
ts.write_batch?;
drop;
write_txn.commit?;
Timestamp Encoding Strategies
Absolute Encoding (Default)
Stores timestamps as 8-byte big-endian u64 values:
use AbsoluteEncoding;
let mut ts = open?;
Best for:
- Sparse or irregular time series
- Random access patterns
- Ad-hoc queries
Storage: 8 bytes per timestamp (fixed)
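Big-endian matters here because an ordered key-value store compares keys byte by byte, and only the big-endian form makes that byte order agree with numeric order. The following is a minimal sketch using only the standard library; the helper names encode_absolute and decode_absolute are hypothetical and are not part of the crate's API.

```rust
/// Hypothetical helper: encode a millisecond timestamp as 8 big-endian bytes.
/// Byte-wise comparison of the result matches numeric comparison of the input.
fn encode_absolute(timestamp_ms: u64) -> [u8; 8] {
    timestamp_ms.to_be_bytes()
}

/// Hypothetical helper: decode the 8-byte big-endian form back into a u64.
fn decode_absolute(bytes: [u8; 8]) -> u64 {
    u64::from_be_bytes(bytes)
}
```

With a little-endian layout the same comparison would sort 256 before 255, which would break range scans over timestamps.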
Delta Encoding
Stores timestamps as varint-compressed deltas with periodic checkpoints:
use DeltaEncoding;
let mut ts = open?;
Best for:
- Dense, regular-interval data (e.g., 1-second IoT sensors)
- Storage-constrained environments
- Sequential scan workloads
Storage: 1-9 bytes per timestamp (variable, typically 1-2 bytes for regular intervals)
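To illustrate why regular intervals compress well, here is a sketch of delta encoding with a generic unsigned LEB128 varint. It is an assumption-laden illustration, not the crate's wire format: the actual varint scheme and checkpoint interval are not described here, so this version writes a single 8-byte checkpoint followed by deltas. All function names are hypothetical.

```rust
/// Append `value` as an unsigned LEB128 varint (7 payload bits per byte).
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80); // high bit set: more bytes follow
        value >>= 7;
    }
    out.push(value as u8);
}

/// Read one varint starting at `*pos`, advancing `*pos` past it.
fn read_varint(bytes: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // malformed: too many continuation bytes
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Encode non-decreasing timestamps: one 8-byte big-endian checkpoint,
/// then a varint delta per following timestamp. Returns None if out of order.
fn encode_deltas(timestamps: &[u64]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let (first, rest) = match timestamps.split_first() {
        Some((first, rest)) => (*first, rest),
        None => return Some(out),
    };
    out.extend_from_slice(&first.to_be_bytes());
    let mut prev = first;
    for &ts in rest {
        write_varint(ts.checked_sub(prev)?, &mut out);
        prev = ts;
    }
    Some(out)
}

/// Inverse of `encode_deltas`.
fn decode_deltas(bytes: &[u8]) -> Option<Vec<u64>> {
    if bytes.is_empty() {
        return Some(Vec::new());
    }
    let mut head = [0u8; 8];
    head.copy_from_slice(bytes.get(..8)?);
    let mut prev = u64::from_be_bytes(head);
    let mut out = vec![prev];
    let mut pos = 8;
    while pos < bytes.len() {
        prev = prev.checked_add(read_varint(bytes, &mut pos)?)?;
        out.push(prev);
    }
    Some(out)
}
```

Under this scheme a 1-second interval expressed in milliseconds (delta 1000) takes two bytes, and any delta below 128 takes one.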
Multi-Granularity Support
Each TimeSeriesTable maintains four internal tables for efficient queries at different time scales:
- Raw - Original data points (per-second or per-millisecond)
- Minute - Aggregated per-minute data
- Hour - Aggregated per-hour data
- Day - Aggregated per-day data
Aggregates
Each aggregate contains:
- min: f32 - Minimum value in the window
- max: f32 - Maximum value in the window
- sum: f32 - Sum of all values
- count: u64 - Number of data points
- last: f32 - Most recent value
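The fields above are enough to derive an average (sum / count) and to combine smaller windows into larger ones without rereading raw data. The sketch below mirrors the listed fields; the methods from_first, push, merge, and avg are hypothetical names chosen for illustration and may not match the crate's Aggregate type.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Aggregate {
    min: f32,
    max: f32,
    sum: f32,
    count: u64,
    last: f32,
}

impl Aggregate {
    /// Start a window from its first value.
    fn from_first(value: f32) -> Self {
        Aggregate { min: value, max: value, sum: value, count: 1, last: value }
    }

    /// Fold one more value (in time order) into the window.
    fn push(&mut self, value: f32) {
        self.min = self.min.min(value);
        self.max = self.max.max(value);
        self.sum += value;
        self.count += 1;
        self.last = value;
    }

    /// Combine with a later window, e.g. when rolling minutes up into an hour.
    fn merge(&mut self, later: &Aggregate) {
        self.min = self.min.min(later.min);
        self.max = self.max.max(later.max);
        self.sum += later.sum;
        self.count += later.count;
        self.last = later.last;
    }

    /// The average is derived rather than stored.
    fn avg(&self) -> Option<f32> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f32)
        }
    }
}
```

Storing sum and count instead of a precomputed average keeps merges exact with respect to the stored values: averaging two averages would weight windows incorrectly when their counts differ.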
Downsampling
Convert raw data to aggregates:
use Granularity;
let write_txn = cf.begin_write?;
let mut ts = open?;
// Downsample last hour of raw data to minute aggregates
let start_ms = now_ms - ; // 1 hour ago
let count = ts.downsample_to_minute?;
println!;
// Downsample minute data to hour aggregates
let count = ts.downsample_minute_to_hour?;
println!;
drop;
write_txn.commit?;
Query aggregates:
let read_txn = cf.begin_read?;
let ts = open?;
// Get minute-level aggregate
let agg = ts.get_aggregate?;
if let Some = agg
// Range query over hour aggregates
for result in ts.range_aggregates?
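Conceptually, downsampling assigns each raw point to a fixed-width window and reduces the points in that window. The sketch below shows only the window assignment, assuming timestamps are milliseconds since the Unix epoch (UTC). The Granularity enum here is a local stand-in for the crate's type, and width_ms, bucket_start, and bucket_sums are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Local stand-in for the crate's granularity levels (assumption).
#[derive(Debug, Clone, Copy)]
enum Granularity {
    Minute,
    Hour,
    Day,
}

impl Granularity {
    /// Window width in milliseconds.
    fn width_ms(self) -> u64 {
        match self {
            Granularity::Minute => 60_000,
            Granularity::Hour => 3_600_000,
            Granularity::Day => 86_400_000,
        }
    }

    /// Round a timestamp down to the start of its window.
    fn bucket_start(self, timestamp_ms: u64) -> u64 {
        timestamp_ms - timestamp_ms % self.width_ms()
    }
}

/// Group points by window and keep (count, sum) per window.
fn bucket_sums(points: &[(u64, f32)], granularity: Granularity) -> BTreeMap<u64, (u64, f32)> {
    let mut buckets = BTreeMap::new();
    for &(timestamp_ms, value) in points {
        let entry = buckets
            .entry(granularity.bucket_start(timestamp_ms))
            .or_insert((0u64, 0.0f32));
        entry.0 += 1;
        entry.1 += value;
    }
    buckets
}
```

A full downsampler would track min, max, and last per window in the same pass; they are omitted here to keep the bucketing step in focus.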
Retention Policies
Delete old data to manage storage:
use Duration;
let write_txn = cf.begin_write?;
let mut ts = open?;
// Keep only last 7 days of raw data
ts.apply_retention?;
// Apply multiple retention policies at once
ts.apply_all_retentions?;
drop;
write_txn.commit?;
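A time-based retention policy boils down to computing a cutoff (now minus the retention window) and deleting every key older than it. The sketch below uses a std BTreeMap keyed by timestamp as a stand-in for a raw table; prune_older_than is a hypothetical function, not the crate's apply_retention, and it assumes Duration refers to std::time::Duration.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

/// Remove entries with timestamp < now_ms - keep; returns how many were removed.
fn prune_older_than(table: &mut BTreeMap<u64, f32>, now_ms: u64, keep: Duration) -> usize {
    // Clamp the retention window to u64 milliseconds.
    let keep_ms = keep.as_millis().min(u64::MAX as u128) as u64;
    let cutoff = now_ms.saturating_sub(keep_ms);
    // split_off returns everything at or after the cutoff; the rest is stale.
    let kept = table.split_off(&cutoff);
    let removed = table.len();
    *table = kept;
    removed
}
```

Because keys are ordered by timestamp, the stale entries form one contiguous prefix, so cleanup does not need to inspect the data that is kept.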
Architecture
Storage Layout
Each time series table creates four internal Manifold tables:
{name}_raw → (timestamp: u64, series_id: &str) → value: f32
{name}_minute → (timestamp: u64, series_id: &str) → aggregate: Aggregate
{name}_hour → (timestamp: u64, series_id: &str) → aggregate: Aggregate
{name}_day → (timestamp: u64, series_id: &str) → aggregate: Aggregate
All tables share the same composite key structure for efficient range queries.
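Because the timestamp comes first in the key, a time window maps to one contiguous key range that covers every series. The sketch below assumes the key is serialized as an 8-byte big-endian timestamp followed by the series_id bytes; that byte layout is an assumption made for illustration, and encode_key and range_values are hypothetical helpers operating on a std BTreeMap rather than a Manifold table.

```rust
use std::collections::BTreeMap;

/// Assumed key layout: 8-byte big-endian timestamp, then the series id bytes.
fn encode_key(timestamp_ms: u64, series_id: &str) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + series_id.len());
    key.extend_from_slice(&timestamp_ms.to_be_bytes());
    key.extend_from_slice(series_id.as_bytes());
    key
}

/// Collect values with start_ms <= timestamp < end_ms, across all series.
fn range_values(table: &BTreeMap<Vec<u8>, f32>, start_ms: u64, end_ms: u64) -> Vec<f32> {
    if start_ms >= end_ms {
        return Vec::new();
    }
    // An empty series id sorts before every real id at the same timestamp.
    let lo = encode_key(start_ms, "");
    let hi = encode_key(end_ms, "");
    table.range(lo..hi).map(|(_, value)| *value).collect()
}
```

One consequence of this ordering is that a scan over a time window returns points from all series interleaved by timestamp, so a query for a single series has to filter on series_id within the range.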
Performance Characteristics
- Write (single point): O(log n) B-tree insert
- Write (batch): Amortized O(log n) with WAL group commit
- Read (single point): O(log n) lookup
- Range query: O(log n) + O(k) where k = points in range
- Downsampling: O(k) scan + O(m log n) aggregate writes where m = buckets
- Key size: 8 bytes (timestamp) + series_id length
- Value size: 4 bytes (raw) or 24 bytes (aggregate)
Aggregate Storage Format
Aggregates are stored as fixed-width 24-byte values:
[min: f32][max: f32][sum: f32][count: u64][last: f32]
 4 bytes   4 bytes   4 bytes    8 bytes     4 bytes
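The layout above can be packed and unpacked with plain byte copies. The sketch below follows the field order and widths shown, but the byte order within each field is not stated here, so little-endian is an assumption. encode_aggregate and decode_aggregate are hypothetical helpers.

```rust
/// Pack the five fields into the 24-byte layout (little-endian assumed).
fn encode_aggregate(min: f32, max: f32, sum: f32, count: u64, last: f32) -> [u8; 24] {
    let mut buf = [0u8; 24];
    buf[0..4].copy_from_slice(&min.to_le_bytes());
    buf[4..8].copy_from_slice(&max.to_le_bytes());
    buf[8..12].copy_from_slice(&sum.to_le_bytes());
    buf[12..20].copy_from_slice(&count.to_le_bytes());
    buf[20..24].copy_from_slice(&last.to_le_bytes());
    buf
}

/// Unpack a 24-byte value into (min, max, sum, count, last).
fn decode_aggregate(buf: &[u8; 24]) -> (f32, f32, f32, u64, f32) {
    // Read one f32 field from the given byte range.
    let read_f32 = |range: std::ops::Range<usize>| {
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(&buf[range]);
        f32::from_le_bytes(bytes)
    };
    let mut count_bytes = [0u8; 8];
    count_bytes.copy_from_slice(&buf[12..20]);
    (
        read_f32(0..4),
        read_f32(4..8),
        read_f32(8..12),
        u64::from_le_bytes(count_bytes),
        read_f32(20..24),
    )
}
```

A fixed width means a reader can locate any field by offset without parsing the fields before it.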
Examples
The crate includes three examples that demonstrate typical usage:
1. Metrics Collection (examples/metrics_collection.rs)
Real system metrics collection:
- CPU usage tracking
- Memory monitoring
- Using sysinfo crate for real data
- Time series storage patterns
2. IoT Sensors (examples/iot_sensors.rs)
IoT sensor data simulation:
- Multiple sensor types (temperature, humidity, pressure)
- Batch ingestion
- Range queries
- Statistics computation
3. Downsampling Lifecycle (examples/downsampling_lifecycle.rs)
Complete downsampling workflow:
- Raw data ingestion
- Multi-level downsampling (raw → minute → hour → day)
- Retention policy application
- Query optimization strategies
Use Cases
- Application monitoring - System metrics, performance counters
- IoT data - Sensor readings, telemetry
- Financial data - Stock prices, trading volumes
- Analytics - User activity, event tracking
- DevOps - Infrastructure monitoring, alerting
- Scientific data - Experimental measurements, logging
Combining with Other Domain Layers
manifold-timeseries can be used alongside other Manifold domain layers in the same database, with each layer in its own column family:
// Store entity embeddings in vectors
let vectors_cf = db.column_family_or_create?;
let mut vectors = open?;
vectors.insert?;
// Store entity relationships in graph
let graph_cf = db.column_family_or_create?;
let mut graph = open?;
graph.add_edge?;
// Store entity metrics in time series
let metrics_cf = db.column_family_or_create?;
let mut ts = open?;
ts.write?;
Integration with Analytics Libraries
The TimeSeriesSource trait enables integration with external time-series analytics libraries:
use TimeSeriesSource;
let read_txn = cf.begin_read?;
let ts = open?;
// Use the trait to integrate with external libraries
let points: = ts.range_raw?
.?;
// Pass to analytics library for forecasting, anomaly detection, etc.
// (example with hypothetical library)
let forecast = forecast?;
Requirements
- Rust 1.70+ (for const generics)
- manifold version 3.1+
Performance Tips
- Use batch operations for bulk ingestion - reduces transaction overhead
- Downsample regularly - Query aggregates instead of raw data when possible
- Apply retention policies - Delete old raw data after downsampling
- Pre-sort batch data when possible - set sorted: true for better performance
- Use appropriate granularity - Don't query raw data for long time ranges
- Choose the right encoding:
  - Absolute: Default, supports random access
  - Delta: Better compression for regular-interval data
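As an illustration of the pre-sorting tip, a batch can be sorted into key order before it is written. The sketch below assumes that "sorted" means the (timestamp, series_id) order used by the storage keys, and it uses a hypothetical tuple layout for points; neither is confirmed by the crate's API.

```rust
/// Hypothetical point layout: (series_id, timestamp_ms, value).
type Point<'a> = (&'a str, u64, f32);

/// Sort a batch into assumed key order: timestamp first, then series id.
fn sort_batch(points: &mut [Point<'_>]) {
    points.sort_unstable_by(|a, b| (a.1, a.0).cmp(&(b.1, b.0)));
}

/// Check whether a batch is already in that order.
fn is_key_ordered(points: &[Point<'_>]) -> bool {
    points.windows(2).all(|w| (w[0].1, w[0].0) <= (w[1].1, w[1].0))
}
```

Checking with is_key_ordered before claiming a batch is sorted is cheap (one linear pass) compared with the cost of sorting.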
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
at your option.
Contributing
Contributions are welcome! This crate follows the patterns established in the manifold domain layer architecture.
Related Crates
- manifold - Core embedded database
- manifold-vectors - Vector storage for embeddings
- manifold-graph - Graph storage for relationships