Expand description
Parquet-compatible span record schema (Sprint 40 - Golden Thread Core)
This module defines the canonical schema for storing OpenTelemetry spans in trueno-db’s Parquet-backed storage. The schema is optimized for:
- Query performance: Flat structure for predicate pushdown
- Compression: Columnar layout with RLE/dictionary encoding
- W3C Trace Context: Native support for traceparent format
- Causal ordering: Lamport logical clock for happens-before
§Design Principles
- Flat Structure: Parquet performs best with flat schemas (no deep nesting)
- Fixed-size IDs:
trace_id(16 bytes),span_id(8 bytes) for efficient indexing - JSON Attributes: Flexible key-value pairs stored as JSON string
- Timestamp Precision: Nanosecond precision for microsecond-level tracing
- Logical Causality: Lamport clock field for provable ordering
§Parquet Schema Mapping
SpanRecord (Rust) → Parquet Physical Type
├─ trace_id: [u8; 16] → FIXED_LEN_BYTE_ARRAY(16)
├─ span_id: [u8; 8] → FIXED_LEN_BYTE_ARRAY(8)
├─ parent_span_id: Option<..> → FIXED_LEN_BYTE_ARRAY(8), nullable=true
├─ span_name: String → BYTE_ARRAY (UTF8)
├─ span_kind: SpanKind → INT32 (enum)
├─ start_time_nanos: u64 → INT64
├─ end_time_nanos: u64 → INT64
├─ logical_clock: u64 → INT64 (Lamport timestamp)
├─ duration_nanos: u64 → INT64 (computed: end - start)
├─ status_code: StatusCode → INT32 (enum)
├─ status_message: String → BYTE_ARRAY (UTF8)
├─ attributes_json: String → BYTE_ARRAY (UTF8) - JSON map
├─ resource_json: String → BYTE_ARRAY (UTF8) - JSON map
├─ process_id: u32 → INT32
└─ thread_id: u64 → INT64§Query Patterns
The schema is optimized for these access patterns:
-- Critical path queries (p95 <20ms for 1M spans)
SELECT * FROM spans WHERE trace_id = ?
SELECT * FROM spans WHERE trace_id = ? ORDER BY logical_clock
SELECT * FROM spans WHERE trace_id = ? AND parent_span_id IS NULL
-- Temporal range queries
SELECT * FROM spans WHERE start_time_nanos BETWEEN ? AND ?
-- Process/thread filtering
SELECT * FROM spans WHERE process_id = ? AND thread_id = ?
-- Status filtering (error analysis)
SELECT * FROM spans WHERE status_code = 2 -- ERROR§Peer-Reviewed Foundation
-
Melnik et al. (2010). “Dremel: Interactive Analysis of Web-Scale Datasets.” Google.
- Finding: Columnar storage with nested encoding enables <1s queries on trillion-row tables
- Application: Parquet schema optimized for predicate pushdown
-
Lamb et al. (2012). “The Vertica Analytic Database.” VLDB.
- Finding: Column-store compression (RLE, dictionary) achieves 10-50× reduction
- Application: Fixed-size IDs and enums for optimal compression
Structs§
- Span
Record - Span record compatible with trueno-db Parquet storage
Enums§
- Span
Kind - Span kind (OpenTelemetry semantic convention)
- Status
Code - Span status code (OpenTelemetry semantic convention)