otlp2parquet
What if your observability data was just a bunch of Parquet files?
Receive OpenTelemetry logs, metrics, and traces and write them as Parquet files to local disk, cloud storage, or Apache Iceberg. Query with DuckDB, Spark, or anything that reads Parquet.

Quick Start
See Deploy to the Cloud below for running as an AWS Lambda function or a Cloudflare Worker.
Build and run from a source checkout (the `cargo run` line is an assumed invocation; adapt to your setup):

```sh
# requires rust toolchain: `curl https://sh.rustup.rs -sSf | sh`
cargo run --release
```

The server starts on http://localhost:4318. Send a simple OTLP/HTTP log:
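For example, with curl and a minimal OTLP JSON payload (the `service.name` value is arbitrary; it determines the `{service}` segment of the partition path):

```sh
curl -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -d '{
    "resourceLogs": [{
      "resource": {
        "attributes": [
          { "key": "service.name", "value": { "stringValue": "demo" } }
        ]
      },
      "scopeLogs": [{
        "logRecords": [{
          "timeUnixNano": "1700000000000000000",
          "severityText": "INFO",
          "body": { "stringValue": "hello from otlp2parquet" }
        }]
      }]
    }]
  }'
```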
Query it with DuckDB (the `data/` glob and the PascalCase column names below are assumptions; adjust the path to wherever your files land, and see Stable Surface for the schema):

```sh
# see https://duckdb.org/install
duckdb -c "SELECT Timestamp, ServiceName, Body FROM read_parquet('data/logs/**/*.parquet')"
```
Why?
- Keep monitoring data around a long time — Parquet on S3 can be 90% cheaper than large monitoring vendors for long-term analytics.
- Query with good tools — DuckDB, Spark, Athena, Trino, Pandas
- Easy Iceberg — Optional catalog support, including S3 Tables and R2 Data Catalog
- Deploy anywhere — Local binary, Cloudflare Workers (WASM), AWS Lambda
Deploy to the Cloud
Once you've kicked the tires locally, deploy to serverless:
Cloudflare Workers + R2 or R2 Data Catalog with the wrangler CLI:

```sh
# Generates config for Workers
# ...
# Deploy to Cloudflare
wrangler deploy
```
AWS Lambda + S3 or S3 Tables with the AWS CLI (the deploy line below is a sketch; substitute the name of the generated template):

```sh
# Generates a CloudFormation template for Lambda + S3
# ...
# Deploy with CloudFormation
aws cloudformation deploy --template-file <template>.yaml --stack-name otlp2parquet --capabilities CAPABILITY_IAM
```
Send a log (requires IAM SigV4 auth by default):
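A curl sketch of the signed request (needs curl 7.75+ for `--aws-sigv4`; substitute your region and the function URL from the stack outputs):

```sh
curl "https://<function-url>/v1/logs" \
  --aws-sigv4 "aws:amz:us-east-1:lambda" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d @log.json   # the same OTLP JSON payload as in Quick Start
```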
Both commands walk you through setup and generate the config files you need.
Supported Signals
Logs, Metrics, Traces via OTLP/HTTP (protobuf or JSON, gzip compression supported). No gRPC support for now.
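For example, sending a gzip-compressed JSON payload with curl (assuming the Quick Start payload is saved as `log.json`):

```sh
gzip -c log.json | curl -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -H "Content-Encoding: gzip" \
  --data-binary @-
```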
Stable Surface (v1)
- OTLP/HTTP endpoints: `/v1/logs`, `/v1/metrics`, `/v1/traces` (protobuf or JSON; gzip supported)
- Partition layout: `logs/{service}/year=.../hour=.../{ts}-{uuid}.parquet`, `metrics/{type}/{service}/...`, `traces/{service}/...`
- Storage: filesystem, S3, or R2 with optional Iceberg catalog
- Schemas: ClickHouse-compatible, PascalCase columns; five metric schemas (Gauge, Sum, Histogram, ExponentialHistogram, Summary)
- Error model: HTTP 400 for invalid or oversized input; 5xx for conversion or storage failures
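Because the partition layout above uses Hive-style `key=value` segments, engines can prune partitions at query time. A DuckDB sketch (the `data/` prefix is an assumption):

```sh
# hive_partitioning exposes the year=/hour= path segments as queryable columns
duckdb -c "SELECT count(*) FROM read_parquet('data/logs/**/*.parquet', hive_partitioning = true) WHERE year = 2024"
```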
Best-effort catalog commits: Parquet files are always written to storage first. If you enable an Iceberg catalog (S3 Tables, R2 Data Catalog), catalog registration happens after the write. If catalog registration fails (network error, conflict), the data is still safely stored and a warning is logged—your data is never lost due to catalog issues.
Future work (contributions welcome)
- OpenTelemetry Arrow alignment
- Additional platforms: Azure Functions; Kubernetes manifests
- Iceberg ergonomics: queued commits (SQS/Queues), richer partition configs
Learn More
- Batching: Serverless deployments write one file per request, so many tiny requests mean many tiny Parquet files, which hurt query performance and drive up object-store request costs. Use an OTel Collector upstream to batch (a config sketch follows this list), or enable S3 Tables / R2 Data Catalog for automatic compaction.
- Schema: Uses ClickHouse-compatible column names. Will converge with OTel Arrow (OTAP) when it stabilizes.
- Status: Functional but evolving. API may change.
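For the batching note above, a minimal sketch of an upstream Collector setup (assumes the stock `otelcol` binary; the ports and batch sizes are illustrative):

```sh
# receive OTLP, batch, and forward to otlp2parquet
cat > otelcol.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:14318   # the collector's own listen port
exporters:
  otlphttp:
    endpoint: http://localhost:4318   # where otlp2parquet listens
processors:
  batch:
    send_batch_size: 8192
    timeout: 10s
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
EOF
otelcol --config otelcol.yaml
```

Point your SDKs at the collector (port 14318 here) instead of at otlp2parquet directly.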