otlp-arrow-library 0.6.4

Cross-platform Rust library for receiving OTLP messages via gRPC and writing to Arrow IPC files
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
# Architecture Documentation

**Project**: OTLP Rust Service  
**Version**: 0.5.0  
**Last Updated**: 2025-01-27

## Table of Contents

1. [System Overview]#system-overview
2. [High-Level Architecture]#high-level-architecture
3. [Data Flow]#data-flow
4. [Component Interactions]#component-interactions
5. [Key Design Decisions]#key-design-decisions
6. [Technology Stack]#technology-stack
7. [Deployment Architecture]#deployment-architecture
8. [Extension Points]#extension-points

---

## System Overview

The OTLP Rust Service is a cross-platform library and standalone service for receiving OpenTelemetry Protocol (OTLP) messages via gRPC and writing them to local files in Arrow IPC Streaming format. The system supports both embedded library usage and standalone service mode, with optional remote forwarding capabilities.

### Core Capabilities

- **Dual Protocol Support**: Simultaneous support for gRPC Protobuf (standard OTLP) and gRPC Arrow Flight (OTAP) on different ports
- **Arrow IPC Storage**: Efficient storage of telemetry data in Arrow IPC Streaming format with automatic file rotation
- **Batch Writing**: Configurable write intervals for efficient disk I/O
- **File Cleanup**: Automatic cleanup of old trace and metric files based on configurable retention intervals
- **Public API**: Embedded library mode with programmatic API for Rust and Python
- **Optional Forwarding**: Forward messages to remote OTLP endpoints with automatic format conversion
- **Circuit Breaker**: Automatic failure handling with circuit breaker pattern for forwarding
- **OpenTelemetry SDK Integration**: Built-in `PushMetricExporter` and `SpanExporter` implementations

---

## High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                      OTLP Rust Service                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────┐         ┌──────────────┐                      │
│  │   gRPC       │         │   gRPC      │                      │
│  │  Protobuf    │         │ Arrow Flight│                      │
│  │  Server      │         │   Server     │                      │
│  │  (Port 4317) │         │  (Port 4318)│                      │
│  └──────┬───────┘         └──────┬───────┘                      │
│         │                        │                              │
│         └────────────┬───────────┘                              │
│                      │                                          │
│         ┌────────────▼──────────┐                               │
│         │   OtlpFileExporter    │                               │
│         │  (Format Conversion)   │                               │
│         └────────────┬──────────┘                               │
│                      │                                          │
│         ┌────────────▼──────────┐                               │
│         │    BatchBuffer        │                               │
│         │  (In-Memory Buffer)   │                               │
│         └────────────┬──────────┘                               │
│                      │                                          │
│         ┌────────────▼──────────┐                               │
│         │   Arrow IPC Writer    │                               │
│         │  (File Storage)       │                               │
│         └──────────────────────┘                               │
│                                                                  │
│  ┌──────────────────────────────────────────────┐              │
│  │         Optional Remote Forwarding            │              │
│  │  ┌──────────────────────────────────────┐   │              │
│  │  │   OtlpForwarder                       │   │              │
│  │  │  ┌────────────────────────────────┐   │   │              │
│  │  │  │  Circuit Breaker               │   │   │              │
│  │  │  │  (Failure Handling)            │   │   │              │
│  │  │  └────────────────────────────────┘   │   │              │
│  │  │  ┌────────────────────────────────┐   │   │              │
│  │  │  │  FormatConverter              │   │   │              │
│  │  │  │  (Protobuf ↔ Arrow Flight)    │   │   │              │
│  │  │  └────────────────────────────────┘   │   │              │
│  │  └──────────────────────────────────────┘   │              │
│  └──────────────────────────────────────────────┘              │
│                                                                  │
│  ┌──────────────────────────────────────────────┐              │
│  │         Public API (Embedded Mode)          │              │
│  │  ┌──────────────────────────────────────┐   │              │
│  │  │   OtlpLibrary                        │   │              │
│  │  │  (Rust & Python Bindings)            │   │              │
│  │  └──────────────────────────────────────┘   │              │
│  └──────────────────────────────────────────────┘              │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

### Architecture Layers

1. **Ingestion Layer**: gRPC servers (Protobuf and Arrow Flight) receive OTLP messages
2. **Processing Layer**: Format conversion, buffering, and batching
3. **Storage Layer**: Arrow IPC file writing with rotation and cleanup
4. **Forwarding Layer** (Optional): Remote forwarding with circuit breaker protection
5. **API Layer**: Public API for embedded usage (Rust and Python)

---

## Data Flow

### Trace Data Flow

```
Client SDK/Application
    ├─→ gRPC Protobuf (Port 4317)
    │       │
    │       └─→ TraceServiceImpl
    │               │
    │               └─→ OtlpFileExporter.export_trace()
    │                       │
    │                       └─→ BatchBuffer.add_trace()
    │                               │
    │                               └─→ [Buffered in Memory]
    │                                       │
    │                                       └─→ [Periodic Write]
    │                                               │
    │                                               └─→ Arrow IPC Writer
    │                                                       │
    │                                                       └─→ File: traces/*.arrow
    ├─→ gRPC Arrow Flight (Port 4318)
    │       │
    │       └─→ OtlpArrowFlightServer
    │               │
    │               └─→ OtlpFileExporter.export_trace()
    │                       │
    │                       └─→ [Same as Protobuf path]
    └─→ Public API (Rust/Python)
            └─→ OtlpLibrary.export_trace()
                    └─→ [Same as gRPC path]
```

### Metrics Data Flow

```
Client SDK/Application
    ├─→ gRPC Protobuf (Port 4317)
    │       │
    │       └─→ MetricsServiceImpl
    │               │
    │               └─→ OtlpFileExporter.export_metrics()
    │                       │
    │                       ├─→ [Convert Protobuf → Arrow RecordBatch]
    │                       │
    │                       └─→ BatchBuffer.add_metrics_protobuf()
    │                               │
    │                               └─→ [Buffered in Memory]
    │                                       │
    │                                       └─→ [Periodic Write]
    │                                               │
    │                                               └─→ Arrow IPC Writer
    │                                                       │
    │                                                       └─→ File: metrics/*.arrow
    ├─→ gRPC Arrow Flight (Port 4318)
    │       │
    │       └─→ OtlpArrowFlightServer
    │               │
    │               └─→ OtlpFileExporter.export_metrics()
    │                       │
    │                       └─→ [Same as Protobuf path]
    └─→ Public API (Rust/Python)
            └─→ OtlpLibrary.export_metrics()
                    └─→ [Same as gRPC path]
```

### Forwarding Data Flow (Optional)

```
BatchBuffer (on write)
    └─→ OtlpForwarder.forward_traces() / forward_metrics()
            ├─→ Circuit Breaker Check
            │       │
            │       ├─→ [Open] → Reject immediately
            │       ├─→ [HalfOpen] → Allow one test request
            │       └─→ [Closed] → Proceed normally
            └─→ FormatConverter
                    ├─→ [If target is Protobuf] → Convert Arrow → Protobuf
                    └─→ [If target is Arrow Flight] → Keep Arrow format
                            └─→ HTTP/gRPC Client
                                    └─→ Remote OTLP Endpoint
```

---

## Component Interactions

### Core Components

#### 1. OtlpGrpcServer (`src/otlp/server.rs`)
- **Purpose**: gRPC server for Protobuf OTLP messages
- **Responsibilities**:
  - Receives `ExportTraceServiceRequest` and `ExportMetricsServiceRequest`
  - Delegates to `OtlpFileExporter` for processing
  - Handles gRPC request/response lifecycle
- **Dependencies**: `OtlpFileExporter`, `tonic` gRPC framework

#### 2. OtlpArrowFlightServer (`src/otlp/server_arrow.rs`)
- **Purpose**: gRPC server for Arrow Flight IPC (OTAP) messages
- **Responsibilities**:
  - Receives Arrow Flight `FlightData` streams
  - Extracts Arrow RecordBatches from Flight messages
  - Delegates to `OtlpFileExporter` for processing
- **Dependencies**: `OtlpFileExporter`, `arrow-flight` crate

#### 3. OtlpFileExporter (`src/otlp/exporter.rs`)
- **Purpose**: Central exporter for all OTLP data
- **Responsibilities**:
  - Format conversion (Protobuf ↔ Arrow)
  - Buffering via `BatchBuffer`
  - Coordinating writes and cleanup
  - Optional forwarding via `OtlpForwarder`
- **Dependencies**: `BatchBuffer`, `OtlpForwarder` (optional), format converters

#### 4. BatchBuffer (`src/otlp/batch_writer.rs`)
- **Purpose**: In-memory buffer for batching OTLP messages
- **Responsibilities**:
  - Thread-safe buffering of traces and metrics
  - Capacity limit enforcement
  - Providing batches for periodic writes
- **Concurrency**: Uses `Arc<Mutex<Vec<T>>>` for thread-safe access
- **Dependencies**: None (core data structure)

#### 5. OtlpForwarder (`src/otlp/forwarder.rs`)
- **Purpose**: Remote forwarding with failure handling
- **Responsibilities**:
  - Format conversion for target protocol
  - Circuit breaker pattern for failure handling
  - HTTP/gRPC client management
  - Authentication handling
- **Dependencies**: `FormatConverter`, `reqwest` HTTP client, circuit breaker logic

#### 6. OtlpLibrary (`src/api/public.rs`)
- **Purpose**: Public API for embedded usage
- **Responsibilities**:
  - High-level API for Rust and Python applications
  - Background task management (writes, cleanup)
  - Configuration management
  - Dashboard server coordination
- **Dependencies**: `OtlpFileExporter`, `BatchBuffer`, `DashboardServer`

### Component Communication Patterns

1. **Request-Response**: gRPC servers receive requests and delegate to exporters
2. **Async Buffering**: Messages are buffered asynchronously, written periodically
3. **Event-Driven**: Background tasks trigger writes and cleanup based on intervals
4. **Circuit Breaker**: Forwarding uses circuit breaker pattern to handle failures gracefully

---

## Key Design Decisions

### 1. Dual Protocol Support

**Decision**: Support both gRPC Protobuf and Arrow Flight simultaneously on different ports.

**Rationale**:
- Protobuf is the standard OTLP format, ensuring compatibility
- Arrow Flight provides better performance for high-throughput scenarios
- Different ports allow clients to choose the appropriate protocol

**Implementation**: Separate server implementations (`OtlpGrpcServer` and `OtlpArrowFlightServer`) that share the same `OtlpFileExporter`.

### 2. Batch Buffering

**Decision**: Buffer messages in memory and write in batches at configurable intervals.

**Rationale**:
- Reduces disk I/O overhead
- Improves throughput for high-volume scenarios
- Configurable intervals allow tuning for different use cases

**Implementation**: `BatchBuffer` uses `Arc<Mutex<Vec<T>>>` for thread-safe buffering with capacity limits.

### 3. Arrow IPC Storage Format

**Decision**: Store telemetry data in Arrow IPC Streaming format.

**Rationale**:
- Efficient columnar format for analytics
- Cross-language compatibility
- Efficient compression and querying capabilities

**Implementation**: All data is converted to Arrow RecordBatches and written as Arrow IPC Streaming files.

### 4. Circuit Breaker for Forwarding

**Decision**: Implement circuit breaker pattern for remote forwarding failures.

**Rationale**:
- Prevents cascading failures when remote endpoints are down
- Reduces unnecessary network traffic during outages
- Automatic recovery when remote endpoint recovers

**Implementation**: Three-state circuit breaker (Closed, Open, HalfOpen) with configurable thresholds and timeouts.

### 5. Format Conversion

**Decision**: Automatic format conversion between Protobuf and Arrow Flight.

**Rationale**:
- Allows clients to use either protocol regardless of storage format
- Enables forwarding to endpoints with different protocol requirements
- Maintains data fidelity during conversion

**Implementation**: `FormatConverter` handles bidirectional conversion between Protobuf and Arrow formats.

### 6. Thread-Safe Concurrency Model

**Decision**: Use `Arc<Mutex<T>>` for shared state and async/await for concurrency.

**Rationale**:
- Rust's ownership system prevents data races
- Async/await provides efficient I/O concurrency
- Mutex ensures thread-safe access to shared buffers

**Implementation**: All shared state (buffers, circuit breaker state) uses `Arc<Mutex<T>>` or `Arc<RwLock<T>>` where appropriate.

### 7. Embedded vs Standalone Modes

**Decision**: Support both embedded library usage and standalone service mode.

**Rationale**:
- Embedded mode allows integration into existing applications
- Standalone mode provides a ready-to-use service
- Same core components support both modes

**Implementation**: `OtlpLibrary` provides public API for embedded usage, while `main.rs` provides standalone service.

---

## Technology Stack

### Core Dependencies

- **Rust**: Edition 2024, latest stable version
- **Tokio**: Async runtime (`tokio` 1.35+)
- **OpenTelemetry**: OTLP protocol support (`opentelemetry` 0.31, `opentelemetry-sdk` 0.31)
- **Arrow**: Columnar data format (`arrow` 57, `arrow-flight` 57)
- **gRPC**: `tonic` 0.14 (Protobuf), `arrow-flight` (Arrow Flight)
- **HTTP Client**: `reqwest` 0.11 (for forwarding)
- **Serialization**: `serde` 1.0, `prost` 0.14 (Protobuf)

### Python Integration

- **PyO3**: Python bindings (`pyo3` 0.20)
- **Maturin**: Python package building

### Testing & Development

- **Testing**: `tokio-test` 0.4 (async testing utilities)
- **Benchmarking**: `criterion` 0.5 (performance benchmarks)
- **Mocking**: `wiremock` 0.6 (HTTP mocking for tests)

### Security

- **secrecy**: Secure string types for credential storage
- **url**: Comprehensive URL parsing and validation

---

## Deployment Architecture

### Standalone Service Mode

```
┌─────────────────────────────────────┐
│   otlp-arrow-service (Binary)       │
│                                     │
│  ┌───────────────────────────────┐  │
│  │  Configuration Loader         │  │
│  │  (YAML, Environment, API)    │  │
│  └──────────────┬────────────────┘  │
│                 │                     │
│  ┌──────────────▼────────────────┐  │
│  │  OtlpLibrary                  │  │
│  │  ┌──────────────────────────┐ │  │
│  │  │  gRPC Servers           │ │  │
│  │  │  (Protobuf + Arrow)      │ │  │
│  │  └──────────────────────────┘ │  │
│  │  ┌──────────────────────────┐ │  │
│  │  │  Background Tasks        │ │  │
│  │  │  (Writes, Cleanup)      │ │  │
│  │  └──────────────────────────┘ │  │
│  └────────────────────────────────┘  │
│                 │                     │
│  ┌──────────────▼────────────────┐  │
│  │  File System                  │  │
│  │  ./output_dir/otlp/          │  │
│  │    ├── traces/*.arrow        │  │
│  │    └── metrics/*.arrow       │  │
│  └────────────────────────────────┘  │
└─────────────────────────────────────┘
```

### Embedded Library Mode

```
┌─────────────────────────────────────┐
│   Application (Rust/Python)         │
│                                     │
│  ┌───────────────────────────────┐ │
│  │  OtlpLibrary (Public API)     │ │
│  │  - export_trace()              │ │
│  │  - export_metrics()            │ │
│  │  - flush()                     │ │
│  └──────────────┬──────────────────┘ │
│                 │                     │
│  ┌──────────────▼────────────────┐  │
│  │  Same Core Components         │  │
│  │  (OtlpFileExporter, etc.)     │  │
│  └────────────────────────────────┘  │
└─────────────────────────────────────┘
```

### Runtime Characteristics

- **Concurrency**: Async/await with Tokio runtime
- **Memory**: Bounded buffers with configurable capacity limits
- **Disk I/O**: Batched writes at configurable intervals
- **Network**: Async HTTP/gRPC clients for forwarding
- **Error Handling**: Circuit breaker pattern for forwarding failures

---

## Extension Points

### 1. Custom Exporters

**Location**: `src/otlp/exporter.rs`

The `OtlpFileExporter` can be extended to support additional export formats or destinations. The exporter interface is designed to be composable.

### 2. Format Converters

**Location**: `src/otlp/converter.rs`

New format conversions can be added by extending the `FormatConverter` trait or adding new conversion methods.

### 3. Authentication Methods

**Location**: `src/config/types.rs` (`AuthConfig`)

New authentication methods can be added by extending the `AuthConfig` enum and updating the forwarding logic.

### 4. Storage Backends

**Location**: `src/otlp/exporter.rs` (`OtlpFileExporter`)

The storage layer can be extended to support additional backends (e.g., object storage, databases) by implementing a storage trait.

### 5. Circuit Breaker Strategies

**Location**: `src/otlp/forwarder.rs` (`CircuitBreaker`)

Different circuit breaker strategies (e.g., sliding window, adaptive thresholds) can be implemented by extending the `CircuitBreaker` struct.

### 6. Dashboard Extensions

**Location**: `dashboard/` and `src/dashboard/`

The dashboard can be extended with new visualizations, filters, or data sources.

---

## Related Documentation

- [README.md]../README.md: User-facing documentation and quick start guide
- [docs/metrics-flow-diagram.md]metrics-flow-diagram.md: Detailed metrics import/export flow
- [specs/]../specs/: Feature specifications and implementation plans

---

## References

- [OpenTelemetry Protocol Specification]https://github.com/open-telemetry/opentelemetry-specification
- [Apache Arrow Documentation]https://arrow.apache.org/docs/
- [Circuit Breaker Pattern]https://martinfowler.com/bliki/CircuitBreaker.html
- [Tokio Async Runtime]https://tokio.rs/