1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
//! # wf-connector-api
//!
//! Minimal Arrow-native connector API for warp-fusion.
//!
//! ## Design
//!
//! `wp-connector-api` sources produce `SourceEvent { payload: RawData }`,
//! designed for downstream parse pipelines. CEP engines like warp-fusion
//! operate on Arrow `RecordBatch` directly.
//!
//! `wf-connector-api` fills this gap — one trait for sources, extensible
//! to sinks in the future (e.g. `BatchSink` for Arrow-native output).
//!
//! ## Relationship with `wp-connector-api`
//!
//! | | wp-connector-api | wf-connector-api |
//! |---|---|---|
//! | Source data | `SourceEvent { payload: RawData }` | `RecordBatch` (columnar) |
//! | Consumer | parse pipeline (WPL) | CEP engine (warp-fusion) |
//! | Error model | `SourceResult<T>` (orion-error) | `SourceResult<T>` (orion-error) |
//! | Lifecycle | `start()` / `receive()` / `close()` | `start()` / `receive_batch()` / `close()` |
//!
//! `wp-connectors` (the implementation crate) can implement BOTH traits
//! for the same connector (Kafka / File / TCP), sharing connection logic.
use RecordBatch;
use async_trait;
use ToStructError;
use ;
use Error as StdError;
// -- Error -------------------------------------------------------------------
/// Connector error reason.
///
/// All leaf variants carry detail via `err_detail()`. `SourceError` wraps
/// each variant with a detail string and optional source error.
pub type SourceError = ;
pub type SourceResult<T> = ;
// -- Source ------------------------------------------------------------------
/// A batch-oriented data source that produces Arrow [`RecordBatch`]es.
///
/// # Lifecycle
///
/// 1. `start()` — initialize (connect, subscribe, bind)
/// 2. `receive_batch()` — pull data in a loop
/// 3. `close()` — release resources (unsubscribe, close connections)
///
/// `close()` must be idempotent — safe to call multiple times, even before `start()`.
///
/// # Empty vs EOF
///
/// - Return `Ok(vec![])` when no data is currently available (caller should retry).
/// - Return `Err(SourceReason::EOF.into())` when the stream has ended.
// -- Sink (TBD) --------------------------------------------------------------
// Future extension:
//
// ```ignore
// #[async_trait]
// pub trait BatchSink: Send {
// async fn start(&mut self) -> SourceResult<()> { Ok(()) }
// async fn send_batch(&mut self, batch: &RecordBatch) -> SourceResult<()>;
// async fn flush(&mut self) -> SourceResult<()>;
// async fn close(&mut self) -> SourceResult<()> { Ok(()) }
// fn identifier(&self) -> &str;
// }
// ```