pub struct CollectArrowBridge { /* private fields */ }
Auto-flushing bridge from row-level data collection to Arrow batches.
Rows are pushed via push_row. When the internal builder
reaches batch_size, it is automatically flushed to the internal batch
buffer. Call take_batches to retrieve all completed
batches (including any partial data), or drain_completed
to retrieve only full batches without flushing partial data.
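The difference between take_batches and drain_completed can be illustrated with a toy model. The sketch below is not the crate's implementation: ToyBridge is a hypothetical stand-in in which a row is a plain i64 and a "batch" is a Vec<i64>, so it runs without Arrow. It also assumes batch_size is greater than zero.

```rust
/// Hypothetical stand-in for the bridge (illustration only).
/// A row is an i64 and a batch is a Vec<i64>; no Arrow types are involved.
struct ToyBridge {
    batch_size: usize,
    pending: Vec<i64>,        // rows in the builder, not yet flushed
    completed: Vec<Vec<i64>>, // flushed batches waiting to be retrieved
}

impl ToyBridge {
    /// Assumes `batch_size > 0`.
    fn new(batch_size: usize) -> Self {
        ToyBridge { batch_size, pending: Vec::new(), completed: Vec::new() }
    }

    /// Append a row and auto-flush once the builder holds `batch_size` rows.
    fn push_row(&mut self, row: i64) {
        self.pending.push(row);
        if self.pending.len() >= self.batch_size {
            self.flush();
        }
    }

    /// Move any pending rows into a new completed batch.
    fn flush(&mut self) {
        if !self.pending.is_empty() {
            self.completed.push(std::mem::take(&mut self.pending));
        }
    }

    /// Flush partial data, then hand over every batch.
    fn take_batches(&mut self) -> Vec<Vec<i64>> {
        self.flush();
        std::mem::take(&mut self.completed)
    }

    /// Hand over only the batches that were already flushed; pending rows stay put.
    fn drain_completed(&mut self) -> Vec<Vec<i64>> {
        std::mem::take(&mut self.completed)
    }

    fn pending_rows(&self) -> usize {
        self.pending.len()
    }
}
```

In this model, pushing five rows with a batch size of two leaves two full batches and one pending row. drain_completed returns the two full batches and keeps the pending row, while a later take_batches returns that row as a partial batch.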
Implementations
impl CollectArrowBridge
pub fn new(schema: SchemaRef, batch_size: usize) -> Result<Self, BridgeError>
Create a new bridge.
Arguments
schema - Arrow schema for the batches.
batch_size - number of rows per batch before auto-flush.
pub fn push_row(&mut self, values: &[ArrowValue]) -> Result<(), BridgeError>
Append a row. Auto-flushes to a new batch when batch_size is reached.
pub fn flush(&mut self) -> Result<(), BridgeError>
Flush any partial data in the builder to a batch.
pub fn take_batches(&mut self) -> Result<Vec<RecordBatch>, BridgeError>
Flush and return all accumulated batches (including partial data).
pub fn drain_completed(&mut self) -> Vec<RecordBatch>
Take only batches that have already been auto-flushed, without flushing partial data.
pub fn restore_completed(&mut self, batches: Vec<RecordBatch>)
Restore completed batches back into the bridge in their original order.
This is used when a downstream delivery stage fails after batches have already been drained from the bridge but before they were durably handed off to the next stage.
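The drain-then-restore pattern described above might look roughly like the following. This is a sketch under stated assumptions, not the crate's code: ToyBuffer, deliver, and the send callback are hypothetical names, batches are modeled as Vec<i64>, and restored batches are assumed to go in front of anything completed after the drain.

```rust
/// Hypothetical stand-in that models only the completed-batch buffer.
struct ToyBuffer {
    completed: Vec<Vec<i64>>,
}

impl ToyBuffer {
    /// Hand over all completed batches, leaving the buffer empty.
    fn drain_completed(&mut self) -> Vec<Vec<i64>> {
        std::mem::take(&mut self.completed)
    }

    /// Put drained batches back. Assumption: they are placed ahead of any
    /// batches completed since the drain, which keeps the original order.
    fn restore_completed(&mut self, mut batches: Vec<Vec<i64>>) {
        batches.append(&mut self.completed);
        self.completed = batches;
    }
}

/// Hypothetical delivery step: drain, try to send, restore on failure.
/// Returns the number of batches delivered.
fn deliver<F>(buf: &mut ToyBuffer, mut send: F) -> Result<usize, String>
where
    F: FnMut(&[Vec<i64>]) -> Result<(), String>,
{
    let drained = buf.drain_completed();
    match send(&drained[..]) {
        Ok(()) => Ok(drained.len()),
        Err(e) => {
            // Delivery failed before a durable hand-off: nothing may be lost.
            buf.restore_completed(drained);
            Err(e)
        }
    }
}
```

With this shape, a failed send leaves the buffer exactly as it was before the drain, so the caller can retry deliver later without tracking the drained batches itself.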
pub fn batches(&self) -> &[RecordBatch]
Borrow the accumulated (auto-flushed) batches.
pub fn pending_rows(&self) -> usize
Number of rows currently buffered but not yet flushed.
pub fn completed_batches(&self) -> usize
Number of completed batches waiting to be drained.
pub fn completed_rows(&self) -> usize
Number of rows contained in completed batches waiting to be drained.
pub fn batch_size(&self) -> usize
Configured auto-flush batch size.
pub fn metrics(&self) -> BridgeMetrics
Snapshot of current bridge metrics.
pub fn total_rows(&self) -> usize
Total row count (flushed + pending).
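As a worked example of how the row counters relate, the helper below computes the values one would expect after pushing a given number of rows into a fresh bridge. It is only a sketch: Counters and expected_counters are hypothetical names, and the arithmetic assumes that batch_size is greater than zero, that auto-flush fires exactly when the builder reaches batch_size, and that no drain or manual flush has happened yet.

```rust
/// Hypothetical snapshot of the row counters (not the crate's BridgeMetrics).
#[derive(Debug, PartialEq)]
struct Counters {
    completed_batches: usize,
    completed_rows: usize,
    pending_rows: usize,
    total_rows: usize,
}

/// Expected counters after pushing `rows` rows into a fresh bridge,
/// before any drain or manual flush. Assumes `batch_size > 0`.
fn expected_counters(rows: usize, batch_size: usize) -> Counters {
    // Every full group of `batch_size` rows has been auto-flushed.
    let completed_batches = rows / batch_size;
    let completed_rows = completed_batches * batch_size;
    // The remainder is still sitting in the builder.
    let pending_rows = rows % batch_size;
    Counters {
        completed_batches,
        completed_rows,
        pending_rows,
        total_rows: completed_rows + pending_rows,
    }
}
```

For instance, 7 rows with a batch size of 3 would give 2 completed batches holding 6 rows, 1 pending row, and a total of 7.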