Trait StreamingWriter 

pub trait StreamingWriter: Send {
    // Required methods
    fn write_header(
        &mut self,
        metadata: &QueryMetadata,
    ) -> Result<(), OutputError>;
    fn write_chunk(&mut self, rows: &[QueryRow]) -> Result<usize, OutputError>;
    fn finalize(&mut self) -> Result<(), OutputError>;
    fn rows_written(&self) -> u64;

    // Provided method
    fn bytes_written(&self) -> Option<u64> { ... }
}

Trait for writers that support streaming/chunked output

Unlike batch writers that receive a complete QueryResult, streaming writers process data incrementally via write_header(), write_chunk(), and finalize(). This enables processing of arbitrarily large result sets within the 128MB memory budget.

§Memory Budget

To stay within the 128MB target:

  • Chunk sizes should typically be 5,000-10,000 rows
  • For rows with large blob/text columns, use smaller chunks (1,000-5,000 rows)
  • Parquet writers buffer rows into row groups; the default row group size is 10,000 rows
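
The guidance above can be turned into a small calculation. The following is a minimal sketch, not part of this crate: `chunk_rows_for` and its `chunk_budget_bytes` parameter are hypothetical, and the caller is assumed to know (or estimate) the average row size and to decide what share of the 128MB target a single chunk may occupy.

```rust
/// Hypothetical helper (not part of the crate): pick a chunk size in rows
/// from an estimated average row size.
///
/// `chunk_budget_bytes` is the share of the overall memory target that one
/// in-flight chunk may use; how large that share should be is an assumption
/// left to the caller.
pub fn chunk_rows_for(avg_row_bytes: usize, chunk_budget_bytes: usize) -> usize {
    // Bounds taken from the documented ranges: 1,000 to 10,000 rows.
    const MIN_ROWS: usize = 1_000;
    const MAX_ROWS: usize = 10_000;

    // Rows that fit in the budget; `max(1)` avoids dividing by zero.
    let fit = chunk_budget_bytes / avg_row_bytes.max(1);
    fit.clamp(MIN_ROWS, MAX_ROWS)
}
```

With a 16 MiB per-chunk budget, 200-byte rows hit the 10,000-row ceiling, while 8 KiB rows yield 2,048 rows per chunk. Note that the 1,000-row floor can exceed the budget for very wide rows, so a caller with multi-megabyte blobs may want to drop the floor entirely.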

§Contract

  1. write_header() MUST be called exactly once before any write_chunk() calls
  2. write_chunk() MAY be called zero or more times
  3. finalize() MUST be called exactly once to complete the output
  4. After finalize(), no further calls are allowed
  5. Implementors SHOULD return errors rather than panic on contract violations
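
One way an implementor might satisfy rule 5 is to track the call sequence in a small state enum and reject out-of-order calls with an error. The sketch below is illustrative only: `VecLineWriter`, the `State` enum, and the shapes given to `QueryMetadata`, `QueryRow`, and `OutputError` are stand-ins invented for this example, and the methods are inherent rather than a trait impl so the block compiles on its own.

```rust
// Stand-ins for the crate's real types (assumed shapes, for illustration only).
pub struct QueryMetadata {
    pub columns: Vec<String>,
}
pub struct QueryRow(pub Vec<String>);

#[derive(Debug, PartialEq)]
pub enum OutputError {
    Contract(&'static str),
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum State {
    New,
    HeaderWritten,
    Finalized,
}

/// Hypothetical writer that collects comma-joined lines in memory.
/// Its methods mirror the StreamingWriter signatures.
pub struct VecLineWriter {
    state: State,
    lines: Vec<String>,
    rows: u64,
}

impl VecLineWriter {
    pub fn new() -> Self {
        Self { state: State::New, lines: Vec::new(), rows: 0 }
    }

    pub fn write_header(&mut self, metadata: &QueryMetadata) -> Result<(), OutputError> {
        // Rule 1: exactly once, before anything else.
        if self.state != State::New {
            return Err(OutputError::Contract("write_header() must be the first call, exactly once"));
        }
        self.lines.push(metadata.columns.join(","));
        self.state = State::HeaderWritten;
        Ok(())
    }

    pub fn write_chunk(&mut self, rows: &[QueryRow]) -> Result<usize, OutputError> {
        // Rules 1 and 4: only valid between write_header() and finalize().
        if self.state != State::HeaderWritten {
            return Err(OutputError::Contract("write_chunk() requires a header and no prior finalize()"));
        }
        for row in rows {
            self.lines.push(row.0.join(","));
        }
        self.rows += rows.len() as u64;
        Ok(rows.len())
    }

    pub fn finalize(&mut self) -> Result<(), OutputError> {
        // Rules 3 and 4: exactly once, after the header.
        if self.state != State::HeaderWritten {
            return Err(OutputError::Contract("finalize() requires a header and may run only once"));
        }
        self.state = State::Finalized;
        Ok(())
    }

    pub fn rows_written(&self) -> u64 {
        self.rows
    }

    pub fn lines(&self) -> &[String] {
        &self.lines
    }
}
```

A failed call leaves the state untouched, so a caller that receives a contract error can still recover by issuing the correct call next.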

§Troubleshooting

If you encounter OOM errors:

  1. Pass smaller chunks to write_chunk() (no more than 10,000 rows per call)
  2. For Parquet, reduce row_group_size (default: 10,000 rows)
  3. Check for large blob/text columns that inflate row size

§Example

let mut writer = StreamingCSVWriter::new(file);
writer.write_header(&metadata)?;

for chunk in result_stream.chunks(10_000) {
    writer.write_chunk(&chunk)?;
}

writer.finalize()?;

Required Methods§

fn write_header(&mut self, metadata: &QueryMetadata) -> Result<(), OutputError>

Initialize writer with column metadata (write header if applicable)

Called once before any data is written. For CSV, this writes the header row. For Parquet, this initializes the Arrow schema.

fn write_chunk(&mut self, rows: &[QueryRow]) -> Result<usize, OutputError>

Write a chunk of rows (called multiple times during streaming)

Returns the number of rows actually written by this call. This may differ from rows.len() if the writer buffers internally (e.g., Parquet row groups).
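
To illustrate why the returned count can differ from the input length, here is a minimal sketch of a buffering writer. `RowGroupBuffer` is hypothetical and uses `u64` as a stand-in for a row; it is not the crate's Parquet writer and only models the counting behaviour, assuming that rows count as written once their row group is flushed.

```rust
/// Hypothetical row-group buffer; `u64` stands in for a row.
pub struct RowGroupBuffer {
    group_size: usize,
    pending: Vec<u64>,
    flushed: u64,
}

impl RowGroupBuffer {
    pub fn new(group_size: usize) -> Self {
        assert!(group_size > 0, "group_size must be non-zero");
        Self { group_size, pending: Vec::new(), flushed: 0 }
    }

    /// Buffers `rows` and flushes every complete group.
    /// Returns the number of rows flushed by this call, which can be
    /// smaller or larger than `rows.len()`.
    pub fn write_chunk(&mut self, rows: &[u64]) -> usize {
        self.pending.extend_from_slice(rows);
        let mut written = 0;
        while self.pending.len() >= self.group_size {
            // A real writer would encode this group and write it out here.
            let _group: Vec<u64> = self.pending.drain(..self.group_size).collect();
            written += self.group_size;
        }
        self.flushed += written as u64;
        written
    }

    /// Flushes the final, possibly partial, group and returns its size.
    pub fn finalize(&mut self) -> usize {
        let rest = self.pending.len();
        self.pending.clear();
        self.flushed += rest as u64;
        rest
    }

    pub fn rows_written(&self) -> u64 {
        self.flushed
    }
}
```

With a group size of 10,000, a first chunk of 6,000 rows returns 0 and a second chunk of 6,000 rows returns 10,000; the remaining 2,000 rows are only written during finalize(). Callers therefore should not treat a return value of 0 as a failure.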

fn finalize(&mut self) -> Result<(), OutputError>

Finalize output (flush buffers, write footer, close resources)

Called once after all data has been written. For Parquet, this writes the final row group and file footer.

fn rows_written(&self) -> u64

Get count of rows written so far

Intended for progress reporting; not yet integrated into export functions.

Provided Methods§

fn bytes_written(&self) -> Option<u64>

Get bytes written so far (if trackable)

Intended for progress reporting; not yet integrated into export functions.

Dyn Compatibility§

This trait is dyn compatible.
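
Dyn compatibility means the output format can be chosen at runtime and driven through `&mut dyn StreamingWriter` or `Box<dyn StreamingWriter>`. The sketch below is self-contained and therefore redeclares the trait locally with unit-struct stand-ins for `QueryMetadata`, `QueryRow`, and `OutputError`. `CountingWriter`, `export_all`, and the progress callback are hypothetical, and the `None` default for `bytes_written()` is an assumption, since the real default body is not shown on this page.

```rust
// Stand-in types (assumed shapes, for illustration only).
pub struct QueryMetadata;
pub struct QueryRow;
#[derive(Debug)]
pub struct OutputError;

// Local copy of the trait signature so the sketch compiles on its own.
pub trait StreamingWriter: Send {
    fn write_header(&mut self, metadata: &QueryMetadata) -> Result<(), OutputError>;
    fn write_chunk(&mut self, rows: &[QueryRow]) -> Result<usize, OutputError>;
    fn finalize(&mut self) -> Result<(), OutputError>;
    fn rows_written(&self) -> u64;
    // ASSUMPTION: the real default body is elided in the docs.
    fn bytes_written(&self) -> Option<u64> {
        None
    }
}

/// Hypothetical writer that discards rows and only counts them.
pub struct CountingWriter {
    rows: u64,
}

impl CountingWriter {
    pub fn new() -> Self {
        Self { rows: 0 }
    }
}

impl StreamingWriter for CountingWriter {
    fn write_header(&mut self, _metadata: &QueryMetadata) -> Result<(), OutputError> {
        Ok(())
    }
    fn write_chunk(&mut self, rows: &[QueryRow]) -> Result<usize, OutputError> {
        self.rows += rows.len() as u64;
        Ok(rows.len())
    }
    fn finalize(&mut self) -> Result<(), OutputError> {
        Ok(())
    }
    fn rows_written(&self) -> u64 {
        self.rows
    }
}

/// Hypothetical driver: follows the header/chunk/finalize sequence for any
/// writer and reports progress after each chunk.
pub fn export_all(
    writer: &mut dyn StreamingWriter,
    metadata: &QueryMetadata,
    rows: &[QueryRow],
    chunk_size: usize,
    mut on_progress: impl FnMut(u64, Option<u64>),
) -> Result<(), OutputError> {
    writer.write_header(metadata)?;
    // `max(1)` guards against `chunks(0)`, which panics.
    for chunk in rows.chunks(chunk_size.max(1)) {
        writer.write_chunk(chunk)?;
        on_progress(writer.rows_written(), writer.bytes_written());
    }
    writer.finalize()
}
```

A caller could build a `Box<dyn StreamingWriter>` from a format flag and pass `&mut *boxed` to `export_all`. Progress is read inside the loop rather than after finalize(), which keeps the driver within rule 4 of the contract.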

In older versions of Rust, dyn compatibility was called "object safety".

Implementors§