Arrow writing module for converting Arrow record batches to Iceberg data files.
This module provides functionality to:
- Write Arrow record batches to Parquet files
- Handle partitioned data writing
- Support equality delete files
- Manage file sizes and buffering
The main entry points are:
- write_parquet_partitioned: Write regular data files
- write_equality_deletes_parquet_partitioned: Write equality delete files
The module handles:
- Automatic file size management and splitting
- Parquet compression and encoding
- Partition path generation
- Object store integration
- Metadata collection for written files
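To illustrate the idea behind automatic file splitting, here is a minimal sketch that is not taken from this module. It assumes a greedy policy: batches are assigned to the current output file until adding the next one would exceed a target size, at which point a new file is started. The function name `split_by_size` and the use of plain byte counts are hypothetical; the real writer may measure sizes and decide roll-over differently.

```rust
/// Hypothetical helper (not part of this module): greedily groups batches,
/// identified by index, into output files so that each file stays at or
/// below `target` bytes whenever possible. A single batch larger than
/// `target` still gets a file of its own.
fn split_by_size(batch_sizes: &[usize], target: usize) -> Vec<Vec<usize>> {
    let mut files: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut current_size = 0usize;

    for (idx, &size) in batch_sizes.iter().enumerate() {
        // Roll over to a new file if this batch would push us past the target.
        if !current.is_empty() && current_size + size > target {
            files.push(std::mem::take(&mut current));
            current_size = 0;
        }
        current.push(idx);
        current_size += size;
    }

    // Flush the last, partially filled file.
    if !current.is_empty() {
        files.push(current);
    }
    files
}
```

Under these assumptions, every batch lands in exactly one file and batch order is preserved, which keeps the per-file metadata (record counts, sizes) straightforward to collect.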
§Example
let data_files = write_parquet_partitioned(
    table,
    batches,
    None // no specific branch
).await.unwrap();

Functions§
- generate_file_path - Generates a unique file path for a Parquet data file.
- generate_partition_path - Generates a partition path string from partition fields and their values.
- write_equality_deletes_parquet_partitioned - Writes equality delete records as partitioned Parquet files.
- write_parquet_partitioned - Writes Arrow record batches as partitioned Parquet files.
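
As a rough illustration of what a partition path string can look like, the sketch below builds a Hive-style `name=value/` path from field names and already-stringified values. This layout is an assumption; the signature of `generate_partition_path` is not shown here, and the hypothetical `partition_path` helper below does not reproduce it. Any escaping of special characters in values is also omitted.

```rust
/// Hypothetical helper (not this module's API): joins (field name, value)
/// pairs into a Hive-style partition path such as "year=2024/month=01/".
/// Values are assumed to be pre-formatted and safe to use in a path.
fn partition_path(fields: &[(&str, &str)]) -> String {
    fields
        .iter()
        .map(|(name, value)| format!("{name}={value}/"))
        .collect()
}
```

An unpartitioned table would pass an empty slice and get an empty prefix, so the same file-path logic can serve both cases.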