Struct deltalake::datafusion::parquet::arrow::arrow_reader::ArrowReaderBuilder
pub struct ArrowReaderBuilder<T> { /* private fields */ }
A generic builder for constructing sync or async Arrow Parquet readers. It is not intended to be used directly; instead, use the specialization for the type of reader you need:
- For a synchronous API: ParquetRecordBatchReaderBuilder
- For an asynchronous API: ParquetRecordBatchStreamBuilder
Implementations
impl<T> ArrowReaderBuilder<T>
pub fn metadata(&self) -> &Arc<ParquetMetaData>
Returns a reference to the ParquetMetaData for this parquet file.
pub fn parquet_schema(&self) -> &SchemaDescriptor
Returns the parquet SchemaDescriptor for this parquet file.
pub fn with_batch_size(self, batch_size: usize) -> ArrowReaderBuilder<T>
Set the size of the RecordBatch to produce. Defaults to 1024.
If batch_size is larger than the file row count, the file row count is used instead.
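The clamping rule above can be sketched with plain arithmetic. This is a minimal standalone model, not code from the parquet crate: `effective_batch_size` and `batch_lengths` are hypothetical helpers, and the model assumes every batch is filled to the effective size with a shorter final batch. A real reader may emit short batches in other situations as well.

```rust
/// Hypothetical helper (not part of the parquet crate): the batch size that
/// is effectively used, given the requested size and the file row count.
fn effective_batch_size(requested: usize, file_rows: usize) -> usize {
    requested.min(file_rows)
}

/// Idealized row counts of the batches produced for a file with `file_rows`
/// rows, assuming full batches followed by one shorter final batch.
fn batch_lengths(requested: usize, file_rows: usize) -> Vec<usize> {
    let size = effective_batch_size(requested, file_rows);
    let mut lengths = Vec::new();
    if size == 0 {
        // Empty file or zero batch size: nothing to produce in this model.
        return lengths;
    }
    let mut remaining = file_rows;
    while remaining > 0 {
        let n = remaining.min(size);
        lengths.push(n);
        remaining -= n;
    }
    lengths
}

fn main() {
    // Default batch size of 1024 on a 2500-row file.
    println!("{:?}", batch_lengths(1024, 2500)); // [1024, 1024, 452]
    // The batch size exceeds the row count, so it is clamped to 300.
    println!("{:?}", batch_lengths(1024, 300)); // [300]
}
```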
pub fn with_row_groups(self, row_groups: Vec<usize>) -> ArrowReaderBuilder<T>
Only read data from the provided row group indexes
pub fn with_projection(self, mask: ProjectionMask) -> ArrowReaderBuilder<T>
Only read data from the provided column indexes
pub fn with_row_selection(self, selection: RowSelection) -> ArrowReaderBuilder<T>
Provide a RowSelection to filter out rows and avoid fetching their data into memory.
Row group filtering is applied prior to this, so rows from skipped row groups should not be included in the RowSelection.
An example use case would be applying a selection determined by evaluating predicates against the Index.
It is recommended to enable reading the page index when using this functionality, to allow more efficient skipping over data pages. See ArrowReaderOptions::with_page_index.
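To make the idea of a row selection concrete, the sketch below models a selection as a list of alternating skip and select runs and expands it into the row indexes that would be read. `Run` and `selected_rows` are stand-ins invented for this illustration, not the crate's own types, and the indexes are assumed to count only rows from row groups that survived row group filtering.

```rust
/// Stand-in for one run of a row selection: `row_count` consecutive rows
/// that are either skipped or selected. Modeled for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Run {
    row_count: usize,
    skip: bool,
}

impl Run {
    /// A run of rows to read.
    fn select(row_count: usize) -> Self {
        Run { row_count, skip: false }
    }

    /// A run of rows to skip without fetching their data.
    fn skip(row_count: usize) -> Self {
        Run { row_count, skip: true }
    }
}

/// Expand a run-length selection into the row indexes it keeps.
fn selected_rows(runs: &[Run]) -> Vec<usize> {
    let mut rows = Vec::new();
    let mut next = 0;
    for run in runs {
        if !run.skip {
            rows.extend(next..next + run.row_count);
        }
        // Skipped and selected runs both advance the row cursor.
        next += run.row_count;
    }
    rows
}

fn main() {
    let selection = [Run::skip(2), Run::select(3), Run::skip(4), Run::select(1)];
    println!("{:?}", selected_rows(&selection)); // [2, 3, 4, 9]
}
```

A run-length form keeps the selection compact when long stretches of rows are skipped, which is the case where skipping whole data pages pays off.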
pub fn with_row_filter(self, filter: RowFilter) -> ArrowReaderBuilder<T>
Provide a RowFilter to skip decoding rows.
Row filters are applied after row group selection and row selection.
It is recommended to enable reading the page index when using this functionality, to allow more efficient skipping over data pages. See ArrowReaderOptions::with_page_index.
pub fn with_limit(self, limit: usize) -> ArrowReaderBuilder<T>
Provide a limit on the number of rows to be read.
The limit is applied after any Self::with_row_selection and Self::with_row_filter, allowing it to limit the final set of rows decoded after any pushed-down predicates.
It is recommended to enable reading the page index when using this functionality, to allow more efficient skipping over data pages. See ArrowReaderOptions::with_page_index.
pub fn with_offset(self, offset: usize) -> ArrowReaderBuilder<T>
Provide an offset to skip over the given number of rows.
The offset is applied after any Self::with_row_selection and Self::with_row_filter, allowing it to skip rows after any pushed-down predicates.
It is recommended to enable reading the page index when using this functionality, to allow more efficient skipping over data pages. See ArrowReaderOptions::with_page_index.
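The ordering described for with_limit and with_offset can be illustrated with a small standalone model. The function `read_rows` below is hypothetical and works on an in-memory slice rather than a parquet file. It applies the row selection first, then the row filter, then the offset, then the limit. That the offset is applied before the limit is an assumption of this sketch; the text above only states that both come after the selection and the filter.

```rust
/// Illustrative model only: apply the steps in the documented order to a
/// slice of row values. `selected` marks rows kept by the row selection and
/// `predicate` stands in for a row filter.
fn read_rows(
    rows: &[i64],
    selected: &[bool],
    predicate: impl Fn(i64) -> bool,
    offset: usize,
    limit: usize,
) -> Vec<i64> {
    rows.iter()
        .zip(selected)
        // 1. Row selection: drop rows that were not selected.
        .filter(|(_, keep)| **keep)
        .map(|(value, _)| *value)
        // 2. Row filter: drop rows that fail the predicate.
        .filter(|value| predicate(*value))
        // 3. Offset (assumed to come before the limit in this sketch).
        .skip(offset)
        // 4. Limit on the final set of rows.
        .take(limit)
        .collect()
}

fn main() {
    let rows: Vec<i64> = (1..=10).collect();
    // The selection skips the first two rows and keeps the rest.
    let mut selected = vec![true; rows.len()];
    selected[0] = false;
    selected[1] = false;

    // Keep even values, skip one matching row, then read at most two.
    let out = read_rows(&rows, &selected, |v| v % 2 == 0, 1, 2);
    println!("{:?}", out); // [6, 8]
}
```

Because the offset and limit count rows that survived the selection and the filter, the result here starts at 6 rather than at the third row of the input.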
Auto Trait Implementations
impl<T> Freeze for ArrowReaderBuilder<T>
where
    T: Freeze,
impl<T> !RefUnwindSafe for ArrowReaderBuilder<T>
impl<T> Send for ArrowReaderBuilder<T>
where
    T: Send,
impl<T> !Sync for ArrowReaderBuilder<T>
impl<T> Unpin for ArrowReaderBuilder<T>
where
    T: Unpin,
impl<T> !UnwindSafe for ArrowReaderBuilder<T>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.