Struct polars::prelude::ParquetReader
pub struct ParquetReader<R>
where
    R: Read + Seek,
{ /* private fields */ }
Available on crate feature polars-io only.
Read the Apache Parquet format into a DataFrame.
Implementations
impl<R> ParquetReader<R>
where
    R: MmapBytesReader,
pub fn set_low_memory(self, low_memory: bool) -> ParquetReader<R>
Try to reduce memory pressure at the expense of performance. If setting this does not reduce memory enough, turn off parallelization.
pub fn read_parallel(self, parallel: ParallelStrategy) -> ParquetReader<R>
Read the parquet file in parallel (default). The single threaded reader consumes less memory.
pub fn with_n_rows(self, num_rows: Option<usize>) -> ParquetReader<R>
Stop parsing when n rows are parsed. Setting this parameter causes the parquet file to be read sequentially.
pub fn with_columns(self, columns: Option<Vec<String>>) -> ParquetReader<R>
Columns to select/project.
pub fn with_projection(self, projection: Option<Vec<usize>>) -> ParquetReader<R>
Set the reader’s column projection. This counts from 0, meaning that vec![0, 4] would select the 1st and 5th column.
pub fn with_row_count(self, row_count: Option<RowCount>) -> ParquetReader<R>
Add a row_count column.
pub fn with_schema(self, schema: Option<Arc<ArrowSchema>>) -> ParquetReader<R>
Set the Schema if already known. This must be exactly the same as the schema in the file itself.
pub fn schema(&mut self) -> Result<Arc<ArrowSchema>, PolarsError>
Schema of the file.
pub fn use_statistics(self, toggle: bool) -> ParquetReader<R>
Use statistics in the parquet file to determine whether pages can be skipped during reading.
pub fn num_rows(&mut self) -> Result<usize, PolarsError>
Number of rows in the parquet file.
pub fn with_hive_partition_columns(self, columns: Option<Vec<Series>>) -> ParquetReader<R>
pub fn get_metadata(&mut self) -> Result<&Arc<FileMetaData>, PolarsError>
pub fn with_predicate(self, predicate: Option<Arc<dyn PhysicalIoExpr>>) -> ParquetReader<R>
impl<R> ParquetReader<R>
where
    R: MmapBytesReader + 'static,
pub fn batched(self, chunk_size: usize) -> Result<BatchedParquetReader, PolarsError>
Trait Implementations
impl<R> SerReader<R> for ParquetReader<R>
where
    R: MmapBytesReader,
fn new(reader: R) -> ParquetReader<R>
Create a new ParquetReader from an existing Reader.