Struct lance_core::io::reader::FileReader
pub struct FileReader { /* private fields */ }
Lance File Reader.
It reads arrow data from one data file.
Implementations
impl FileReader
pub async fn try_new_with_fragment( object_store: &ObjectStore, path: &Path, fragment_id: u64, manifest: Option<&Manifest>, session: Option<&FileMetadataCache> ) -> Result<Self>
Open file reader
Open the file at the given path using the provided object store.
The fragment ID passed in determines the first 32 bits of the row IDs.
If a manifest is passed in, it will be used to load the schema and dictionary. This is typically done if the file is part of a dataset fragment. If no manifest is passed in, the schema and dictionary are read from the file itself.
The session passed in is used to cache metadata about the file. If no session is passed in, there will be no caching.
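The row ID layout described above can be illustrated with a minimal sketch. It assumes that "first 32 bits" means the upper 32 bits of a 64-bit row ID, with the row's offset inside the file in the lower 32 bits. The helpers `compose_row_id` and `split_row_id` are hypothetical names for illustration and are not part of the `FileReader` API.

```rust
/// Hypothetical helper: pack a fragment ID and a row offset into one u64 row ID.
/// Assumption: the fragment ID occupies the upper 32 bits and the row offset
/// within the file occupies the lower 32 bits.
fn compose_row_id(fragment_id: u32, row_offset: u32) -> u64 {
    ((fragment_id as u64) << 32) | (row_offset as u64)
}

/// Hypothetical inverse: recover (fragment_id, row_offset) from a row ID.
fn split_row_id(row_id: u64) -> (u32, u32) {
    ((row_id >> 32) as u32, (row_id & 0xFFFF_FFFF) as u32)
}
```

Under this assumption, rows read from fragment 0 have row IDs equal to their offsets, and two files opened with different fragment IDs never produce the same row ID.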
pub async fn try_new(object_store: &ObjectStore, path: &Path) -> Result<Self>
Open one Lance data file for reading.
pub fn with_row_id(&mut self, v: bool) -> &mut Self
Instruct the FileReader to return the meta row ID column (_rowid).
pub fn with_make_deletions_null(&mut self, val: bool) -> &mut Self
Instruct the FileReader that instead of removing deleted rows, it may simply mark the _rowid value as null. Some rows may still be removed, for example if the entire batch is deleted. This is a performance optimization where the null bitmap of the _rowid column serves as a selection vector.
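The selection-vector idea can be sketched with plain Rust types. This is only an illustration of how a consumer might interpret a nullable `_rowid` column: `Batch` and `apply_selection` are hypothetical stand-ins, not Lance or Arrow types, and `Option<u64>` plays the role of a nullable column value.

```rust
/// Hypothetical stand-in for a record batch: one data column plus a
/// nullable `_rowid` column, where `None` marks a deleted row.
struct Batch {
    values: Vec<i32>,
    row_ids: Vec<Option<u64>>,
}

/// Keep only the rows whose `_rowid` is non-null, i.e. treat the null
/// bitmap of `_rowid` as a selection vector over the batch.
fn apply_selection(batch: &Batch) -> Vec<(u64, i32)> {
    batch
        .row_ids
        .iter()
        .copied()
        .zip(batch.values.iter().copied())
        .filter_map(|(id, value)| id.map(|id| (id, value)))
        .collect()
}
```

The reader can then skip the cost of physically removing deleted rows from every column, and the consumer filters once when it actually needs the surviving rows.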
pub fn num_batches(&self) -> usize
pub fn num_rows_in_batch(&self, batch_id: i32) -> usize
Get the number of rows in the given batch.
pub fn is_empty(&self) -> bool
pub async fn read_batch( &self, batch_id: i32, params: impl Into<ReadBatchParams>, projection: &Schema ) -> Result<RecordBatch>
Read a batch of data from the file.
The schema of the returned RecordBatch is set by FileReader::schema().
pub async fn read_range( &self, range: Range<usize>, projection: &Schema ) -> Result<RecordBatch>
Read a range of records into one batch.
Note that it may call concat if the range crosses multiple batches, which makes it less efficient than FileReader::read_batch().
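To see why a range read may need a concat, here is a minimal sketch of mapping a file-level row range onto per-batch pieces. The function `split_range` and the `batch_sizes` slice are hypothetical; the actual reader may organize this differently.

```rust
use std::ops::Range;

/// Hypothetical helper: map a file-level row range onto
/// (batch_id, range local to that batch) pieces.
/// `batch_sizes[i]` is the number of rows in batch `i`.
fn split_range(batch_sizes: &[usize], range: Range<usize>) -> Vec<(usize, Range<usize>)> {
    let mut pieces = Vec::new();
    let mut batch_start = 0;
    for (batch_id, &len) in batch_sizes.iter().enumerate() {
        let batch_end = batch_start + len;
        // Intersect the requested range with the rows covered by this batch.
        let start = range.start.max(batch_start);
        let end = range.end.min(batch_end);
        if start < end {
            pieces.push((batch_id, (start - batch_start)..(end - batch_start)));
        }
        batch_start = batch_end;
    }
    pieces
}
```

When the result has a single piece, the range lies inside one batch and no concatenation is needed. With more than one piece, each piece has to be read separately and the results joined into one RecordBatch, which is the extra cost the note above refers to.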
pub async fn take( &self, indices: &[u32], projection: &Schema ) -> Result<RecordBatch>
Take records by their indices within the file.
The indices must be sorted.
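One plausible reason for the sorted-input requirement is that sorted indices can be assigned to batches in a single forward pass. The sketch below illustrates that idea only. `group_by_batch` and `batch_sizes` are hypothetical names, and this is not necessarily how `take` is implemented.

```rust
/// Hypothetical helper: group sorted file-level indices into
/// (batch_id, indices local to that batch) pairs.
/// `batch_sizes[i]` is the number of rows in batch `i`.
fn group_by_batch(batch_sizes: &[usize], indices: &[u32]) -> Vec<(usize, Vec<u32>)> {
    let mut groups: Vec<(usize, Vec<u32>)> = Vec::new();
    let mut batch_id = 0;
    let mut batch_start = 0usize;
    for &idx in indices {
        let idx = idx as usize;
        // Only ever move forward: this is valid because the indices are sorted.
        while batch_id < batch_sizes.len() && idx >= batch_start + batch_sizes[batch_id] {
            batch_start += batch_sizes[batch_id];
            batch_id += 1;
        }
        if batch_id == batch_sizes.len() {
            // Index past the end of the file; this sketch simply stops here.
            break;
        }
        let local = (idx - batch_start) as u32;
        if let Some((id, locals)) = groups.last_mut() {
            if *id == batch_id {
                locals.push(local);
                continue;
            }
        }
        groups.push((batch_id, vec![local]));
    }
    groups
}
```

With unsorted input, the forward-only scan would assign indices to the wrong batch or underflow when computing the local offset, so a caller holding unsorted indices would need to sort them first.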
pub async fn read_page_stats( &self, projection: &Schema ) -> Result<Option<RecordBatch>>
Trait Implementations
impl Clone for FileReader
fn clone(&self) -> FileReader
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.