pub struct AsyncPrefetchDataset { /* private fields */ }Expand description
A streaming dataset with async prefetch for parallel I/O.
Unlike StreamingDataset which reads
synchronously, AsyncPrefetchDataset spawns a background task that reads
batches into a channel, allowing the main thread to process while I/O
happens.
§Example
ⓘ
use alimentar::async_prefetch::AsyncPrefetchDataset;
#[tokio::main]
async fn main() {
let dataset = AsyncPrefetchDataset::from_parquet("data.parquet", 1024, 4)
.await
.unwrap();
while let Some(batch) = dataset.next().await {
println!("Processing batch with {} rows", batch.num_rows());
}
}Implementations§
Source§impl AsyncPrefetchDataset
impl AsyncPrefetchDataset
Sourcepub fn new(source: Box<dyn DataSource>, prefetch_size: usize) -> Self
pub fn new(source: Box<dyn DataSource>, prefetch_size: usize) -> Self
Creates a new async prefetch dataset from a data source.
§Arguments
source- The data source to read fromprefetch_size- Number of batches to buffer ahead
Sourcepub fn from_parquet(
path: impl AsRef<Path>,
batch_size: usize,
prefetch_size: usize,
) -> Result<Self>
pub fn from_parquet( path: impl AsRef<Path>, batch_size: usize, prefetch_size: usize, ) -> Result<Self>
Sourcepub async fn next(&mut self) -> Option<Result<RecordBatch>>
pub async fn next(&mut self) -> Option<Result<RecordBatch>>
Receives the next batch asynchronously.
Returns None when the source is exhausted.
Sourcepub fn try_next(&mut self) -> Option<Result<RecordBatch>>
pub fn try_next(&mut self) -> Option<Result<RecordBatch>>
Tries to receive a batch without waiting.
Returns None if no batch is available or the source is exhausted.
Sourcepub fn buffered_count(&self) -> usize
pub fn buffered_count(&self) -> usize
Returns the number of batches currently buffered.
Trait Implementations§
Auto Trait Implementations§
impl Freeze for AsyncPrefetchDataset
impl RefUnwindSafe for AsyncPrefetchDataset
impl Send for AsyncPrefetchDataset
impl Sync for AsyncPrefetchDataset
impl Unpin for AsyncPrefetchDataset
impl UnsafeUnpin for AsyncPrefetchDataset
impl UnwindSafe for AsyncPrefetchDataset
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value. Read more
Source§impl<T> Instrument for T
impl<T> Instrument for T
Source§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
Source§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreCreates a shared type from an unshared type.