pub struct ChunkAssembler { /* private fields */ }
Bridges arbitrary-length row batches onto the fixed chunk partition.
A streaming row source (gam_sae::corpus) yields batches whose
lengths are set by I/O policy (batch size, shard boundaries), so they do
not align with the deterministic chunk partition the accumulation tree
is keyed on. This assembler buffers incoming rows and submits exact chunks
in order, so the resulting Gram is bit-identical to one computed by slicing
the partition directly: the producer's batching can never leak into the
bits.
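The buffering idea can be sketched with a toy stand-in. Everything below is hypothetical (`ToyAssembler`, `chunks_for_batching`, rows reduced to a single `f64`); it is not the crate's implementation, and it ignores the short final chunk a real partition may have. It only illustrates why the emitted chunks depend on the row order and the chunk size, not on how the producer batched the rows.

```rust
/// Hypothetical stand-in for the buffering idea; not the crate's code.
/// A "row" is a single f64 here to keep the sketch small.
struct ToyAssembler {
    chunk_size: usize,
    /// Rows received since the last chunk boundary.
    buffer: Vec<f64>,
    /// Complete chunks emitted so far, in stream order.
    chunks: Vec<Vec<f64>>,
}

impl ToyAssembler {
    fn new(chunk_size: usize) -> Result<Self, String> {
        if chunk_size == 0 {
            return Err("chunk_size must be positive".to_string());
        }
        Ok(Self { chunk_size, buffer: Vec::new(), chunks: Vec::new() })
    }

    /// Append a batch of any length (including empty) and emit every
    /// chunk the buffer completes.
    fn push_rows(&mut self, rows: &[f64]) {
        for &row in rows {
            self.buffer.push(row);
            if self.buffer.len() == self.chunk_size {
                // std::mem::take moves the full chunk out and leaves an
                // empty buffer behind.
                self.chunks.push(std::mem::take(&mut self.buffer));
            }
        }
    }
}

/// Feed the same rows split into the given batch lengths and return the
/// chunks that were emitted.
fn chunks_for_batching(
    rows: &[f64],
    batch_lens: &[usize],
    chunk_size: usize,
) -> Vec<Vec<f64>> {
    let mut asm = ToyAssembler::new(chunk_size).expect("positive chunk size");
    let mut start = 0;
    for &len in batch_lens {
        let end = (start + len).min(rows.len());
        asm.push_rows(&rows[start..end]);
        start = end;
    }
    asm.chunks
}
```

Calling `chunks_for_batching` on the same eight rows with batch lengths `[3, 5]` and `[1, 0, 6, 1]` and a chunk size of 4 yields the same two chunks in both cases, which is the property the description above relies on.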
Checkpointing is exposed at chunk granularity only:
ChunkAssembler::checkpoint returns Some exactly when the internal
buffer is empty (a chunk boundary), because buffered raw rows are not part
of the accumulation state contract — a resumed pass re-reads its row
stream from the checkpointed chunk cursor
(StreamingBorderGram::chunk_rows of the frontier names the next row).
Implementations
impl ChunkAssembler
pub fn new(border_dim: usize, n_rows: usize, chunk_size: usize) -> Result<Self, String>
New assembler over the same partition parameters as
StreamingBorderGram::new.
pub fn push_rows(&mut self, rows: ArrayView2<'_, f64>) -> Result<(), String>
Append a batch of rows (any length, including empty) in stream order, submitting every chunk the buffer completes.
pub fn checkpoint(&self) -> Option<BorderGramCheckpoint>
Serialize the accumulation state, which is possible only at a chunk
boundary. Returns None while rows are buffered mid-chunk; in that case,
checkpoint after the next boundary, or size batches to the chunk size so
that every batch ends on a boundary (checkpoint-every-batch).
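A minimal sketch of the boundary rule, using hypothetical stand-in types (`ToyAssembler`, `ToyCheckpoint`) that track only row counts. The field names `frontier` and `chunk_size` are borrowed from the wording of this page; the real `BorderGramCheckpoint` layout is not shown here and may differ.

```rust
/// Hypothetical checkpoint: the index of the next chunk to submit plus
/// the chunk size it was taken with.
#[derive(Debug, Clone, PartialEq)]
struct ToyCheckpoint {
    frontier: usize,
    chunk_size: usize,
}

/// Hypothetical assembler that tracks counts only, not row data.
struct ToyAssembler {
    chunk_size: usize,
    /// Rows received since the last chunk boundary.
    buffered_rows: usize,
    /// Number of complete chunks submitted so far.
    frontier: usize,
}

impl ToyAssembler {
    fn new(chunk_size: usize) -> Result<Self, String> {
        if chunk_size == 0 {
            return Err("chunk_size must be positive".to_string());
        }
        Ok(Self { chunk_size, buffered_rows: 0, frontier: 0 })
    }

    /// Record a batch of `n` rows, advancing the frontier by every
    /// chunk the batch completes.
    fn push_rows(&mut self, n: usize) {
        let total = self.buffered_rows + n;
        self.frontier += total / self.chunk_size;
        self.buffered_rows = total % self.chunk_size;
    }

    /// Some only when no rows are buffered, i.e. at a chunk boundary.
    fn checkpoint(&self) -> Option<ToyCheckpoint> {
        if self.buffered_rows == 0 {
            Some(ToyCheckpoint {
                frontier: self.frontier,
                chunk_size: self.chunk_size,
            })
        } else {
            None
        }
    }
}
```

With a chunk size of 4, pushing 3 rows leaves the toy mid-chunk and `checkpoint` returns `None`; pushing 5 more rows lands on the boundary after the second chunk and `checkpoint` returns `Some` with `frontier == 2`.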
pub fn resume(state: BorderGramCheckpoint) -> Result<Self, String>
Resume an assembler at the chunk boundary a checkpoint names. The
caller re-positions its row stream at row
checkpoint.frontier * checkpoint.chunk_size (the partition is pure,
so that index is exact) and replays from there.
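The re-positioning arithmetic can be sketched as follows. `ToyCheckpoint`, `resume_row`, and `rows_to_replay` are hypothetical names for illustration; the sketch assumes the checkpoint exposes `frontier` and `chunk_size` as plain integers, as the expression above suggests, and treats a row as a single `f64`.

```rust
/// Hypothetical mirror of the two checkpoint fields named in the text.
struct ToyCheckpoint {
    frontier: usize,
    chunk_size: usize,
}

/// Row index at which the stream must be re-positioned.
/// Returns None if the product would overflow usize.
fn resume_row(cp: &ToyCheckpoint) -> Option<usize> {
    cp.frontier.checked_mul(cp.chunk_size)
}

/// Skip the rows already accumulated and return the rows to replay.
/// An out-of-range or overflowing start yields an empty slice.
fn rows_to_replay<'a>(all_rows: &'a [f64], cp: &ToyCheckpoint) -> &'a [f64] {
    let start = resume_row(cp).unwrap_or(usize::MAX).min(all_rows.len());
    &all_rows[start..]
}
```

For a checkpoint with `frontier = 3` and `chunk_size = 4`, `resume_row` gives row 12, so a 14-row stream replays only its last two rows. Using `checked_mul` is a defensive choice of this sketch, not something the page states about the real API.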
Auto Trait Implementations
impl Freeze for ChunkAssembler
impl RefUnwindSafe for ChunkAssembler
impl Send for ChunkAssembler
impl Sync for ChunkAssembler
impl Unpin for ChunkAssembler
impl UnsafeUnpin for ChunkAssembler
impl UnwindSafe for ChunkAssembler
Blanket Implementations
impl<T> Allocation for T
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> DistributionExt for T where T: ?Sized
impl<T, U> Imply<T> for U
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
impl<T> Pointable for T
impl<T> Read<Exclusive, BecauseExclusive> for T where T: ?Sized
impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its
superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.