Struct ChunkAssembler 

pub struct ChunkAssembler { /* private fields */ }

Bridges arbitrary-length row batches onto the fixed chunk partition.

A streaming row source (gam_sae::corpus) yields batches whose lengths are set by I/O policy (batch size, shard boundaries), so they do not align with the deterministic chunk partition that the accumulation tree is keyed on. This assembler buffers incoming rows and submits exact chunks in order, so the resulting Gram is bit-identical to one computed by slicing the partition directly: the producer's batching can never affect the result bits.
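The buffering idea can be illustrated with a small std-only sketch. Everything in it is hypothetical: `GramAssembler`, `gram_bits`, the flat row-major `Vec<f64>` layout, and the plain linear fold of per-chunk partial Grams are stand-ins chosen for brevity, not the crate's types or its accumulation tree. It only shows why keying the arithmetic on chunks, rather than on batches, makes the result independent of how the producer batches its rows.

```rust
/// Toy stand-in for the assembler (hypothetical; not the crate's code).
struct GramAssembler {
    dim: usize,
    chunk_size: usize,
    buffer: Vec<f64>, // row-major rows waiting for the current chunk to fill
    gram: Vec<f64>,   // running dim x dim Gram, row-major
}

impl GramAssembler {
    fn new(dim: usize, chunk_size: usize) -> Result<Self, String> {
        if dim == 0 || chunk_size == 0 {
            return Err("dim and chunk_size must be positive".to_string());
        }
        Ok(Self { dim, chunk_size, buffer: Vec::new(), gram: vec![0.0; dim * dim] })
    }

    /// Accept a batch of whole rows (any count, including zero) and
    /// submit every chunk that the buffer completes.
    fn push_rows(&mut self, rows: &[f64]) -> Result<(), String> {
        if rows.len() % self.dim != 0 {
            return Err("batch is not a whole number of rows".to_string());
        }
        for row in rows.chunks(self.dim) {
            self.buffer.extend_from_slice(row);
            if self.buffer.len() == self.chunk_size * self.dim {
                self.submit_chunk();
            }
        }
        Ok(())
    }

    /// Fold one full chunk: compute its partial Gram, then add it to the total.
    /// The floating-point grouping depends only on the chunk, never on the batch.
    fn submit_chunk(&mut self) {
        let d = self.dim;
        let mut partial = vec![0.0; d * d];
        for row in self.buffer.chunks(d) {
            for i in 0..d {
                for j in 0..d {
                    partial[i * d + j] += row[i] * row[j];
                }
            }
        }
        for (g, p) in self.gram.iter_mut().zip(&partial) {
            *g += *p;
        }
        self.buffer.clear();
    }
}

/// Feed `data` in batches of `batch_rows` rows (must be nonzero) and return
/// the bit patterns of the accumulated Gram. Rows left mid-chunk are not folded.
fn gram_bits(data: &[f64], dim: usize, chunk_size: usize, batch_rows: usize) -> Vec<u64> {
    let mut asm = GramAssembler::new(dim, chunk_size).expect("valid parameters");
    for batch in data.chunks(batch_rows * dim) {
        asm.push_rows(batch).expect("whole rows");
    }
    asm.gram.iter().map(|x| x.to_bits()).collect()
}
```

Calling `gram_bits` on the same data with different `batch_rows` values should give identical bit patterns, because each chunk sees the same rows in the same order whatever the batch boundaries are.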

Checkpointing is exposed at chunk granularity only. ChunkAssembler::checkpoint returns Some exactly when the internal buffer is empty (that is, at a chunk boundary), because buffered raw rows are not part of the accumulation state contract. A resumed pass re-reads its row stream from the checkpointed chunk cursor; StreamingBorderGram::chunk_rows, evaluated at the frontier, names the next row.

Implementations

impl ChunkAssembler

pub fn new(border_dim: usize, n_rows: usize, chunk_size: usize) -> Result<Self, String>

New assembler over the same partition parameters as StreamingBorderGram::new.

pub fn push_rows(&mut self, rows: ArrayView2<'_, f64>) -> Result<(), String>

Append a batch of rows (any length, including empty) in stream order, submitting every chunk the buffer completes.

pub fn checkpoint(&self) -> Option<BorderGramCheckpoint>

Serialize the accumulation state, but only at a chunk boundary. Returns None while rows are buffered mid-chunk; in that case, checkpoint after the next boundary, or size batches to the chunk size so that every batch ends on a boundary and can be checkpointed.
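The boundary rule can be sketched with a count-only model. `CursorAssembler` and `CursorCheckpoint` are hypothetical names; the model tracks how many rows are buffered rather than the rows themselves, and it assumes every chunk has exactly `chunk_size` rows.

```rust
/// Hypothetical checkpoint payload: the chunk cursor only, never raw rows.
#[derive(Debug, Clone, PartialEq)]
struct CursorCheckpoint {
    frontier: usize,   // number of chunks already submitted
    chunk_size: usize, // rows per chunk
}

/// Count-only toy assembler (an illustration, not the crate's type).
struct CursorAssembler {
    chunk_size: usize,
    buffered: usize, // rows waiting for the current chunk to fill
    frontier: usize, // chunks submitted so far
}

impl CursorAssembler {
    fn new(chunk_size: usize) -> Result<Self, String> {
        if chunk_size == 0 {
            return Err("chunk_size must be positive".to_string());
        }
        Ok(Self { chunk_size, buffered: 0, frontier: 0 })
    }

    /// Record `n` incoming rows and advance the frontier past each full chunk.
    fn push_rows(&mut self, n: usize) {
        let total = self.buffered + n;
        self.frontier += total / self.chunk_size;
        self.buffered = total % self.chunk_size;
    }

    /// Some only when nothing is buffered, that is, at a chunk boundary.
    fn checkpoint(&self) -> Option<CursorCheckpoint> {
        if self.buffered == 0 {
            Some(CursorCheckpoint { frontier: self.frontier, chunk_size: self.chunk_size })
        } else {
            None
        }
    }
}
```

With a chunk size of 4, pushing 3 rows leaves the toy mid-chunk, so `checkpoint` returns `None`; pushing 1 more row completes the chunk and `checkpoint` returns `Some` with the frontier advanced by one.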

pub fn resume(state: BorderGramCheckpoint) -> Result<Self, String>

Resume an assembler at the chunk boundary a checkpoint names. The caller re-positions its row stream at row checkpoint.frontier * checkpoint.chunk_size (the partition is pure, so that index is exact) and replays from there.
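The repositioning arithmetic is small enough to show directly. In this sketch `ResumePoint` is a hypothetical mirror of the two checkpoint fields named above (whether the real checkpoint exposes them as public fields is an assumption), and the row stream is modelled as an in-memory slice with one value per row.

```rust
/// Hypothetical mirror of the two fields the text refers to.
struct ResumePoint {
    frontier: usize,   // chunks already folded before the checkpoint
    chunk_size: usize, // rows per chunk
}

/// Index of the first row to replay, or None if the product overflows.
fn resume_row(cp: &ResumePoint) -> Option<usize> {
    cp.frontier.checked_mul(cp.chunk_size)
}

/// Rows still to feed after resuming. Returns None if the checkpoint
/// points past the end of the stream or the index overflows.
fn remaining_rows<'a>(rows: &'a [f64], cp: &ResumePoint) -> Option<&'a [f64]> {
    let start = resume_row(cp)?;
    rows.get(start..)
}
```

For a checkpoint with `frontier = 3` and `chunk_size = 4`, replay starts at row 12. Because the toy partition is a pure function of the chunk index, no rows before that index need to be re-read.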

pub fn finish(self) -> Result<Array2<f64>, String>

Finish the pass. Errors if the stream ended mid-chunk or short of the declared row count: a truncated stream is rejected loudly, never silently folded as a shorter corpus.
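The two failure modes can be sketched with a count-only model. `ToyPass` is hypothetical and makes two assumptions the text above does not state: that the final chunk may be shorter than `chunk_size` when `n_rows` is not a multiple of it, and that rows beyond the declared count are rejected at push time.

```rust
/// Count-only toy model of a pass (hypothetical; it tracks row counts, not data).
struct ToyPass {
    n_rows: usize,     // declared length of the corpus
    chunk_size: usize, // rows per full chunk
    submitted: usize,  // rows already folded as complete chunks
    buffered: usize,   // rows waiting for the current chunk to fill
}

impl ToyPass {
    fn new(n_rows: usize, chunk_size: usize) -> Result<Self, String> {
        if chunk_size == 0 {
            return Err("chunk_size must be positive".to_string());
        }
        Ok(Self { n_rows, chunk_size, submitted: 0, buffered: 0 })
    }

    fn push_rows(&mut self, n: usize) -> Result<(), String> {
        let seen = self.submitted + self.buffered;
        // Assumption: rows beyond the declared count are an error.
        if n > self.n_rows - seen {
            return Err("more rows than declared".to_string());
        }
        self.buffered += n;
        loop {
            // Assumption: the last chunk may be shorter than chunk_size.
            let want = self.chunk_size.min(self.n_rows - self.submitted);
            if want == 0 || self.buffered < want {
                break;
            }
            self.buffered -= want;
            self.submitted += want;
        }
        Ok(())
    }

    /// Consume the pass; succeed only if every declared row was folded.
    fn finish(self) -> Result<usize, String> {
        if self.buffered > 0 {
            return Err(format!("stream ended mid-chunk with {} rows buffered", self.buffered));
        }
        if self.submitted < self.n_rows {
            return Err(format!("stream ended at row {} of {}", self.submitted, self.n_rows));
        }
        Ok(self.submitted)
    }
}
```

With `n_rows = 10` and `chunk_size = 4`, a stream that stops after 7 rows fails as mid-chunk, and one that stops after 8 rows fails as short of the declared count. Taking `self` by value in `finish` mirrors the signature above: a pass cannot be pushed to after it has been finished.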

Auto Trait Implementations

Blanket Implementations

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> ByRef<T> for T

fn by_ref(&self) -> &T

impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
where ST: ?Sized, DT: ?Sized,

impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
where ST: ?Sized, DT: ?Sized,

impl<T> DistributionExt for T
where T: ?Sized,

fn rand<T>(&self, rng: &mut (impl Rng + ?Sized)) -> T
where Self: Distribution<T>,

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Imply<T> for U
where T: ?Sized, U: ?Sized,

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Read<Exclusive, BecauseExclusive> for T
where T: ?Sized,

impl<T> Same for T

type Output = T

Should always be Self

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V