pub struct DeviceResidentArrowWorkspace { /* private fields */ }
Upload-once workspace for the SAE data-fit Arrow-Schur inner iteration.
Implementations§
impl DeviceResidentArrowWorkspace
pub fn new( shape: DeviceResidentArrowShape, target_x: Vec<f64>, basis_values: Vec<f64>, gate_activations: Vec<f64>, slabs: DeviceResidentArrowSlabs, ) -> Result<Self, DeviceResidentArrowError>
pub const fn shape(&self) -> DeviceResidentArrowShape
pub fn device_resident(&self) -> bool
pub fn resident_device_bytes(&self) -> usize
pub fn host_shadow_bytes(&self) -> usize
pub fn one_inner_iteration( &self, ridge_t: f64, ridge_beta: f64, ) -> Result<DeviceResidentArrowStep, DeviceResidentArrowError>
Run one device-side Newton sequence. No CPU fallback is attempted here:
callers that want a reference path must call Self::cpu_reference_step.
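For orientation, the kind of linear solve behind one arrow (bordered block-diagonal) Newton step can be sketched as below. This is not the crate's kernel: it assumes scalar row blocks and a single border unknown, and the way `ridge_t` and `ridge_beta` enter (added to the row pivots and to the border block respectively) is an assumption made for illustration. The function name `arrow_schur_step` and its parameters are hypothetical.

```rust
/// Hypothetical sketch: solve the arrow system
///
///   [ diag(d) + ridge_t * I        b          ] [ t    ]   [ g_t    ]
///   [ b^T                    c + ridge_beta   ] [ beta ] = [ g_beta ]
///
/// by eliminating the row unknowns first (Schur complement on the border).
/// Scalar row blocks and a 1x1 border are assumptions of this sketch.
/// Returns None on mismatched lengths or a non-positive pivot.
pub fn arrow_schur_step(
    d: &[f64],
    b: &[f64],
    c: f64,
    g_t: &[f64],
    g_beta: f64,
    ridge_t: f64,
    ridge_beta: f64,
) -> Option<(Vec<f64>, f64)> {
    if d.len() != b.len() || d.len() != g_t.len() {
        return None;
    }
    let mut schur = c + ridge_beta; // border block before elimination
    let mut rhs = g_beta; // border right-hand side before elimination
    let mut pivots = Vec::with_capacity(d.len());
    for i in 0..d.len() {
        let pivot = d[i] + ridge_t;
        if pivot.is_nan() || pivot <= 0.0 {
            return None; // the scalar analogue of a failed Cholesky factor
        }
        schur -= b[i] * b[i] / pivot;
        rhs -= b[i] * g_t[i] / pivot;
        pivots.push(pivot);
    }
    if schur.is_nan() || schur <= 0.0 {
        return None;
    }
    let beta = rhs / schur;
    // Back-substitute the border solution into every row.
    let t: Vec<f64> = (0..d.len())
        .map(|i| (g_t[i] - b[i] * beta) / pivots[i])
        .collect();
    Some((t, beta))
}
```

The sign convention of the right-hand side is left to the caller; the sketch only shows the elimination order (rows first, border second, then back-substitution).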
pub fn cpu_reference_step( &self, ridge_t: f64, ridge_beta: f64, ) -> Result<DeviceResidentArrowStep, DeviceResidentArrowError>
CPU reference for parity harnesses. This path is explicit and is never
called from Self::one_inner_iteration.
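A parity harness of the kind mentioned above might compare the two steps entry by entry. The fields of DeviceResidentArrowStep are not shown on this page, so the sketch below assumes each step can be viewed as a flat `&[f64]`; `max_abs_diff` and `steps_agree` are hypothetical helper names, not part of this crate.

```rust
/// Hypothetical helper: largest absolute entry-wise difference.
/// Returns None when the lengths differ or any difference is NaN,
/// so a NaN on either side can never pass as "in agreement".
pub fn max_abs_diff(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    let mut worst = 0.0_f64;
    for (x, y) in a.iter().zip(b) {
        let diff = (x - y).abs();
        if diff.is_nan() {
            return None;
        }
        worst = worst.max(diff);
    }
    Some(worst)
}

/// Hypothetical parity check: true when the device and reference
/// vectors agree to within `tol` in the max-abs norm.
pub fn steps_agree(device: &[f64], reference: &[f64], tol: f64) -> bool {
    matches!(max_abs_diff(device, reference), Some(diff) if diff <= tol)
}
```

The tolerance is the caller's choice; this page does not state which one the actual harness uses.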
pub fn to_arrow_system(&self) -> ArrowSchurSystem
pub fn device_fit( &self, opts: &DeviceResidentInnerOptions, ) -> Result<DeviceResidentInnerOutcome, DeviceResidentArrowError>
Run the full device-resident inner Newton loop. Routes the per-iteration
arrow solve through the GPU path; returns Unavailable when CUDA did not
admit the resident workload (callers wanting a CPU path use
Self::cpu_reference_fit).
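Because the device path does not fall back on its own, a caller that wants a CPU path has to dispatch explicitly. The sketch below shows one way to do that with stand-in types: `FitError` and `fit_with_fallback` are hypothetical, and the exact shape of the Unavailable variant on DeviceResidentArrowError is not shown on this page.

```rust
/// Hypothetical stand-in for the error type; only `Unavailable` is
/// taken from the documentation above, `Other` is illustrative.
#[derive(Debug, PartialEq)]
pub enum FitError {
    Unavailable,
    Other(String),
}

/// Try the device path first; use the CPU path only when the device
/// reports `Unavailable`. The returned flag records which path ran
/// (true = device). Any other device error is propagated unchanged.
pub fn fit_with_fallback<T>(
    device: impl FnOnce() -> Result<T, FitError>,
    cpu: impl FnOnce() -> Result<T, FitError>,
) -> Result<(T, bool), FitError> {
    match device() {
        Ok(outcome) => Ok((outcome, true)),
        Err(FitError::Unavailable) => cpu().map(|outcome| (outcome, false)),
        Err(other) => Err(other),
    }
}
```

With a real workspace the two closures would presumably wrap `device_fit(&opts)` and `cpu_reference_fit(&opts)`. Propagating non-Unavailable errors rather than retrying on the CPU keeps genuine device failures visible.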
pub fn device_reupload_fit( &self, opts: &DeviceResidentInnerOptions, ) -> Result<DeviceResidentInnerOutcome, DeviceResidentArrowError>
The #1017 residency baseline: run the SAME inner Newton loop but compute
each per-iterate arrow step through solve_arrow_newton_step, which
re-packs/re-uploads D/B/g and re-runs the per-row POTRF + border
Schur factor on EVERY iterate. This is the “current re-uploading path”;
the bench reports the ratio between this path and Self::device_fit (resident) to isolate
the across-iteration residency speedup on one device, holding the host
control flow and the GPU factor kernels fixed.
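A bench of this kind can be sketched with plain wall-clock timing. The helpers below are hypothetical and do not reproduce the crate's bench. The direction of the ratio (re-upload time over resident time, so that values above 1 mean residency is faster) is an assumption of this sketch.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: run `run` `repeats` times (at least once)
/// and return the fastest wall-clock duration observed.
pub fn time_best_of<F: FnMut()>(repeats: usize, mut run: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..repeats.max(1) {
        let start = Instant::now();
        run();
        best = best.min(start.elapsed());
    }
    best
}

/// Hypothetical helper: speedup of the resident path over the
/// re-uploading baseline. Returns None when the resident time is
/// zero, since the ratio would be meaningless.
pub fn residency_speedup(reupload: Duration, resident: Duration) -> Option<f64> {
    if resident.is_zero() {
        return None;
    }
    Some(reupload.as_secs_f64() / resident.as_secs_f64())
}
```

In use, one closure would call `device_reupload_fit` and the other `device_fit` with the same options, so that only the residency behaviour differs between the two timings.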
pub fn cpu_reference_fit( &self, opts: &DeviceResidentInnerOptions, ) -> Result<DeviceResidentInnerOutcome, DeviceResidentArrowError>
CPU dense-reference inner loop. Bit-for-bit the same host arithmetic as
Self::device_fit except the per-iteration arrow solve uses the dense
reference factorisation; the parity harness asserts the two agree.
pub fn device_fit_outer_sequence( &self, base_gradient_overrides: &[(Vec<f64>, Vec<f64>)], opts: &DeviceResidentInnerOptions, ) -> Result<OuterSequenceOutcome, DeviceResidentArrowError>
Run a sequence of outer evaluations that SHARE one resident frame when the Hessian operator is unchanged across outers (#1017 deliverable 3).
Each entry of base_gradient_overrides is one outer evaluation’s base
gradient (g_t rows: n·d, g_β: p) — the only part of the bordered
quadratic that moves across outers at a frozen gate/basis frame. The
constant Hessian blocks ride the resident frame, which is built ONCE and
reused for every outer (frame builds are counted and returned so a caller
can assert the across-outer amortization actually fired: exactly one frame
build for an unchanged operator, regardless of how many outers run).
Returns one DeviceResidentInnerOutcome per outer plus the number of
resident-frame builds performed across the whole sweep. On a CPU-only host
it returns Unavailable (callers wanting a host path use
Self::cpu_reference_outer_sequence).
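The build-once accounting described above can be illustrated with a small stand-in. Nothing here is the crate's implementation: `FrameCache`, `run_outers`, the integer `operator_key`, and the squared-norm placeholder used in place of the inner solve are all hypothetical. Only the shape of the gradient overrides, one (g_t, g_β) pair per outer, comes from the signature above.

```rust
/// Hypothetical stand-in for a resident frame that is rebuilt only
/// when the operator it was built for changes.
pub struct FrameCache {
    key: Option<u64>,
    builds: usize,
}

impl FrameCache {
    pub fn new() -> Self {
        Self { key: None, builds: 0 }
    }

    /// Make sure a frame for `operator_key` exists, counting each build.
    pub fn ensure(&mut self, operator_key: u64) {
        if self.key != Some(operator_key) {
            self.key = Some(operator_key);
            self.builds += 1;
        }
    }

    pub fn builds(&self) -> usize {
        self.builds
    }
}

/// Run one placeholder evaluation per gradient override against a
/// shared frame and report the total number of frame builds so far.
pub fn run_outers(
    cache: &mut FrameCache,
    operator_key: u64,
    gradients: &[(Vec<f64>, Vec<f64>)],
) -> (Vec<f64>, usize) {
    let mut outcomes = Vec::with_capacity(gradients.len());
    for (g_t, g_beta) in gradients {
        cache.ensure(operator_key);
        // Placeholder for the inner solve: squared norm of the gradient.
        let value: f64 = g_t.iter().chain(g_beta.iter()).map(|v| v * v).sum();
        outcomes.push(value);
    }
    (outcomes, cache.builds())
}
```

A caller can then assert that the build count is exactly one after a sweep over an unchanged operator, which is the amortization check the returned count is meant to support.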
pub fn cpu_reference_outer_sequence( &self, base_gradient_overrides: &[(Vec<f64>, Vec<f64>)], opts: &DeviceResidentInnerOptions, ) -> Result<OuterSequenceOutcome, DeviceResidentArrowError>
CPU-reference outer sequence: same host control flow as
Self::device_fit_outer_sequence but the per-iterate arrow solve uses
the dense reference factorisation. The parity harness asserts the device
across-outer sweep agrees with this per-outer-independent reference.
Auto Trait Implementations§
impl Freeze for DeviceResidentArrowWorkspace
impl RefUnwindSafe for DeviceResidentArrowWorkspace
impl Send for DeviceResidentArrowWorkspace
impl Sync for DeviceResidentArrowWorkspace
impl Unpin for DeviceResidentArrowWorkspace
impl UnsafeUnpin for DeviceResidentArrowWorkspace
impl UnwindSafe for DeviceResidentArrowWorkspace
Blanket Implementations§
impl<T> Allocation for T
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> DistributionExt for T where T: ?Sized
impl<T, U> Imply<T> for U
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<T> Read<Exclusive, BecauseExclusive> for T where T: ?Sized
impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.