Struct PreparedGroupedGemm

pub struct PreparedGroupedGemm<'a, T>
where
    T: Element,
{ /* private fields */ }

A GroupedGemmPlan bound to a concrete set of per-group problems.

Owns a PinnedBuffer<u8> holding the packed metadata (problem sizes, pointer arrays, leading dimensions). Because that buffer is pinned host memory, the host-to-device (H2D) copy issued inside run is truly asynchronous and can therefore be captured safely into a CUDA graph. Owns no device memory; the caller supplies it via Workspace::Borrowed at run time.
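To make "packed metadata" concrete, here is a minimal sketch of packing the three kinds of data into one contiguous byte buffer, so that a single copy can move all of it. The real layout is private to this type and is not documented here; GroupMeta, Offsets, pack_metadata and the field order are all hypothetical, and a plain Vec<u8> stands in for PinnedBuffer<u8>.

```rust
/// Hypothetical per-group description; names and widths are illustrative only.
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct GroupMeta {
    m: i32,
    n: i32,
    k: i32,
    a_ptr: u64, // raw device address, stored as an integer
    lda: i64,   // leading dimension
}

/// Byte offsets of each section, relative to the start of the buffer.
#[allow(dead_code)]
struct Offsets {
    sizes: usize,
    ptrs: usize,
    lds: usize,
}

/// Packs problem sizes, then pointers, then leading dimensions.
/// This ordering is an assumption made for the sketch.
#[allow(dead_code)]
fn pack_metadata(groups: &[GroupMeta]) -> (Vec<u8>, Offsets) {
    let mut buf: Vec<u8> = Vec::new();

    // Section 1: (m, n, k) triples, 12 bytes per group.
    let sizes = buf.len();
    for g in groups {
        for d in [g.m, g.n, g.k] {
            buf.extend_from_slice(&d.to_ne_bytes());
        }
    }

    // Pad so the 8-byte entries that follow start at an offset divisible by 8.
    while buf.len() % 8 != 0 {
        buf.push(0);
    }

    // Section 2: one 8-byte pointer per group.
    let ptrs = buf.len();
    for g in groups {
        buf.extend_from_slice(&g.a_ptr.to_ne_bytes());
    }

    // Section 3: one 8-byte leading dimension per group.
    let lds = buf.len();
    for g in groups {
        buf.extend_from_slice(&g.lda.to_ne_bytes());
    }

    (buf, Offsets { sizes, ptrs, lds })
}
```

Keeping everything in one buffer means one allocation and one transfer, and the recorded offsets let the device-side arrays be located within the copied block.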

§Lifetime contract

PreparedGroupedGemm extracts raw device pointers from the input GroupedProblem slice during prepare and stores them in pinned memory; it does not hold a Rust borrow on the input buffers afterwards. This is required for stream capture: the captured graph references the pinned buffer (for the metadata H2D copy) and the device buffers (via the pointer arrays) by raw address, not by Rust lifetime, so the borrow checker cannot enforce their validity. The caller must therefore keep both this PreparedGroupedGemm and the underlying device buffers alive for as long as any captured graph that references them is in use.

In practice the pattern is: build groups, call prepare, capture into a graph, then keep PreparedGroupedGemm plus the input/output device buffers alive for the lifetime of the captured graph.
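One way to uphold this contract is to move everything the graph refers to into a single owning struct, so nothing can be dropped early by accident. The sketch below uses only stand-in types: FakeDeviceBuffer, FakePrepared, FakeGraph, GraphBundle and build_bundle are hypothetical and are not part of this crate's API. It shows why the compiler permits moving the buffers after prepare (no borrow remains) and how bundling restores the guarantee by hand.

```rust
/// Stand-in for a device allocation; a host Vec plays that role here.
#[allow(dead_code)]
struct FakeDeviceBuffer {
    data: Vec<f32>,
}

/// Stand-in for PreparedGroupedGemm: it records raw addresses only,
/// so the borrow checker cannot see that it depends on the buffers.
#[allow(dead_code)]
struct FakePrepared {
    pointers: Vec<usize>,
}

#[allow(dead_code)]
impl FakePrepared {
    /// Mirrors the documented behavior: extract raw addresses during
    /// prepare and release the Rust borrow when the call returns.
    fn prepare(buffers: &[FakeDeviceBuffer]) -> Self {
        let pointers = buffers
            .iter()
            .map(|b| b.data.as_ptr() as usize)
            .collect();
        FakePrepared { pointers }
    }
}

/// Stand-in for a captured CUDA graph.
#[allow(dead_code)]
struct FakeGraph;

/// Owns everything the graph refers to by raw address.
/// Fields drop in declaration order, so the graph is dropped first.
#[allow(dead_code)]
struct GraphBundle {
    graph: FakeGraph,
    prepared: FakePrepared,
    buffers: Vec<FakeDeviceBuffer>,
}

#[allow(dead_code)]
fn build_bundle() -> GraphBundle {
    let buffers = vec![
        FakeDeviceBuffer { data: vec![0.0; 16] },
        FakeDeviceBuffer { data: vec![0.0; 32] },
    ];
    let prepared = FakePrepared::prepare(&buffers);
    // `buffers` is no longer borrowed, so this move compiles. Moving the
    // outer Vec does not relocate each buffer's heap storage, so the
    // recorded addresses stay valid for as long as the bundle lives.
    GraphBundle { graph: FakeGraph, prepared, buffers }
}
```

Declaring the graph field first means it is destroyed before the prepared metadata and the buffers it points at, which matches the required order.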

Implementations§

impl<'a, T> PreparedGroupedGemm<'a, T>
where
    T: Element,

pub fn workspace_size(&self) -> usize

pub fn sku(&self) -> GemmSku

pub fn group_count(&self) -> usize

pub fn run( &self, stream: &Stream, workspace: Workspace<'_>, ) -> Result<(), Error>

Trait Implementations§

impl<'a, T> Debug for PreparedGroupedGemm<'a, T>
where
    T: Debug + Element,

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Auto Trait Implementations§

impl<'a, T> Freeze for PreparedGroupedGemm<'a, T>

impl<'a, T> RefUnwindSafe for PreparedGroupedGemm<'a, T>
where
    T: RefUnwindSafe,

impl<'a, T> Send for PreparedGroupedGemm<'a, T>
where
    T: Send + Sync,

impl<'a, T> Sync for PreparedGroupedGemm<'a, T>
where
    T: Sync,

impl<'a, T> Unpin for PreparedGroupedGemm<'a, T>
where
    T: Unpin,

impl<'a, T> UnsafeUnpin for PreparedGroupedGemm<'a, T>

impl<'a, T> UnwindSafe for PreparedGroupedGemm<'a, T>
where
    T: UnwindSafe + RefUnwindSafe,

Blanket Implementations§

impl<T> Any for T
where
    T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where
    T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T
where
    U: From<T>,

fn into(self) -> U

impl<T, U> TryFrom<U> for T
where
    U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T
where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
