pub struct DequantizePerGroupPlan<TIn: Element, TOut: IntElement> { /* private fields */ }
Plan for the dequantize_per_group operation.
x[..., g] = scale[outer, g] * (q[..., g] - zero_point[outer, g])
where g is the group index along the rightmost (quantized) axis; all elements of a group share one scale and one zero point.
Inverse of QuantizePerGroupPlan.
When to use: recovering floating-point values from INT4/INT8 group-quantized weight blobs
(GPTQ / AWQ / GGML). Pair with
DequantizePerGroupBackwardPlan.
Dtypes: input int {s8, u8}; output FP {f32, f64, f16, bf16}.
Shape limits: rank-2 [outer, axis_size] with
axis_size % group_size == 0.
Workspace: none.
Precision guarantee: deterministic, bit-stable.
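The formula and shape limits above can be illustrated with a plain CPU reference. This is a minimal sketch, not the crate's kernel or API: the function `dequantize_per_group_ref` and its parameters are hypothetical, and it assumes row-major storage, contiguous groups of `group_size` elements along the rightmost axis (so `g = j / group_size`), `scale` and `zero_point` shaped `[outer, axis_size / group_size]`, and a zero point stored in the same integer type as the input.

```rust
/// Hypothetical CPU reference for per-group dequantization (illustration only).
/// `q` is row-major [outer, axis_size]; `scale` and `zero_point` are row-major
/// [outer, axis_size / group_size]. These layouts are assumptions.
fn dequantize_per_group_ref(
    q: &[i8],
    scale: &[f32],
    zero_point: &[i8],
    outer: usize,
    axis_size: usize,
    group_size: usize,
) -> Result<Vec<f32>, String> {
    // Shape limit from the docs: axis_size % group_size == 0.
    if group_size == 0 || axis_size % group_size != 0 {
        return Err(format!(
            "axis_size {axis_size} is not a multiple of group_size {group_size}"
        ));
    }
    let groups = axis_size / group_size;
    if q.len() != outer * axis_size
        || scale.len() != outer * groups
        || zero_point.len() != outer * groups
    {
        return Err("buffer length does not match the given shape".to_string());
    }

    let mut x = vec![0.0f32; q.len()];
    for o in 0..outer {
        for j in 0..axis_size {
            // Assumed mapping from element index to group index.
            let g = j / group_size;
            let s = scale[o * groups + g];
            // Subtract in f32 so the difference cannot overflow i8.
            let zp = zero_point[o * groups + g] as f32;
            x[o * axis_size + j] = s * (q[o * axis_size + j] as f32 - zp);
        }
    }
    Ok(x)
}
```

For one row `q = [1, 2, 3, 4]` with `group_size = 2`, `scale = [0.5, 2.0]` and `zero_point = [1, 0]`, the first group dequantizes to `0.5 * (q - 1)` and the second to `2.0 * q`. Because each output depends only on its own input, scale, and zero point, a reference like this is order-independent, which is consistent with the deterministic guarantee stated above.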
Implementations
impl<TIn: Element, TOut: IntElement> DequantizePerGroupPlan<TIn, TOut>
pub fn select(
    _stream: &Stream,
    desc: &DequantizePerGroupDescriptor,
    _pref: PlanPreference,
) -> Result<Self>
Pick a kernel for desc.
pub fn can_implement(
    &self,
    args: &DequantizePerGroupArgs<'_, TIn, TOut>,
) -> Result<()>
Validate args.
pub fn workspace_size(&self) -> usize
Workspace bytes — none.
pub fn precision_guarantee(&self) -> PrecisionGuarantee
Numerical guarantees.
Auto Trait Implementations
impl<TIn, TOut> Freeze for DequantizePerGroupPlan<TIn, TOut>
impl<TIn, TOut> RefUnwindSafe for DequantizePerGroupPlan<TIn, TOut>
where
    TIn: RefUnwindSafe,
    TOut: RefUnwindSafe,
impl<TIn, TOut> Send for DequantizePerGroupPlan<TIn, TOut>
impl<TIn, TOut> Sync for DequantizePerGroupPlan<TIn, TOut>
impl<TIn, TOut> Unpin for DequantizePerGroupPlan<TIn, TOut>
impl<TIn, TOut> UnsafeUnpin for DequantizePerGroupPlan<TIn, TOut>
impl<TIn, TOut> UnwindSafe for DequantizePerGroupPlan<TIn, TOut>
where
    TIn: UnwindSafe,
    TOut: UnwindSafe,
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.