Struct IKVCacheUpdateLayer 

pub struct IKVCacheUpdateLayer { /* private fields */ }

IKVCacheUpdateLayer

Layer that represents a KVCacheUpdate operation.

The KVCacheUpdate layer is used to cache the key or value tensors for the attention mechanism. K and V use separate KVCacheUpdate layers.

An IKVCacheUpdateLayer has three inputs (cache, update, writeIndices) and one output. In kLINEAR mode, for each batch element i, the layer copies the update tensor into the cache starting at position writeIndices[i]. Assuming no out-of-bounds writes occur, the operation for each sequence position s in [0, sequenceLength) is:

output[i, :, writeIndices[i] + s, :] = update[i, :, s, :]

The layer updates the cache tensor in place, so the output tensor and the cache input tensor must share the same device memory address.
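The kLINEAR update rule above can be modeled on the host with plain slices. The following is a minimal reference sketch, not the TensorRT implementation: the function name `kv_cache_update_linear`, the flat row-major layout, and the `capacity` parameter (the cache's sequence dimension) are assumptions made for illustration. The sketch rejects out-of-bounds writes with an error, a choice of its own; the documentation above only describes the in-bounds case.

```rust
/// Host-side reference model of a kLINEAR cache update (illustrative only).
///
/// Assumed layouts, both flat and row-major:
///   cache:  [batch, heads, capacity, dim]
///   update: [batch, heads, seq_len, dim]
/// `write_indices[i]` is the first cache position written for batch element i.
pub fn kv_cache_update_linear(
    cache: &mut [f32],
    update: &[f32],
    write_indices: &[usize],
    heads: usize,
    capacity: usize,
    seq_len: usize,
    dim: usize,
) -> Result<(), String> {
    let batch = write_indices.len();
    if cache.len() != batch * heads * capacity * dim {
        return Err("cache length does not match [batch, heads, capacity, dim]".into());
    }
    if update.len() != batch * heads * seq_len * dim {
        return Err("update length does not match [batch, heads, seq_len, dim]".into());
    }
    // Validate every batch element before touching the cache, so a failed
    // call leaves the cache unchanged.
    for (i, &start) in write_indices.iter().enumerate() {
        if start + seq_len > capacity {
            return Err(format!(
                "batch {i}: write at {start} with {seq_len} tokens exceeds capacity {capacity}"
            ));
        }
    }
    for (i, &start) in write_indices.iter().enumerate() {
        for h in 0..heads {
            // For a fixed (i, h) the seq_len rows are contiguous in both
            // buffers, so one slice copy implements
            // output[i, h, start + s, :] = update[i, h, s, :] for all s.
            let src = (i * heads + h) * seq_len * dim;
            let dst = ((i * heads + h) * capacity + start) * dim;
            cache[dst..dst + seq_len * dim].copy_from_slice(&update[src..src + seq_len * dim]);
        }
    }
    Ok(())
}
```

Because the function mutates `cache` directly and returns no separate output buffer, it mirrors the in-place behavior described above: the "output" is the cache itself.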

Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.

Implementations§

impl IKVCacheUpdateLayer

pub fn setCacheMode( self: Pin<&mut IKVCacheUpdateLayer>, cacheMode: KVCacheMode, ) -> bool

Set the mode of the KVCacheUpdate layer.

  • cacheMode The mode of the KVCacheUpdate layer. For TensorRT 10.15, only kLINEAR mode is supported.

Returns true if the cache mode is set successfully, false otherwise.

pub fn getCacheMode(self: &IKVCacheUpdateLayer) -> KVCacheMode

Get the mode of the KVCacheUpdate layer.

Returns the mode of the KVCacheUpdate layer.

pub fn setUpdateForm( self: Pin<&mut IKVCacheUpdateLayer>, form: AttentionIOForm, ) -> bool

Set the update form.

The default is kPADDED_BHND. When the form is set to kPACKED_NHD, the update tensor has shape [totalTokens, numHeads, dimHead] instead of [batchSize, numHeads, numTokens, dimHead], and the updateLengths tensor must be provided.

  • form The update form.

Returns true if the update form is set successfully, false otherwise.

See getUpdateForm() and AttentionIOForm.
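To make the shape difference between the two forms concrete, the sketch below repacks a padded [batchSize, numHeads, numTokens, dimHead] buffer into a packed [totalTokens, numHeads, dimHead] buffer on the host. It rests on assumptions not confirmed by this page: that both buffers are flat and row-major, and that the packed form concatenates each batch element's valid tokens in batch order. The function name `pack_bhnd_to_nhd` and its parameters are hypothetical.

```rust
/// Illustrative repacking from a padded BHND layout to a packed NHD layout.
///
/// `padded` is assumed to be flat row-major [batch, heads, max_tokens, dim].
/// `tokens_per_batch[i]` is the number of valid tokens for batch element i.
/// The result is flat row-major [total_tokens, heads, dim], with the tokens of
/// batch 0 first, then batch 1, and so on (an assumed ordering).
pub fn pack_bhnd_to_nhd(
    padded: &[f32],
    tokens_per_batch: &[usize],
    heads: usize,
    max_tokens: usize,
    dim: usize,
) -> Vec<f32> {
    assert_eq!(
        padded.len(),
        tokens_per_batch.len() * heads * max_tokens * dim,
        "padded length does not match [batch, heads, max_tokens, dim]"
    );
    let total_tokens: usize = tokens_per_batch.iter().sum();
    let mut packed = vec![0.0f32; total_tokens * heads * dim];
    let mut offset = 0; // index of this batch element's first packed token
    for (i, &len) in tokens_per_batch.iter().enumerate() {
        assert!(len <= max_tokens, "batch {i} has more tokens than max_tokens");
        for t in 0..len {
            for h in 0..heads {
                // padded[i, h, t, :] -> packed[offset + t, h, :]
                let src = ((i * heads + h) * max_tokens + t) * dim;
                let dst = ((offset + t) * heads + h) * dim;
                packed[dst..dst + dim].copy_from_slice(&padded[src..src + dim]);
            }
        }
        offset += len;
    }
    packed
}
```

Padding positions (token indices at or beyond a batch element's valid length) are simply dropped, which is why the packed form needs a separate lengths tensor to recover the batch boundaries.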

pub fn getUpdateForm(self: &IKVCacheUpdateLayer) -> AttentionIOForm

Get the update form.

Returns the update form.

See setUpdateForm() and AttentionIOForm.

pub unsafe fn setUpdateLengths( self: Pin<&mut IKVCacheUpdateLayer>, lengths: *mut ITensor, ) -> bool

Set the update lengths tensor.

Only valid when the update form is kPACKED_NHD, in which case it must be set. The tensor holds cumulative token counts and has shape [batchSize + 1]. The first element must be 0 and the last element must equal totalTokens. The number of tokens for batch i is lengths[i + 1] - lengths[i].

Providing a first element that is not 0 results in undefined behavior.

  • lengths A 1D tensor of type kINT32 with shape [batchSize + 1], or nullptr to clear a previously set update lengths tensor.

Returns true if the update lengths tensor is set successfully, false otherwise.

See getUpdateLengths().
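The cumulative layout of the lengths tensor can be sketched on the host as follows. This only illustrates the values such a tensor would hold; it does not create an ITensor, and the helper names `cumulative_update_lengths` and `tokens_for_batch` are hypothetical. The values are `i32` to match the kINT32 type stated above.

```rust
/// Builds the cumulative token counts [0, n0, n0 + n1, ...] from per-batch
/// token counts. The result has `tokens_per_batch.len() + 1` elements; the
/// first is always 0 and the last is the total token count.
/// Returns None on a negative count or on i32 overflow.
pub fn cumulative_update_lengths(tokens_per_batch: &[i32]) -> Option<Vec<i32>> {
    let mut lengths = Vec::with_capacity(tokens_per_batch.len() + 1);
    let mut total: i32 = 0;
    lengths.push(total);
    for &n in tokens_per_batch {
        if n < 0 {
            return None;
        }
        total = total.checked_add(n)?;
        lengths.push(total);
    }
    Some(lengths)
}

/// Recovers the token count of batch element `i` from the cumulative form,
/// using the relation lengths[i + 1] - lengths[i].
pub fn tokens_for_batch(lengths: &[i32], i: usize) -> i32 {
    lengths[i + 1] - lengths[i]
}
```

For per-batch counts of 3, 1, and 4 tokens, the cumulative form is [0, 3, 4, 8], so totalTokens is 8 and batch 2 contributes 8 - 4 = 4 tokens.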

pub fn getUpdateLengths(self: &IKVCacheUpdateLayer) -> *mut ITensor

Get the update lengths tensor.

Returns the update lengths tensor, or nullptr if not set.

See setUpdateLengths().

Trait Implementations§

impl AsLayer for IKVCacheUpdateLayer

fn as_layer(&self) -> &ILayer

fn as_layer_pin_mut(&mut self) -> Pin<&mut ILayer>

impl AsLayerTyped for IKVCacheUpdateLayer

const TYPE: LayerType = LayerType::kKVCACHE_UPDATE

impl AsRef<ILayer> for IKVCacheUpdateLayer

fn as_ref(self: &IKVCacheUpdateLayer) -> &ILayer

Converts this type into a shared reference of the (usually inferred) input type.

impl ExternType for IKVCacheUpdateLayer

type Id = (n, v, i, n, f, e, r, _1, (), I, K, V, C, a, c, h, e, U, p, d, a, t, e, L, a, y, e, r)

A type-level representation of the type’s C++ namespace and type name.

type Kind = Opaque

impl MakeCppStorage for IKVCacheUpdateLayer

unsafe fn allocate_uninitialized_cpp_storage() -> *mut IKVCacheUpdateLayer

Allocates heap space for this type in C++ and returns a pointer to that space, but does not initialize it (i.e. does not yet call a constructor).

unsafe fn free_uninitialized_cpp_storage(arg0: *mut IKVCacheUpdateLayer)

Frees a C++ allocation which has not yet had a constructor called.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.