pub struct IKVCacheUpdateLayer { /* private fields */ }
IKVCacheUpdateLayer
Layer that represents a KVCacheUpdate operation.
The KVCacheUpdate layer is used to cache the key or value tensors for the attention mechanism. K and V use separate KVCacheUpdate layers.
An IKVCacheUpdateLayer has three inputs (cache, update, writeIndices) and one output.
In kLINEAR mode, for each batch element i, the layer copies the update tensor into the cache starting at
position writeIndices[i]. Assuming no out-of-bounds writes occur, the operation for each sequence position
s in [0, sequenceLength) is:
output[i, :, writeIndices[i] + s, :] = update[i, :, s, :]
The layer updates the cache tensor in place, so the output and the cache input must share the same device memory address.
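The kLINEAR copy rule above can be modeled on the host with a few nested loops. The following is a minimal sketch for illustration only, not the layer's implementation: it assumes a row-major 4-D layout `[batch, heads, max_seq, head_dim]` flattened into a slice, and the function name `kv_cache_update_linear` and the dimension names `heads`, `max_seq` and `head_dim` are hypothetical. The documentation only describes the case where no out-of-bounds writes occur; returning an error for an out-of-bounds write is a choice made for this sketch.

```rust
/// Host-side reference model of a kLINEAR KV-cache update.
/// Hypothetical helper for illustration; it is not part of the binding.
///
/// Assumed layouts (row-major, flattened):
///   cache:  [batch, heads, max_seq, head_dim]
///   update: [batch, heads, seq_len, head_dim]
/// `write_indices` holds one start position per batch element.
fn kv_cache_update_linear(
    cache: &mut [f32],
    update: &[f32],
    write_indices: &[usize],
    heads: usize,
    max_seq: usize,
    seq_len: usize,
    head_dim: usize,
) -> Result<(), String> {
    let batch = write_indices.len();
    if cache.len() != batch * heads * max_seq * head_dim {
        return Err("cache length does not match its shape".to_string());
    }
    if update.len() != batch * heads * seq_len * head_dim {
        return Err("update length does not match its shape".to_string());
    }
    // Validate every start position first, so a failure leaves the cache untouched.
    // Rejecting out-of-bounds writes is an assumption of this sketch.
    for (i, &start) in write_indices.iter().enumerate() {
        let in_bounds = start
            .checked_add(seq_len)
            .map_or(false, |end| end <= max_seq);
        if !in_bounds {
            return Err(format!("batch element {} would write out of bounds", i));
        }
    }
    // output[i, :, write_indices[i] + s, :] = update[i, :, s, :]
    for (i, &start) in write_indices.iter().enumerate() {
        for h in 0..heads {
            for s in 0..seq_len {
                let src = ((i * heads + h) * seq_len + s) * head_dim;
                let dst = ((i * heads + h) * max_seq + start + s) * head_dim;
                cache[dst..dst + head_dim].copy_from_slice(&update[src..src + head_dim]);
            }
        }
    }
    Ok(())
}
```

The model writes into `cache` directly, mirroring the in-place behavior described above: after the call, `cache` plays the role of the layer's output.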
Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.
Implementations
impl IKVCacheUpdateLayer
pub fn setCacheMode(
    self: Pin<&mut IKVCacheUpdateLayer>,
    cacheMode: KVCacheMode,
) -> bool
Set the mode of the KVCacheUpdate layer.
cacheMode: The mode of the KVCacheUpdate layer. For TensorRT 10.15, only kLINEAR mode is supported.
Returns: True if the cache mode is set successfully, false otherwise.
pub fn getCacheMode(self: &IKVCacheUpdateLayer) -> KVCacheMode
Get the mode of the KVCacheUpdate layer.
Returns: The mode of the KVCacheUpdate layer.
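The two signatures above differ in their receivers: the setter takes `Pin<&mut Self>`, while the getter takes a plain shared reference. The sketch below shows that calling pattern and a check of the `bool` result. It uses a stand-in type, because this page does not show how a real layer is obtained: `MockKvCacheUpdateLayer`, the local `KVCacheMode` enum and `configure_linear` are all hypothetical and only mirror the documented signatures.

```rust
use std::pin::Pin;

// Stand-in for the binding's enum; only kLINEAR is mentioned in the docs.
#[allow(non_camel_case_types, dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum KVCacheMode {
    kLINEAR,
}

// Stand-in for IKVCacheUpdateLayer. The real type is opaque and is not
// constructed by user code; this mock exists only to show the receivers.
struct MockKvCacheUpdateLayer {
    mode: KVCacheMode,
}

#[allow(non_snake_case)]
impl MockKvCacheUpdateLayer {
    // Same shape as the documented setter: pinned mutable receiver, bool result.
    fn setCacheMode(mut self: Pin<&mut Self>, cacheMode: KVCacheMode) -> bool {
        self.mode = cacheMode;
        true
    }

    // Same shape as the documented getter: shared receiver.
    fn getCacheMode(&self) -> KVCacheMode {
        self.mode
    }
}

// Sets kLINEAR mode, turns a `false` result into an error, then reads the mode back.
fn configure_linear(
    mut layer: Pin<&mut MockKvCacheUpdateLayer>,
) -> Result<KVCacheMode, String> {
    // `as_mut()` reborrows the pin so `layer` stays usable after the call.
    if !layer.as_mut().setCacheMode(KVCacheMode::kLINEAR) {
        return Err("setCacheMode rejected kLINEAR".to_string());
    }
    Ok(layer.getCacheMode())
}
```

Because the mock is `Unpin`, a pinned reference can be made with `Pin::new(&mut layer)`. With the real opaque type, the pinned reference would come from whatever handle the binding returns.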