pub struct ExecutorAttentionConfig {
    pub attention_type: AttentionType,
    pub enable_flash_attention: bool,
    pub enable_paged_attention: bool,
    pub block_size: Option<usize>,
    pub sliding_window_size: Option<usize>,
}
Runtime attention configuration for the model executor.
Note: This is different from ferrum_types::AttentionConfig, which describes the model architecture’s attention configuration as read from config.json. This type describes the runtime execution settings.
Fields
attention_type: AttentionType
    Type of attention to use
enable_flash_attention: bool
    Enable flash attention if available
enable_paged_attention: bool
    Enable paged attention
block_size: Option<usize>
    Block size for paged attention
sliding_window_size: Option<usize>
    Sliding window size (if using sliding window attention)
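The fields above can be filled in with a plain struct literal. The following is a minimal sketch rather than the crate's own code: it declares local stand-ins for `ExecutorAttentionConfig` and `AttentionType` so that it compiles on its own. The variants of `AttentionType` are not listed on this page, so `Standard` and `SlidingWindow` are hypothetical, as are the helpers `check_config` and `example_paged_config` and the block size of 16. The check encodes an assumption suggested by the field descriptions, namely that each optional field should be set when its feature is in use; the crate itself may not enforce this.

```rust
// Local stand-ins so the sketch is self-contained; the real types live in the crate.
// The variants below are hypothetical: this page does not list them.
#[derive(Clone, Debug, PartialEq)]
pub enum AttentionType {
    Standard,      // hypothetical variant
    SlidingWindow, // hypothetical variant
}

// Mirror of the struct documented on this page (same field names and types).
#[derive(Clone, Debug)]
pub struct ExecutorAttentionConfig {
    pub attention_type: AttentionType,
    pub enable_flash_attention: bool,
    pub enable_paged_attention: bool,
    pub block_size: Option<usize>,
    pub sliding_window_size: Option<usize>,
}

/// Hypothetical helper (not part of the crate): reports an error when an
/// optional field is missing for a feature that is switched on.
pub fn check_config(cfg: &ExecutorAttentionConfig) -> Result<(), String> {
    if cfg.enable_paged_attention && cfg.block_size.is_none() {
        return Err("paged attention is enabled but block_size is None".to_string());
    }
    if cfg.attention_type == AttentionType::SlidingWindow
        && cfg.sliding_window_size.is_none()
    {
        return Err("sliding window attention selected but sliding_window_size is None".to_string());
    }
    Ok(())
}

/// Builds a paged-attention configuration. The block size of 16 is an
/// arbitrary example value, not a documented default.
pub fn example_paged_config() -> ExecutorAttentionConfig {
    ExecutorAttentionConfig {
        attention_type: AttentionType::Standard,
        enable_flash_attention: true,
        enable_paged_attention: true,
        block_size: Some(16),
        sliding_window_size: None,
    }
}
```

Calling `check_config(&example_paged_config())` returns `Ok(())`, while the same configuration with `block_size` set to `None` returns an `Err`. Keeping the check in a separate function leaves the struct itself a plain data carrier, which matches how it is presented here.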
Trait Implementations
impl Clone for ExecutorAttentionConfig
    fn clone(&self) -> ExecutorAttentionConfig
        Returns a duplicate of the value.
    1.0.0 · fn clone_from(&mut self, source: &Self)
        Performs copy-assignment from source.
impl Debug for ExecutorAttentionConfig
impl<'de> Deserialize<'de> for ExecutorAttentionConfig
    fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where __D: Deserializer<'de>
        Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl Freeze for ExecutorAttentionConfig
impl RefUnwindSafe for ExecutorAttentionConfig
impl Send for ExecutorAttentionConfig
impl Sync for ExecutorAttentionConfig
impl Unpin for ExecutorAttentionConfig
impl UnsafeUnpin for ExecutorAttentionConfig
impl UnwindSafe for ExecutorAttentionConfig
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
    fn borrow_mut(&mut self) -> &mut T
        Mutably borrows from an owned value.