Function parallel_attention_compute 

pub async fn parallel_attention_compute(
    config: ParallelConfig,
    queries: Vec<Float32Array>,
    keys: Vec<Vec<Float32Array>>,
    values: Vec<Vec<Float32Array>>,
) -> Result<BatchResult>
Computes attention for multiple queries in parallel.
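
The following is a minimal sketch of the general technique, not this crate's implementation. It assumes scaled dot-product attention, uses plain `Vec<f32>` in place of `Float32Array`, and assumes that `keys[i]` and `values[i]` belong to `queries[i]` (a guess based on the nested `Vec` types in the signature). `ParallelConfig` and `BatchResult` are not modeled, and the helper names `attend` and `attend_all` are hypothetical.

```rust
use std::thread;

/// Scaled dot-product attention for one query over its own keys and values.
/// (Hypothetical helper; not part of the documented API.)
fn attend(query: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>]) -> Vec<f32> {
    let scale = (query.len() as f32).sqrt();

    // Raw scores: dot(query, key) / sqrt(d).
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| query.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() / scale)
        .collect();

    // Numerically stable softmax: subtract the max before exponentiating.
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let total: f32 = exps.iter().sum();

    // Output is the softmax-weighted sum of the value vectors.
    let dim = values.first().map_or(0, |v| v.len());
    let mut out = vec![0.0; dim];
    for (w, v) in exps.iter().zip(values) {
        for (o, x) in out.iter_mut().zip(v) {
            *o += (w / total) * x;
        }
    }
    out
}

/// Runs `attend` for every query on its own scoped thread and returns the
/// outputs in query order. (Hypothetical helper; assumes one key/value set
/// per query.)
fn attend_all(
    queries: &[Vec<f32>],
    keys: &[Vec<Vec<f32>>],
    values: &[Vec<Vec<f32>>],
) -> Vec<Vec<f32>> {
    assert_eq!(queries.len(), keys.len());
    assert_eq!(queries.len(), values.len());

    thread::scope(|s| {
        let handles: Vec<_> = queries
            .iter()
            .zip(keys)
            .zip(values)
            .map(|((q, k), v)| s.spawn(move || attend(q, k, v)))
            .collect();

        // Joining in spawn order keeps results aligned with the queries.
        handles
            .into_iter()
            .map(|h| h.join().expect("attention worker panicked"))
            .collect()
    })
}
```

Spawning one thread per query keeps the sketch short; for large batches a bounded worker pool would likely be a better fit, which may be the kind of setting a type like `ParallelConfig` controls.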