Module rope_multi

Multi-section Rotary Position Embedding with optional interleaved mode.

Used by Qwen3.5 / Qwen3.6 full-attention layers (ADR-013 Decision 10). Both MROPE (mode = 8) and IMROPE (mode = 40) share a kernel; only the sector-to-axis mapping differs.

§Spec (summary)

For every pair p ∈ [0, rope_dim/2):

  1. sector = p mod (s0 + s1 + s2 + s3)
  2. Pick axis based on mode:
    • Mrope: contiguous sections: sector ∈ [0, s0) → axis 0; [s0, s0 + s1) → axis 1; [s0 + s1, s0 + s1 + s2) → axis 2; otherwise axis 3.
    • Imrope: axes cycle with sector % 3: sector % 3 == 0 && sector < 3*s0 → axis 0; sector % 3 == 1 && sector < 3*s1 → axis 1; sector % 3 == 2 && sector < 3*s2 → axis 2; otherwise axis 3.
  3. theta = position[axis] * freq_base^(-2p/rope_dim)
  4. Rotate pair (x[p], x[p + head_dim/2]) by that angle (NeoX indexing).

Pairs p ≥ rope_dim/2 pass through unchanged (partial-rotary-factor).
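
The steps above can be sketched as a small CPU reference in Rust. This is an illustrative sketch that follows the summary as written, not the module's kernel: the names `Mode`, `pick_axis`, and `rope_multi_head` are hypothetical, and it assumes one head vector of length `head_dim` and a per-token array of four axis positions.

```rust
/// Hypothetical stand-in for the two mapping modes named in the spec.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Mrope,
    Imrope,
}

/// Steps 1-2: choose the position axis (0..=3) for pair index `p`.
fn pick_axis(p: usize, sections: [usize; 4], mode: Mode) -> usize {
    let total: usize = sections.iter().sum();
    assert!(total > 0, "at least one section must be non-empty");
    let sector = p % total;
    let [s0, s1, s2, _s3] = sections;
    match mode {
        // Contiguous ranges: [0, s0), [s0, s0+s1), [s0+s1, s0+s1+s2), rest.
        Mode::Mrope => {
            if sector < s0 {
                0
            } else if sector < s0 + s1 {
                1
            } else if sector < s0 + s1 + s2 {
                2
            } else {
                3
            }
        }
        // Interleaved: axis r = sector % 3 while sector < 3 * s_r, else axis 3.
        Mode::Imrope => {
            let r = sector % 3;
            if sector < 3 * sections[r] {
                r
            } else {
                3
            }
        }
    }
}

/// Steps 3-4 for one head vector `x` of one token (assumed layout).
/// Elements not touched by any pair p < rope_dim/2 are left unchanged.
fn rope_multi_head(
    x: &mut [f32],
    pos: [i32; 4],
    rope_dim: usize,
    freq_base: f32,
    sections: [usize; 4],
    mode: Mode,
) {
    let head_dim = x.len();
    assert!(head_dim % 2 == 0 && rope_dim <= head_dim);
    let half = head_dim / 2;
    for p in 0..rope_dim / 2 {
        let axis = pick_axis(p, sections, mode);
        // theta = position[axis] * freq_base^(-2p / rope_dim)
        let inv_freq = freq_base.powf(-2.0 * p as f32 / rope_dim as f32);
        let theta = pos[axis] as f32 * inv_freq;
        let (sin, cos) = theta.sin_cos();
        // Pair offset of head_dim/2, as stated in step 4.
        let (a, b) = (x[p], x[p + half]);
        x[p] = a * cos - b * sin;
        x[p + half] = a * sin + b * cos;
    }
}
```

With `sections = [2, 1, 1, 0]`, pairs 0 through 3 map to axes 0, 0, 1, 2 in `Mrope` mode and to axes 0, 1, 2, 0 in `Imrope` mode, which shows how the two modes differ only in the sector-to-axis mapping.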

§Positions layout

The positions buffer is an int32 array of length 4 * seq_len, laid out axis by axis: the first seq_len entries are the time-axis positions, the next seq_len are the height-axis positions, then width, then the extra axis. For Qwen3.5 text, all four axes are set to the token’s 1D position.
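
As a sketch of that layout, the following Rust helpers build the text-only buffer and index into it. The function names `text_positions` and `position_at` and the `start` offset parameter are hypothetical and are not part of this module.

```rust
/// Build an axis-major positions buffer for a text-only sequence:
/// [time x seq_len, height x seq_len, width x seq_len, extra x seq_len].
/// For text, every axis carries the token's 1D position.
fn text_positions(start: i32, seq_len: usize) -> Vec<i32> {
    let mut buf = Vec::with_capacity(4 * seq_len);
    for _axis in 0..4 {
        buf.extend((0..seq_len).map(|i| start + i as i32));
    }
    buf
}

/// Read the position of `token` on `axis` (0 = time, 1 = height,
/// 2 = width, 3 = extra) from an axis-major buffer.
fn position_at(buf: &[i32], seq_len: usize, axis: usize, token: usize) -> i32 {
    buf[axis * seq_len + token]
}
```

For example, `text_positions(5, 3)` yields `[5, 6, 7]` repeated four times, once per axis.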

Structs§

RopeMultiBufferPack
Pre-built triple of small parameter buffers for a rope_multi dispatch.
RopeMultiParams
Shape + config for a rope_multi dispatch.

Enums§

RopeMultiMode
MROPE variant. Wire-level values match the ggml GGML_ROPE_TYPE_* enum.
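
One way to keep wire-level values aligned with integer mode codes is an explicit-discriminant enum. The sketch below is illustrative only: `ModeSketch`, `from_wire`, and `to_wire` are hypothetical names, and the values 8 and 40 are the MROPE and IMROPE mode numbers quoted in the module description.

```rust
/// Hypothetical mirror of the two mode codes; not the crate's definition.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ModeSketch {
    Mrope = 8,
    Imrope = 40,
}

impl ModeSketch {
    /// Decode a wire-level integer, rejecting unknown codes.
    fn from_wire(value: u32) -> Option<Self> {
        match value {
            8 => Some(Self::Mrope),
            40 => Some(Self::Imrope),
            _ => None,
        }
    }

    /// Encode as the wire-level integer.
    fn to_wire(self) -> u32 {
        self as u32
    }
}
```

Returning `Option` from the decoder keeps unsupported mode codes from silently mapping onto one of the two supported variants.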

Statics§

ROPE_MULTI_SHADER_SOURCE

Functions§

build_rope_multi_buffers
Convenience: build all three small parameter buffers given a RopeMultiParams.
clear_rope_pack_cache
Clear the thread-local rope-multi pack cache.
dispatch_rope_multi
Dispatch a rope_multi operation.
dispatch_rope_multi_cached
Dispatch a rope_multi operation, reusing pre-built parameter buffers from the per-thread cache.
register
rope_pack_cache_len
Inspect the current pack-cache size — diagnostic only.
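
The per-thread pack cache described by the cached dispatch, clear, and length functions can be pictured with a generic thread-local map. This is a sketch of the pattern only, under the assumption that packs are keyed by their configuration; `PackKey`, `Pack`, `build_pack`, `get_or_build`, `cache_len`, and `clear_cache` are all hypothetical and do not reflect this module's actual types or signatures.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical cache key: the configuration a pack was built for.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct PackKey {
    rope_dim: u32,
    mode: u32,
    sections: [u32; 4],
}

/// Hypothetical stand-in for a set of small parameter buffers.
#[derive(Debug)]
struct Pack {
    params: Vec<u32>,
}

/// Hypothetical builder; a real one would allocate device buffers.
fn build_pack(key: PackKey) -> Pack {
    let mut params = vec![key.rope_dim, key.mode];
    params.extend_from_slice(&key.sections);
    Pack { params }
}

thread_local! {
    // One map per thread, so no locking is needed.
    static PACK_CACHE: RefCell<HashMap<PackKey, Rc<Pack>>> =
        RefCell::new(HashMap::new());
}

/// Return the cached pack for `key`, building it on first use.
fn get_or_build(key: PackKey) -> Rc<Pack> {
    PACK_CACHE.with(|cache| {
        let mut map = cache.borrow_mut();
        let pack = map.entry(key).or_insert_with(|| Rc::new(build_pack(key)));
        Rc::clone(pack)
    })
}

/// Number of packs cached on the current thread (diagnostic).
fn cache_len() -> usize {
    PACK_CACHE.with(|cache| cache.borrow().len())
}

/// Drop every pack cached on the current thread.
fn clear_cache() {
    PACK_CACHE.with(|cache| cache.borrow_mut().clear());
}
```

Because the map is thread-local, each thread builds and clears its own packs independently, which is why a length query of this kind is only meaningful as a per-thread diagnostic.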