pub struct RopeTrainParams {
    pub batch: u32,
    pub n_heads: u32,
    pub seq_len: u32,
    pub head_dim: u32,
    pub rope_dim: u32,
    pub theta_base: f32,
    pub sections: [u32; 4],
}
Shape + frequency parameters for a differentiable RoPE dispatch.
Non-IMROPE (plain NeoX RoPE) is expressed as sections = [head_dim/2, 0, 0, 0]
with mode = Imrope: all pairs fall into axis 0 (the text time axis), which is
the only axis used. Alternatively, callers can use rope_multi directly
with mode = Mrope and sections = [rope_dim/2, 0, 0, 0].
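As a minimal sketch of the single-axis layout described above, the following builds parameters for plain NeoX RoPE. The struct is a local mirror of the documented fields so the snippet compiles on its own (the Debug derive is added only for convenience), and plain_neox_params is a hypothetical helper, not part of the crate. It uses rope_dim/2, which coincides with head_dim/2 when the whole head is rotated.

```rust
/// Local mirror of the documented struct (assumption: same field layout).
#[derive(Clone, Debug)]
pub struct RopeTrainParams {
    pub batch: u32,
    pub n_heads: u32,
    pub seq_len: u32,
    pub head_dim: u32,
    pub rope_dim: u32,
    pub theta_base: f32,
    pub sections: [u32; 4],
}

/// Hypothetical helper: plain NeoX RoPE as a one-axis section layout.
/// Every rotated pair is assigned to axis 0; axes 1..3 stay empty.
pub fn plain_neox_params(
    batch: u32,
    n_heads: u32,
    seq_len: u32,
    head_dim: u32,
    rope_dim: u32,
    theta_base: f32,
) -> RopeTrainParams {
    RopeTrainParams {
        batch,
        n_heads,
        seq_len,
        head_dim,
        rope_dim,
        theta_base,
        sections: [rope_dim / 2, 0, 0, 0],
    }
}
```

The mode (Imrope or Mrope) is chosen separately at dispatch time and is not modeled here.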
Fields
batch: u32
n_heads: u32
Number of query/key heads.
seq_len: u32
head_dim: u32
Full head dimension (must be even).
rope_dim: u32
Number of dimensions that participate in rotation (≤ head_dim, even).
Pairs [rope_dim/2, head_dim/2) pass through unchanged.
theta_base: f32
Base frequency (theta). Qwen3.5/3.6: 1_000_000.0 = 1e6.
Note: the metal shader comment in test_rope_multi.rs line 347 uses
1e7; the Qwen3.5 model config uses rope_theta = 1_000_000 = 1e6.
The caller MUST pass the value that matches the model’s GGUF
<prefix>.rope.freq_base key.
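To illustrate why the base must match the model, here is a sketch of the per-pair inverse frequencies under the conventional RoPE schedule, freq_i = theta_base^(-2i / rope_dim) for pair index i in [0, rope_dim/2). That schedule and the inv_freqs helper are assumptions for illustration; the kernel's actual formula is not shown in this page.

```rust
/// Hypothetical helper: inverse frequency for each rotated pair,
/// assuming the conventional schedule theta_base^(-2i / rope_dim).
pub fn inv_freqs(theta_base: f32, rope_dim: u32) -> Vec<f32> {
    let n_pairs = rope_dim / 2;
    (0..n_pairs)
        .map(|i| theta_base.powf(-2.0 * i as f32 / rope_dim as f32))
        .collect()
}
```

Under this assumption, pair 0 always has frequency 1.0, while every later pair rotates more slowly with base 1e7 than with base 1e6, so passing the wrong base changes the rotation angle of every pair except the first.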
sections: [u32; 4]
Section counts [s0, s1, s2, s3] for IMROPE / MROPE.
Qwen3.5 / Qwen3.6: [11, 11, 10, 0] (IMROPE, matches
/opt/hf2q/src/inference/models/qwen35/mod.rs:235).
Sum s0+s1+s2+s3 should equal rope_dim / 2 for full rotary-section
coverage. The kernel tolerates sums smaller than rope_dim/2
(sectors wrap modulo the sum), but callers should pass the canonical
value from the model config.
For non-IMROPE plain NeoX: [rope_dim/2, 0, 0, 0] with MROPE mode
puts every pair in axis-0 (time).
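The constraints above can be sketched as a caller-side check. Everything here is hypothetical: check_shape, Coverage, and mrope_axis are illustrative names, the contiguous axis layout is an assumption for MROPE mode only (IMROPE assignment is not modeled), and rejecting a section sum larger than rope_dim/2 is a conservative choice, since that case is not documented.

```rust
/// Whether the sections cover every rotated pair or rely on wrapping.
#[derive(Debug, PartialEq)]
pub enum Coverage {
    Full,
    Wrapping,
}

/// Hypothetical pre-dispatch check of the documented shape constraints.
pub fn check_shape(head_dim: u32, rope_dim: u32, sections: [u32; 4]) -> Result<Coverage, String> {
    if head_dim % 2 != 0 {
        return Err("head_dim must be even".into());
    }
    if rope_dim % 2 != 0 || rope_dim > head_dim {
        return Err("rope_dim must be even and <= head_dim".into());
    }
    let total: u32 = sections.iter().sum();
    let pairs = rope_dim / 2;
    if total == 0 {
        return Err("sections must not sum to zero".into());
    }
    if total > pairs {
        // Not covered by the docs; rejected conservatively in this sketch.
        return Err("sections sum exceeds rope_dim / 2".into());
    }
    Ok(if total == pairs { Coverage::Full } else { Coverage::Wrapping })
}

/// Axis of a rotated pair, assuming contiguous sections (MROPE-style)
/// and the documented wrap of the sector index modulo the section sum.
pub fn mrope_axis(pair: u32, sections: [u32; 4]) -> Option<usize> {
    let total: u32 = sections.iter().sum();
    if total == 0 {
        return None;
    }
    let sector = pair % total;
    let mut end = 0;
    for (axis, &count) in sections.iter().enumerate() {
        end += count;
        if sector < end {
            return Some(axis);
        }
    }
    None
}
```

With sections = [rope_dim/2, 0, 0, 0] every pair maps to axis 0, which matches the plain NeoX case described above.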
Trait Implementations
impl Clone for RopeTrainParams
fn clone(&self) -> RopeTrainParams
fn clone_from(&mut self, source: &Self)