pub fn pack_vertical(values: &[u32; 128], bit_width: u8, output: &mut Vec<u8>)
Pack 128 integers using a true vertical, bit-interleaved layout.
The vertical layout stores bit i of all 128 integers together in 16 consecutive bytes (128 bits). Total size: exactly 128 * bit_width / 8 = 16 * bit_width bytes, with no padding.
This layout is well suited to SIMD unpacking: a single 16-byte (128-bit) load retrieves one bit position from all 128 integers simultaneously.
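
The layout described above can be illustrated with a minimal scalar sketch. This is not the crate's implementation: the names `pack_vertical_sketch` and `unpack_vertical_sketch` are hypothetical, and two details are assumptions that the description does not specify: integer j is placed at byte j / 8, bit j % 8 (least significant bit first) within each 16-byte plane, and the packed bytes are appended to `output` rather than overwriting it.

```rust
/// Hypothetical scalar reference for the vertical layout (not the crate's code).
/// Assumption: within each 16-byte plane, integer j maps to byte j / 8, bit j % 8.
/// Assumption: packed bytes are appended to `output`.
fn pack_vertical_sketch(values: &[u32; 128], bit_width: u8, output: &mut Vec<u8>) {
    assert!(bit_width <= 32, "a u32 has at most 32 bits");
    let start = output.len();
    // One 16-byte plane per bit position: 16 * bit_width bytes in total.
    output.resize(start + 16 * bit_width as usize, 0);
    for bit in 0..bit_width as usize {
        let plane = &mut output[start + 16 * bit..start + 16 * (bit + 1)];
        for (j, &v) in values.iter().enumerate() {
            // Extract bit `bit` of integer j and place it in the plane.
            let b = ((v >> bit) & 1) as u8;
            plane[j / 8] |= b << (j % 8);
        }
    }
}

/// Inverse of the sketch above: rebuilds the 128 integers plane by plane.
/// A SIMD version would load each 16-byte plane with a single instruction.
fn unpack_vertical_sketch(input: &[u8], bit_width: u8) -> [u32; 128] {
    assert!(bit_width <= 32, "a u32 has at most 32 bits");
    let mut values = [0u32; 128];
    for bit in 0..bit_width as usize {
        let plane = &input[16 * bit..16 * (bit + 1)];
        for (j, v) in values.iter_mut().enumerate() {
            let b = ((plane[j / 8] >> (j % 8)) & 1) as u32;
            *v |= b << bit;
        }
    }
    values
}
```

With `bit_width = 5`, for example, the sketch appends 16 * 5 = 80 bytes, and unpacking those bytes restores the original values as long as each input fits in 5 bits.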