MXFP4Block

Struct MXFP4Block 

pub struct MXFP4Block { /* private fields */ }

A compressed block of 32 F4E2M1 values with a shared E8M0 scale factor.

§Overview

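MXFP4Block implements the MXFP4 block format from the OCP Microscaling (MX) specification, designed for compact storage and efficient computation in machine learning applications. At 17 bytes per 32 values, a block is about 7.5x smaller than the same data in f32 (128 bytes) and just under 4x smaller than a 16-bit format (64 bytes). It achieves this by: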

  • Storing 32 4-bit F4E2M1 values in just 16 bytes
  • Using a single shared E8M0 scale factor for the entire block
  • Enabling vectorized operations on compressed data

§Format Details

The block consists of:

  • Data: 16 bytes containing 32 packed F4E2M1 values (2 per byte)
  • Scale: 1 byte E8M0 scale factor (power of two from 2^-127 to 2^127)

Total size: 17 bytes for 32 values (compared to 128 bytes for f32)
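
The size arithmetic above can be checked with a few constants. This is only a sketch of the bookkeeping; the constant and function names are illustrative and are not part of the float4 API.

```rust
// Illustrative names only; none of these are exported by float4.
const VALUES_PER_BLOCK: usize = 32;
const BITS_PER_VALUE: usize = 4;

/// Bytes used by the packed F4E2M1 payload: 32 values * 4 bits / 8.
const DATA_BYTES: usize = VALUES_PER_BLOCK * BITS_PER_VALUE / 8;

/// One E8M0 scale byte shared by the whole block.
const SCALE_BYTES: usize = 1;

/// Total bytes per block.
const BLOCK_BYTES: usize = DATA_BYTES + SCALE_BYTES;

/// Effective storage cost per value, including the shared scale.
fn bits_per_value() -> f64 {
    (BLOCK_BYTES * 8) as f64 / VALUES_PER_BLOCK as f64
}

/// Size of the same 32 values stored as f32, divided by the block size.
fn ratio_vs_f32() -> f64 {
    (VALUES_PER_BLOCK * std::mem::size_of::<f32>()) as f64 / BLOCK_BYTES as f64
}
```

Here `bits_per_value()` evaluates to 4.25 and `ratio_vs_f32()` to 128 / 17, or about 7.5.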

§Use Cases

MXFP4Block is particularly useful for:

  • Neural network weight compression
  • Activation storage in quantized models
  • Gradient accumulation in low-precision training
  • Memory-bandwidth limited applications

§Examples

§Creating from F4E2M1 values

use float4::{F4E2M1, E8M0, MXFP4Block};

// Create 32 F4E2M1 values
let mut values = [F4E2M1::from_f64(0.0); 32];
for i in 0..32 {
    values[i] = F4E2M1::from_f64((i as f64) * 0.1);
}

// Create scale factor
let scale = E8M0::from(1.0);

// Pack into block
let block = MXFP4Block::from_f32_slice(values, scale);

// Convert back to f32
let f32_values = block.to_f32_array();

§Quantizing f32 data

use float4::{F4E2M1, E8M0, MXFP4Block};

// Original f32 data
let f32_data = [1.5, 2.0, -0.5, 3.0, /* ... 28 more values ... */];

// Compute appropriate scale
let scale = E8M0::from_f32_slice(&f32_data[..]);
let scale_val = scale.to_f64();

// Quantize to F4E2M1
let mut quantized = [F4E2M1::from_f64(0.0); 32];
for i in 0..32 {
    if i < f32_data.len() {
        quantized[i] = F4E2M1::from_f64(f32_data[i] as f64 / scale_val);
    }
}

// Create compressed block
let block = MXFP4Block::from_f32_slice(quantized, scale);
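
How E8M0::from_f32_slice chooses its scale is not described on this page. One common rule, the one suggested by the OCP MX specification, takes the exponent of the largest magnitude in the block and subtracts the largest exponent an F4E2M1 element can hold (2, because 6 = 1.5 × 2^2). The sketch below assumes that rule; `floor_log2`, `shared_exponent`, and `E2M1_EMAX` are hypothetical names, and the crate may behave differently.

```rust
/// Largest exponent of an F4E2M1 element (6.0 = 1.5 * 2^2).
/// Assumption: taken from the OCP MX element format, not from float4.
const E2M1_EMAX: i32 = 2;

/// floor(log2(|x|)) for a finite, non-zero f32.
fn floor_log2(x: f32) -> i32 {
    // Widening to f64 makes every finite non-zero f32 a normal number,
    // so the biased exponent field is exactly floor(log2(|x|)) + 1023.
    let bits = (x as f64).to_bits();
    ((bits >> 52) & 0x7FF) as i32 - 1023
}

/// Power-of-two exponent for a block scale, or None when the block has
/// no finite non-zero value to scale against.
fn shared_exponent(data: &[f32]) -> Option<i32> {
    let max_abs = data
        .iter()
        .copied()
        .filter(|v| v.is_finite())
        .map(f32::abs)
        .fold(0.0f32, f32::max);
    if max_abs == 0.0 {
        return None;
    }
    // Clamp to the exponent range E8M0 can encode.
    Some((floor_log2(max_abs) - E2M1_EMAX).clamp(-127, 127))
}
```

For the four values shown above (largest magnitude 3.0) this rule gives an exponent of -1, a scale of 0.5, and scaled values 3, 4, -1, and 6. Under this rule a scaled magnitude can still land between 6 and 8, so the quantization step has to saturate.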

§Memory Layout

The 16-byte data array packs values as follows:

Byte 0: [Value1 bits 0-3][Value0 bits 0-3]
Byte 1: [Value3 bits 0-3][Value2 bits 0-3]
...
Byte 15: [Value31 bits 0-3][Value30 bits 0-3]

Each byte contains two 4-bit values, with even-indexed values in the lower nibble and odd-indexed values in the upper nibble.
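
The layout above can be sketched with plain bit operations. This is a standalone illustration working on raw 4-bit codes; `pack_nibbles` and `unpack_nibbles` are hypothetical helpers, not functions exported by float4.

```rust
/// Packs 32 four-bit codes into 16 bytes.
/// Even indices land in the low nibble, odd indices in the high nibble.
fn pack_nibbles(codes: &[u8; 32]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (i, byte) in out.iter_mut().enumerate() {
        let lo = codes[2 * i] & 0x0F; // value 2i
        let hi = codes[2 * i + 1] & 0x0F; // value 2i + 1
        *byte = (hi << 4) | lo;
    }
    out
}

/// Reverses `pack_nibbles`, returning the 32 four-bit codes in order.
fn unpack_nibbles(bytes: &[u8; 16]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, byte) in bytes.iter().enumerate() {
        out[2 * i] = byte & 0x0F;
        out[2 * i + 1] = byte >> 4;
    }
    out
}
```

With codes 0 and 1 in the first two slots, byte 0 becomes 0x10: value 1 sits in the upper nibble and value 0 in the lower one.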

Implementations§

impl MXFP4Block

pub fn from_f32_slice(xs: [F4E2M1; 32], scale: E8M0) -> Self

Creates a new MXFP4Block from pre-quantized F4E2M1 values and a scale factor.

This function packs 32 F4E2M1 values into a compressed block format. The values should already be quantized to F4E2M1 format and the scale should be chosen appropriately for the data range.
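
To make "already quantized" concrete, here is a sketch of rounding a scaled value to the nearest F4E2M1 code. It assumes the OCP MX element encoding (sign in bit 3, magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6 for codes 0 to 7), round-to-nearest with ties going to the code whose mantissa bit is 0, and saturation at ±6. `quantize_e2m1` is a hypothetical helper; whether F4E2M1::from_f64 rounds the same way is not stated on this page.

```rust
/// Magnitudes of the eight E2M1 codes with the sign bit cleared.
const E2M1_MAGNITUDES: [f64; 8] = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0];

/// Returns the 4-bit code nearest to `x`. Callers are expected to pass
/// a non-NaN value; magnitudes above 6.0 saturate to 6.0.
fn quantize_e2m1(x: f64) -> u8 {
    let sign = if x.is_sign_negative() { 0x08 } else { 0x00 };
    let mag = x.abs().min(6.0);
    let mut best = 0usize;
    for i in 1..8 {
        let d_best = (mag - E2M1_MAGNITUDES[best]).abs();
        let d_i = (mag - E2M1_MAGNITUDES[i]).abs();
        // On an exact tie, prefer the even code (mantissa bit 0).
        if d_i < d_best || (d_i == d_best && i % 2 == 0) {
            best = i;
        }
    }
    sign | best as u8
}
```

For example, 0.75 is halfway between 0.5 and 1.0 and resolves to code 2 (1.0), while 100.0 saturates to code 7 (6.0).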

§Arguments

  • xs - Array of exactly 32 F4E2M1 values to pack
  • scale - E8M0 scale factor that will be applied when unpacking

§Packing Details

Values are packed two per byte in little-endian nibble order:

  • Even indices (0, 2, 4, …) go in the lower 4 bits
  • Odd indices (1, 3, 5, …) go in the upper 4 bits

§Examples
use float4::{F4E2M1, E8M0, MXFP4Block};

// Create values spanning [-4.0, 3.75], inside F4E2M1's [-6, 6] range
let mut values = [F4E2M1::from_f64(0.0); 32];
for i in 0..32 {
    values[i] = F4E2M1::from_f64(((i as f64) - 16.0) / 4.0);
}

// Pack with scale factor of 8
let scale = E8M0::from(8.0);
let block = MXFP4Block::from_f32_slice(values, scale);

pub fn to_f4_array(&self) -> [F4E2M1; 32]

Unpacks the compressed block into individual F4E2M1 values.

This extracts the 32 packed F4E2M1 values from the compressed format, returning them as an array. The scale factor is not applied - the returned values are exactly as stored in the block.

§Returns

An array of 32 F4E2M1 values in the same order they were packed.

§Examples
use float4::{F4E2M1, E8M0, MXFP4Block};

let values = [F4E2M1::from_f64(1.5); 32];
let block = MXFP4Block::from_f32_slice(values, E8M0::from(1.0));

let unpacked = block.to_f4_array();
assert_eq!(unpacked[0].to_f64(), 1.5);

pub fn scale(&self) -> E8M0

Returns the E8M0 scale factor associated with this block.

The scale factor is a power of two that should be multiplied with the F4E2M1 values to recover the original data range.
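
As a rough illustration of what "a power of two" means at the byte level, the sketch below decodes a raw E8M0 byte under the OCP MX convention (exponent bias 127, 0xFF reserved for NaN). `e8m0_to_f64` is a hypothetical helper; how float4 stores or exposes the raw byte is an assumption here.

```rust
/// Decodes a raw E8M0 byte to its power-of-two value.
/// Returns None for 0xFF, which the OCP MX encoding reserves for NaN.
fn e8m0_to_f64(bits: u8) -> Option<f64> {
    if bits == 0xFF {
        return None;
    }
    // value = 2^(bits - 127). Build the f64 directly from its exponent
    // field (bias 1023) so the result is exact: 1023 - 127 = 896.
    let biased = bits as u64 + 896;
    Some(f64::from_bits(biased << 52))
}
```

Byte 127 decodes to 1.0 and byte 131 to 16.0, the scale used in the example below.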

§Examples
use float4::{F4E2M1, E8M0, MXFP4Block};

let block = MXFP4Block::from_f32_slice(
    [F4E2M1::from_f64(1.0); 32],
    E8M0::from(16.0)
);
assert_eq!(block.scale().to_f64(), 16.0);

pub fn to_f32_array(&self) -> [f32; 32]

Converts the block to an array of f32 values by applying the scale factor.

This method unpacks all F4E2M1 values and multiplies each by the block’s scale factor, producing the final decompressed values. This is the typical way to retrieve usable floating-point data from an MXFP4Block.

§Returns

An array of 32 f32 values computed as: F4E2M1_value * scale_factor
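
The formula above can be sketched end to end on raw bytes. This assumes the OCP MX E2M1 bit layout (sign in bit 3, then two exponent bits and one mantissa bit) and the nibble order from the Memory Layout section; `decode_e2m1` and `dequantize` are hypothetical helpers, not the crate's internals.

```rust
/// Magnitudes of the eight E2M1 codes with the sign bit cleared.
const E2M1_MAGNITUDES: [f32; 8] = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0];

/// Decodes one 4-bit code (low nibble of `code`) to its unscaled value.
fn decode_e2m1(code: u8) -> f32 {
    let mag = E2M1_MAGNITUDES[(code & 0x07) as usize];
    if code & 0x08 != 0 { -mag } else { mag }
}

/// Unpacks 16 data bytes and applies the block scale to every value.
fn dequantize(data: &[u8; 16], scale: f32) -> [f32; 32] {
    let mut out = [0.0f32; 32];
    for (i, byte) in data.iter().enumerate() {
        out[2 * i] = decode_e2m1(byte & 0x0F) * scale; // even index: low nibble
        out[2 * i + 1] = decode_e2m1(byte >> 4) * scale; // odd index: high nibble
    }
    out
}
```

With data bytes 0x21 and 0x03 followed by zeros and a scale of 4.0, the first three outputs are 2.0, 4.0, and 6.0, matching the example below.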

§Examples
use float4::{F4E2M1, E8M0, MXFP4Block};

// Create a block whose first three values are 0.5, 1.0, 1.5 (the rest stay 0.0), with scale 4.0
let mut values = [F4E2M1::from_f64(0.0); 32];
values[0] = F4E2M1::from_f64(0.5);
values[1] = F4E2M1::from_f64(1.0);
values[2] = F4E2M1::from_f64(1.5);

let block = MXFP4Block::from_f32_slice(values, E8M0::from(4.0));
let f32_array = block.to_f32_array();

assert_eq!(f32_array[0], 2.0);  // 0.5 * 4.0
assert_eq!(f32_array[1], 4.0);  // 1.0 * 4.0
assert_eq!(f32_array[2], 6.0);  // 1.5 * 4.0
§Precision Considerations

The conversion involves:

  1. F4E2M1 → f64 (exact)
  2. Multiplication by scale (exact for power-of-two scales)
  3. f64 → f32 (may round)

If your scale factors and values need more range or precision than f32 offers, unpack with to_f4_array and multiply by scale().to_f64() yourself, keeping the results in f64.
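
To see where the final narrowing step can actually lose information, the sketch below walks every F4E2M1 magnitude against every power-of-two scale from 2^-127 to 2^127. It assumes the conversion follows the three steps listed above; `pow2` and `narrowing_report` are hypothetical helpers.

```rust
/// Magnitudes of the eight E2M1 codes with the sign bit cleared.
const E2M1_MAGNITUDES: [f64; 8] = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0];

/// 2^exp as an exact f64, for exp in -127..=127.
fn pow2(exp: i32) -> f64 {
    f64::from_bits(((exp + 1023) as u64) << 52)
}

/// Returns (inexact, overflowed): how many magnitude/scale pairs change
/// value when narrowed to f32, and how many overflow to infinity.
fn narrowing_report() -> (usize, usize) {
    let (mut inexact, mut overflowed) = (0, 0);
    for exp in -127..=127 {
        for &m in &E2M1_MAGNITUDES {
            let wide = m * pow2(exp); // exact in f64
            let narrow = wide as f32;
            if narrow.is_infinite() {
                overflowed += 1;
            } else if narrow as f64 != wide {
                inexact += 1;
            }
        }
    }
    (inexact, overflowed)
}
```

Under these assumptions the report is (0, 6): every finite product fits in f32 exactly, and the only lossy cases are the six largest products (magnitudes 2, 3, 4, 6 at scale 2^127 and magnitudes 4, 6 at scale 2^126), which overflow to infinity.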

Trait Implementations§

impl Clone for MXFP4Block

fn clone(&self) -> MXFP4Block

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for MXFP4Block

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Hash for MXFP4Block

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.

impl PartialEq for MXFP4Block

fn eq(&self, other: &MXFP4Block) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Copy for MXFP4Block

impl Eq for MXFP4Block

impl StructuralPartialEq for MXFP4Block

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.