Crate float8

Eight-bit floating-point types in Rust.

This crate provides two types:

  • F8E4M3: sign bit + 4-bit exponent + 3-bit mantissa. Higher precision, but a smaller dynamic range.
  • F8E5M2: sign bit + 5-bit exponent + 2-bit mantissa. Lower precision, but a larger dynamic range (same exponent width as f16).
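The precision and range trade-off between the two layouts can be seen by decoding a few bit patterns by hand. The following is a minimal standalone sketch and not part of this crate's API: `decode_fp8` is a hypothetical helper, and it assumes that F8E5M2 follows IEEE 754 conventions (infinities and NaNs at the all-ones exponent) while F8E4M3 follows the common finite-only FP8 convention (no infinities, NaN only when exponent and mantissa are all ones). Check the type documentation before relying on either assumption.

```rust
/// Hypothetical helper (not part of the float8 crate): decode an 8-bit
/// pattern with `exp_bits` exponent bits and `man_bits` mantissa bits.
///
/// `ieee_specials = true`  : all-ones exponent encodes infinity/NaN (assumed for E5M2).
/// `ieee_specials = false` : only all-ones exponent plus all-ones mantissa is NaN,
///                           and there is no infinity (assumed for E4M3).
fn decode_fp8(bits: u8, exp_bits: u32, man_bits: u32, ieee_specials: bool) -> f32 {
    let b = u32::from(bits);
    let sign = if (b >> 7) == 1 { -1.0f32 } else { 1.0f32 };
    let exp_mask = (1u32 << exp_bits) - 1;
    let man_mask = (1u32 << man_bits) - 1;
    let exp = (b >> man_bits) & exp_mask;
    let man = b & man_mask;
    let bias = (1i32 << (exp_bits - 1)) - 1;

    if exp == exp_mask {
        if ieee_specials {
            return if man == 0 { sign * f32::INFINITY } else { f32::NAN };
        } else if man == man_mask {
            return f32::NAN;
        }
        // Otherwise (E4M3 convention) the pattern is an ordinary finite value.
    }

    let frac = man as f32 / (1u32 << man_bits) as f32;
    if exp == 0 {
        // Subnormal: no implicit leading one.
        sign * frac * 2f32.powi(1 - bias)
    } else {
        // Normal: implicit leading one.
        sign * (1.0 + frac) * 2f32.powi(exp as i32 - bias)
    }
}

fn main() {
    // Largest finite values: E5M2 reaches much further than E4M3.
    println!("E4M3 max  = {}", decode_fp8(0x7E, 4, 3, false)); // 448
    println!("E5M2 max  = {}", decode_fp8(0x7B, 5, 2, true)); // 57344

    // First value above 1.0: E4M3 has the finer step.
    println!("E4M3 next = {}", decode_fp8(0x39, 4, 3, false)); // 1.125
    println!("E5M2 next = {}", decode_fp8(0x3D, 5, 2, true)); // 1.25
}
```

Under these assumptions, the extra mantissa bit halves the spacing between neighbouring F8E4M3 values, while the extra exponent bit lets F8E5M2 represent values up to 57344 instead of 448.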

This crate is modelled after the half crate, so the two can be used side by side, and code written against half needs only minimal changes to work with these types.

§Serialization

When the serde feature is enabled, F8E4M3 and F8E5M2 will be serialized as a newtype of u8 by default. In binary formats this is ideal, as it will generally use just one byte for storage. For string formats like JSON, however, the raw bit pattern is not human-readable, and due to design limitations of serde, it's not possible for the default Serialize implementation to serialize differently for different formats.

It is up to the container type of the floats to control how they are serialized. With the derive macros, this is done through the #[serde(serialize_with="")] field attribute. Both F8E4M3 and F8E5M2 provide serialize_as_f32 and serialize_as_string functions for use with this attribute.

Deserialization of both float types supports deserializing from the default serialization, strings, and f32/f64 values, so no additional work is required.

§Cargo Features

This crate supports a number of optional cargo features. None of these features is enabled by default, not even std.

§Structs

F8E4M3
Eight-bit floating-point type with 4-bit exponent and 3-bit mantissa.
F8E5M2
Eight-bit floating-point type with 5-bit exponent and 2-bit mantissa.