1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
//! A crate that provides support for half-precision 16-bit floating point types.
//!
//! This crate provides the [`f16`] type, which is an implementation of the IEEE 754-2008 standard
//! [`binary16`] a.k.a "half" floating point type. This 16-bit floating point type is intended for
//! efficient storage where the full range and precision of a larger floating point value is not
//! required. This is especially useful for image storage formats.
//!
//! This crate also provides a [`bf16`] type, an alternative 16-bit floating point format. The
//! [`bfloat16`] format is a truncated IEEE 754 standard `binary32` float that preserves the
//! exponent to allow the same range as [`f32`] but with only 8 bits of precision (instead of 11
//! bits for [`f16`]). See the [`bf16`] type for details.
//!
//! Because [`f16`] and [`bf16`] are primarily for efficient storage, floating point operations such
//! as addition, multiplication, etc. are not always implemented by hardware. When hardware does not
//! support these operations, this crate emulates them by converting the value to
//! [`f32`] before performing the operation and then back afterward.
//!
//! Note that conversion from [`f32`]/[`f64`] to both [`f16`] and [`bf16`] are lossy operations, and
//! just as converting a [`f64`] to [`f32`] is lossy and does not have `Into`/`From` trait
//! implementations, so too do these smaller types not have those trait implementations either.
//! Instead, use `from_f32`/`from_f64` functions for the types in this crate. If you don't care
//! about lossy conversions and need trait conversions, use the appropriate [`num-traits`]
//! traits that are implemented.
//!
//! This crate also provides a [`slice`][mod@slice] module for zero-copy in-place conversions of
//! [`u16`] slices to both [`f16`] and [`bf16`], as well as efficient vectorized conversions of
//! larger buffers of floating point values to and from these half formats.
//!
//! The crate supports `#[no_std]` when the `std` cargo feature is not enabled, so can be used in
//! embedded environments without using the Rust [`std`] library. The `std` feature enables support
//! for the standard library and is enabled by default, see the [Cargo Features](#cargo-features)
//! section below.
//!
//! A [`prelude`] module is provided for easy importing of available utility traits.
//!
//! # Serialization
//!
//! When the `serde` feature is enabled, [`f16`] and [`bf16`] will be serialized as a newtype of
//! [`u16`] by default. In binary formats this is ideal, as it will generally use just two bytes for
//! storage. For string formats like JSON, however, this isn't as useful, and due to design
//! limitations of serde, it's not possible for the default `Serialize` implementation to support
//! different serialization for different formats.
//!
//! Instead, it's up to the containter type of the floats to control how it is serialized. This can
//! easily be controlled when using the derive macros using `#[serde(serialize_with="")]`
//! attributes. For both [`f16`] and [`bf16`] a `serialize_as_f32` and `serialize_as_string` are
//! provided for use with this attribute.
//!
//! Deserialization of both float types supports deserializing from the default serialization,
//! strings, and `f32`/`f64` values, so no additional work is required.
//!
//! # Hardware support
//!
//! Hardware support for these conversions and arithmetic will be used
//! whenever hardware support is available—either through instrinsics or targeted assembly—although
//! a nightly Rust toolchain may be required for some hardware. When hardware supports it the
//! functions and traits in the [`slice`][mod@slice] and [`vec`] modules will also use vectorized
//! SIMD intructions for increased efficiency.
//!
//! The following list details hardware support for floating point types in this crate. When using
//! `std` cargo feature, runtime CPU target detection will be used. To get the most performance
//! benefits, compile for specific CPU features which avoids the runtime overhead and works in a
//! `no_std` environment.
//!
//! | Architecture | CPU Target Feature | Notes |
//! | ------------ | ------------------ | ----- |
//! | `x86`/`x86_64` | `f16c` | This supports conversion to/from [`f16`] only (including vector SIMD) and does not support any [`bf16`] or arithmetic operations. |
//! | `aarch64` | `fp16` | This supports all operations on [`f16`] only. |
//!
//! # Cargo Features
//!
//! This crate supports a number of optional cargo features. None of these features are enabled by
//! default, even `std`.
//!
//! - **`alloc`** — Enable use of the [`alloc`] crate when not using the `std` library.
//!
//!   Among other functions, this enables the [`vec`] module, which contains zero-copy
//!   conversions for the [`Vec`] type. This allows fast conversion between raw `Vec<u16>` bits and
//!   `Vec<f16>` or `Vec<bf16>` arrays, and vice versa.
//!
//! - **`std`** — Enable features that depend on the Rust [`std`] library. This also enables the
//!   `alloc` feature automatically.
//!
//!   Enabling the `std` feature enables runtime CPU feature detection of hardware support.
//!   Without this feature detection, harware is only used when compiler target supports them.
//!
//! - **`serde`** — Adds support for the [`serde`] crate by implementing [`Serialize`] and
//!   [`Deserialize`] traits for both [`f16`] and [`bf16`].
//!
//! - **`num-traits`** — Adds support for the [`num-traits`] crate by implementing [`ToPrimitive`],
//!   [`FromPrimitive`], [`AsPrimitive`], [`Num`], [`Float`], [`FloatCore`], and [`Bounded`] traits
//!   for both [`f16`] and [`bf16`].
//!
//! - **`bytemuck`** — Adds support for the [`bytemuck`] crate by implementing [`Zeroable`] and
//!   [`Pod`] traits for both [`f16`] and [`bf16`].
//!
//! - **`zerocopy`** — Adds support for the [`zerocopy`] crate by implementing [`AsBytes`] and
//!   [`FromBytes`] traits for both [`f16`] and [`bf16`].
//!
//! - **`rand_distr`** — Adds support for the [`rand_distr`] crate by implementing [`Distribution`]
//!   and other traits for both [`f16`] and [`bf16`].
//!
//! - **`rkyv`** -- Enable zero-copy deserializtion with [`rkyv`] crate.
//!
//! [`alloc`]: https://doc.rust-lang.org/alloc/
//! [`std`]: https://doc.rust-lang.org/std/
//! [`binary16`]: https://en.wikipedia.org/wiki/Half-precision_floating-point_format
//! [`bfloat16`]: https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
//! [`serde`]: https://crates.io/crates/serde
//! [`bytemuck`]: https://crates.io/crates/bytemuck
//! [`num-traits`]: https://crates.io/crates/num-traits
//! [`zerocopy`]: https://crates.io/crates/zerocopy
//! [`rand_distr`]: https://crates.io/crates/rand_distr
//! [`rkyv`]: (https://crates.io/crates/rkyv)
#![cfg_attr(
    feature = "alloc",
    doc = "
[`vec`]: mod@vec"
)]
#![cfg_attr(
    not(feature = "alloc"),
    doc = "
[`vec`]: #
[`Vec`]: https://docs.rust-lang.org/stable/alloc/vec/struct.Vec.html"
)]
#![cfg_attr(
    feature = "serde",
    doc = "
[`Serialize`]: serde::Serialize
[`Deserialize`]: serde::Deserialize"
)]
#![cfg_attr(
    not(feature = "serde"),
    doc = "
[`Serialize`]: https://docs.rs/serde/*/serde/trait.Serialize.html
[`Deserialize`]: https://docs.rs/serde/*/serde/trait.Deserialize.html"
)]
#![cfg_attr(
    feature = "num-traits",
    doc = "
[`ToPrimitive`]: ::num_traits::ToPrimitive
[`FromPrimitive`]: ::num_traits::FromPrimitive
[`AsPrimitive`]: ::num_traits::AsPrimitive
[`Num`]: ::num_traits::Num
[`Float`]: ::num_traits::Float
[`FloatCore`]: ::num_traits::float::FloatCore
[`Bounded`]: ::num_traits::Bounded"
)]
#![cfg_attr(
    not(feature = "num-traits"),
    doc = "
[`ToPrimitive`]: https://docs.rs/num-traits/*/num_traits/cast/trait.ToPrimitive.html
[`FromPrimitive`]: https://docs.rs/num-traits/*/num_traits/cast/trait.FromPrimitive.html
[`AsPrimitive`]: https://docs.rs/num-traits/*/num_traits/cast/trait.AsPrimitive.html
[`Num`]: https://docs.rs/num-traits/*/num_traits/trait.Num.html
[`Float`]: https://docs.rs/num-traits/*/num_traits/float/trait.Float.html
[`FloatCore`]: https://docs.rs/num-traits/*/num_traits/float/trait.FloatCore.html
[`Bounded`]: https://docs.rs/num-traits/*/num_traits/bounds/trait.Bounded.html"
)]
#![cfg_attr(
    feature = "bytemuck",
    doc = "
[`Zeroable`]: bytemuck::Zeroable
[`Pod`]: bytemuck::Pod"
)]
#![cfg_attr(
    not(feature = "bytemuck"),
    doc = "
[`Zeroable`]: https://docs.rs/bytemuck/*/bytemuck/trait.Zeroable.html
[`Pod`]: https://docs.rs/bytemuck/*bytemuck/trait.Pod.html"
)]
#![cfg_attr(
    feature = "zerocopy",
    doc = "
[`AsBytes`]: zerocopy::AsBytes
[`FromBytes`]: zerocopy::FromBytes"
)]
#![cfg_attr(
    not(feature = "zerocopy"),
    doc = "
[`AsBytes`]: https://docs.rs/zerocopy/*/zerocopy/trait.AsBytes.html
[`FromBytes`]: https://docs.rs/zerocopy/*/zerocopy/trait.FromBytes.html"
)]
#![cfg_attr(
    feature = "rand_distr",
    doc = "
[`Distribution`]: rand::distributions::Distribution"
)]
#![cfg_attr(
    not(feature = "rand_distr"),
    doc = "
[`Distribution`]: https://docs.rs/rand/*/rand/distributions/trait.Distribution.html"
)]
#![warn(
    missing_docs,
    missing_copy_implementations,
    trivial_numeric_casts,
    future_incompatible
)]
#![cfg_attr(not(target_arch = "spirv"), warn(missing_debug_implementations))]
#![allow(clippy::verbose_bit_mask, clippy::cast_lossless)]
#![cfg_attr(not(feature = "std"), no_std)]
#![doc(html_root_url = "https://docs.rs/half/2.4.1")]
#![doc(test(attr(deny(warnings), allow(unused))))]
#![cfg_attr(docsrs, feature(doc_auto_cfg))]

#[cfg(feature = "alloc")]
extern crate alloc;

mod bfloat;
mod binary16;
mod leading_zeros;
#[cfg(feature = "num-traits")]
mod num_traits;

#[cfg(not(target_arch = "spirv"))]
pub mod slice;
#[cfg(feature = "alloc")]
pub mod vec;

pub use bfloat::bf16;
pub use binary16::f16;

#[cfg(feature = "rand_distr")]
mod rand_distr;

/// A collection of the most used items and traits in this crate for easy importing.
///
/// # Examples
///
/// ```rust
/// use half::prelude::*;
/// ```
pub mod prelude {
    #[doc(no_inline)]
    pub use crate::{bf16, f16};

    #[cfg(not(target_arch = "spirv"))]
    #[doc(no_inline)]
    pub use crate::slice::{HalfBitsSliceExt, HalfFloatSliceExt};

    #[cfg(feature = "alloc")]
    #[doc(no_inline)]
    pub use crate::vec::{HalfBitsVecExt, HalfFloatVecExt};
}

// Keep this module private to crate
mod private {
    use crate::{bf16, f16};

    pub trait SealedHalf {}

    impl SealedHalf for f16 {}
    impl SealedHalf for bf16 {}
}