This library provides the floating bar type, which allows for efficient representation of rational numbers without loss of precision. It is based on this blog post.
The floating bar types follow a general structure:
- the sign bit
- the denominator size field (its width in bits is always log_2 of the type's total size in bits)
- the fraction field (uses all the remaining bits)
More concretely, they are represented like this:
s = sign, d = denominator size, f = fraction field

    r32: sdddddffffffffffffffffffffffffff
    r64: sddddddfffffffffffffffffffffffffffffffffffffffffffffffffffffffff
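As a rough illustration of this layout, the sketch below splits a raw 32-bit pattern into its three fields. It does not use this crate's API: `split_fields` and the two constants are hypothetical names, and the code assumes the diagram reads from the most significant bit on the left to the least significant bit on the right.

```rust
/// Width of the denominator size field for a 32-bit value: log_2(32) = 5.
const SIZE_BITS: u32 = 5;
/// Width of the fraction field: 32 - 1 (sign) - 5 (size) = 26.
const FRACTION_BITS: u32 = 32 - 1 - SIZE_BITS;

/// Hypothetical helper (not part of this crate): splits raw bits into
/// (sign, denominator size, fraction field), assuming the sign is the
/// most significant bit.
fn split_fields(bits: u32) -> (bool, u32, u32) {
    // The top bit is the sign.
    let sign = bits >> 31 == 1;
    // The next five bits hold the denominator size.
    let denom_size = (bits >> FRACTION_BITS) & ((1 << SIZE_BITS) - 1);
    // The low 26 bits hold the numerator and denominator together.
    let fraction = bits & ((1 << FRACTION_BITS) - 1);
    (sign, denom_size, fraction)
}
```

The same approach would apply to a 64-bit value with a 6-bit size field and a 57-bit fraction field.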
The fraction field stores the numerator and the denominator as two separate values. Their exact sizes at any given moment depend on the size field, which gives the position, counted from the right, of the partition (the "bar") between the two values.
The denominator has an implicit 1 bit which goes in front of the actual value stored. Thus, a size field of zero has an implicit denominator value of 1, making it compatible with integers.
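The following sketch shows how a normal value's numerator and denominator might be recovered from the size field and the fraction field of a 32-bit value. The function `decode` is a hypothetical name, not this crate's API, and the code assumes that the denominator occupies the low bits of the fraction field (to the right of the bar) and the numerator the high bits.

```rust
/// Number of bits in the fraction field of a 32-bit value.
const FRACTION_BITS: u32 = 26;

/// Hypothetical helper (not part of this crate): decodes
/// (numerator, denominator) for a normal value, i.e. one whose
/// denominator size is smaller than the fraction field.
fn decode(denom_size: u32, fraction: u32) -> (u32, u32) {
    assert!(denom_size < FRACTION_BITS, "not a normal value");
    // Mask selecting the stored denominator bits to the right of the bar.
    let denom_mask = (1u32 << denom_size) - 1;
    // Everything to the left of the bar is the numerator.
    let numer = fraction >> denom_size;
    // Put the implicit 1 bit in front of the stored denominator bits.
    let denom = (1 << denom_size) | (fraction & denom_mask);
    (numer, denom)
}
```

Under these assumptions, 3/4 would be stored with a size field of 2 and a fraction field of `0b1100`: the numerator `11` sits to the left of the bar, and the denominator `100` keeps only its two trailing bits because the leading 1 is implicit. A size field of 0 leaves the whole fraction field to the numerator, so `decode(0, 5)` yields 5/1.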
There can also be subnormal values. When the denominator takes up the whole fraction field (i.e. when the value of the size field equals the number of bits the fraction field has), the numerator will take an implicit value of 1.
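A minimal sketch of the subnormal case for a 32-bit value follows. The name `decode_subnormal` is hypothetical and not this crate's API, and the code assumes that the denominator still carries its implicit leading 1 bit when it fills the whole fraction field.

```rust
/// Number of bits in the fraction field of a 32-bit value.
const FRACTION_BITS: u32 = 26;

/// Hypothetical helper (not part of this crate): decodes a subnormal
/// value, returning `None` when the size field does not mark one.
fn decode_subnormal(denom_size: u32, fraction: u32) -> Option<(u32, u32)> {
    if denom_size != FRACTION_BITS {
        return None;
    }
    // Keep only the 26 fraction bits; all of them belong to the denominator.
    let stored = fraction & ((1 << FRACTION_BITS) - 1);
    // Assumption: the implicit 1 bit still goes in front of the stored bits.
    let denom = (1 << FRACTION_BITS) | stored;
    // No bits are left for the numerator, so it takes the implicit value 1.
    Some((1, denom))
}
```

With this reading, subnormal values cover fractions of the form 1/d for denominators too large to leave any room for a numerator.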
Unfortunately, it's possible to have invalid values with this format. Invalid values are those which have a denominator size larger than the number of bits in the fraction field, and are represented as NaN. For example, the default NaN constant provided for r32 in this crate has a denominator size of 31, and the rest of the bits set to zero.
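The validity rule can be sketched as a check on the raw bits of a 32-bit value. The function `is_nan` below is a hypothetical stand-in written for illustration, not this crate's own method, and it assumes the size field sits directly below the sign bit.

```rust
/// Width of the denominator size field for a 32-bit value.
const SIZE_BITS: u32 = 5;
/// Width of the fraction field for a 32-bit value.
const FRACTION_BITS: u32 = 26;

/// Hypothetical helper (not part of this crate): a bit pattern is
/// invalid (NaN) when its denominator size exceeds the fraction width.
fn is_nan(bits: u32) -> bool {
    let denom_size = (bits >> FRACTION_BITS) & ((1 << SIZE_BITS) - 1);
    denom_size > FRACTION_BITS
}
```

A 5-bit size field can hold values up to 31, while the fraction field has only 26 bits, so sizes 27 through 31 are the invalid ones. The pattern described above (size 31, all other bits zero) would be `0x7C00_0000` under this layout.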
To avoid headaches similar to those caused by floating-point arithmetic, this library focuses on the numeric value of the format and greatly limits the propagation of NaNs. Any operation that could give an undefined value (e.g. when overflowing or dividing by zero) will panic instead of returning a NaN. Effort is put in to not clobber possible payload values in NaNs, but no guarantees about their preservation are made. NaNs should mostly only occur when parsing a string with a value of "NaN".
An error which can be returned when parsing a rational number.
The 32-bit floating bar type.
The 64-bit floating bar type.