This library provides the floating-bar type, which allows for efficient representation of rational numbers without loss of precision. It is based on Inigo Quilez’s blog post exploring the concept.
use floating_bar::r32;

let fullscreen = r32!(4 / 3);
let widescreen = r32!(16 / 9);

assert_eq!(fullscreen, r32!(800 / 600));
assert_eq!(widescreen, r32!(1280 / 720));
assert_eq!(widescreen, r32!(1920 / 1080));
The floating-bar types follow a general structure:
- the denominator-size field: its width is always log2 of the type’s total size in bits (5 bits for r32, 6 bits for r64), and it is stored in the highest bits.
- the fraction field: stored in the remaining bits.
Here is a visual aid, where each character corresponds to one bit and the least significant bit is on the right:
d = denominator size field, f = fraction field

r32: dddddfffffffffffffffffffffffffff
r64: ddddddffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
The fraction field stores both the numerator and the denominator. The size of the denominator is determined by the denominator-size field, which gives the position of the partition (the “fraction bar”) from the right.
The numerator is stored as a two’s complement signed integer on the left side of the partition. The denominator is stored as an unsigned integer on the right side. The denominator has an implicit 1 bit in front of its stored value, similar to the implicit bit convention followed by floating-point. Thus, a size field of zero has an implicit denominator value of 1.
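The layout described above can be made concrete with a small decoder. This is only a sketch derived from the description in this section, not the library's own code: the function name `decode_r32_bits` and the constant `FRACTION_BITS` are hypothetical, and it assumes the 32-bit layout with a 5-bit size field and a 27-bit fraction field.

```rust
/// Width of the fraction field in the 32-bit layout (32 total bits minus 5 size bits).
const FRACTION_BITS: u32 = 27;

/// Decode raw 32-bit floating-bar bits into (numerator, denominator).
/// Returns `None` when the size field marks the value as NaN.
/// Hypothetical helper: it follows the prose description, not the crate's source.
fn decode_r32_bits(bits: u32) -> Option<(i32, u32)> {
    // The denominator-size field lives in the highest 5 bits.
    let denom_size = bits >> FRACTION_BITS;
    if denom_size >= FRACTION_BITS {
        // A denominator at least as wide as the fraction field is a NaN.
        return None;
    }

    // The stored denominator bits sit to the right of the "fraction bar".
    let fraction = bits & ((1u32 << FRACTION_BITS) - 1);
    let denom_mask = (1u32 << denom_size) - 1;
    // Restore the implicit leading 1 bit in front of the stored bits.
    let denominator = (1u32 << denom_size) | (fraction & denom_mask);

    // Drop the size field, then arithmetic-shift right so the numerator
    // is sign-extended as a two's complement integer.
    let numerator = ((bits << 5) as i32) >> (5 + denom_size);

    Some((numerator, denominator))
}
```

Under these assumptions, the bit pattern `(1 << 27) | 9` has a size field of 1, one stored denominator bit (so the denominator is binary 11, or 3), and a numerator of 4, so it decodes to 4/3. A size field of zero leaves only the implicit bit, which gives a denominator of 1.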
There are three distinct categories that a floating-bar number can fall into: normal, reducible, and not-a-number (NaN).
NaN values are those with a denominator size greater than or equal to the size of the entire fraction field. The library mostly ignores these values, and only uses one particular value to provide a NAN constant. They can be used to store payloads if desired. Effort is put into not clobbering possible payload values, but no guarantees are made.
Reducible values are those where the numerator and denominator share some common factor that has not been canceled out, and thus take up more space than their normalized form. Due to the performance cost of finding and canceling out common factors, reducible values are only normalized when absolutely necessary, such as when the result would otherwise overflow.
Normal values are those where the numerator and denominator don’t share any common factors, so the representation could not be any smaller while still accurately representing the value.
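To illustrate the difference between reducible and normal values, here is a minimal sketch of normalization using Euclid's algorithm. The helpers `gcd` and `reduce` are hypothetical and work on plain integers rather than on the library's types; how the library itself cancels common factors is not shown here.

```rust
/// Greatest common divisor by Euclid's algorithm.
fn gcd(mut a: u32, mut b: u32) -> u32 {
    while b != 0 {
        let remainder = a % b;
        a = b;
        b = remainder;
    }
    a
}

/// Cancel the common factor of a numerator/denominator pair.
/// Hypothetical helper; assumes `den >= 1`, as the implicit leading bit guarantees.
fn reduce(num: i32, den: u32) -> (i32, u32) {
    assert!(den != 0, "denominator must be at least 1");
    let g = gcd(num.unsigned_abs(), den);
    // `g` divides `num` exactly, so the quotient always fits back into an i32.
    let reduced_num = (i64::from(num) / i64::from(g)) as i32;
    (reduced_num, den / g)
}
```

With this sketch, the reducible pair 800/600 normalizes to 4/3, and both 1280/720 and 1920/1080 normalize to 16/9. The reduced pairs need fewer bits for the numerator and the denominator.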
Equality is performed by the following rules:
- If both numbers are NaN, they are equal.
- If only one of the numbers is NaN, they are not equal.
- Otherwise, both values are normalized and their raw representations are checked for equality.
Comparison is performed by the following rules:
- If both numbers are NaN, they compare equal.
- If only one number is NaN, they’re incomparable.
- Otherwise, the values are converted into order-preserving integers, which are then compared.
Note that floating-bar numbers only implement PartialOrd and not Ord due to the (currently) unspecified ordering of NaNs. This may change in the future.
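The equality and comparison rules above can be modelled on plain integers. The sketch below is an illustration under stated assumptions, not the library's implementation: `Ratio`, `ratio_partial_cmp` and `ratio_eq` are hypothetical names, `None` stands in for the NaN category, and cross-multiplication is swapped in for the library's normalize-and-compare and order-preserving-integer techniques. For non-NaN inputs it should agree with the rules listed, since two ratios cross-multiply to equal products exactly when their normalized forms match.

```rust
use std::cmp::Ordering;

/// `None` models a NaN; `Some((n, d))` models the ratio n/d with d >= 1.
/// Hypothetical stand-in for a floating-bar value.
type Ratio = Option<(i32, u32)>;

/// Partial comparison following the NaN rules described above.
fn ratio_partial_cmp(a: Ratio, b: Ratio) -> Option<Ordering> {
    match (a, b) {
        // Two NaNs compare equal.
        (None, None) => Some(Ordering::Equal),
        // A NaN and a number are incomparable.
        (None, _) | (_, None) => None,
        (Some((an, ad)), Some((bn, bd))) => {
            // Denominators are positive, so an/ad and bn/bd order the same
            // way as an*bd and bn*ad. The products fit in an i64.
            let lhs = i64::from(an) * i64::from(bd);
            let rhs = i64::from(bn) * i64::from(ad);
            Some(lhs.cmp(&rhs))
        }
    }
}

/// Equality following the same rules: NaN equals NaN, NaN never equals a number.
fn ratio_eq(a: Ratio, b: Ratio) -> bool {
    ratio_partial_cmp(a, b) == Some(Ordering::Equal)
}
```

Returning `Option<Ordering>` mirrors the shape of `PartialOrd::partial_cmp` in the standard library, which is why a type with incomparable values can implement PartialOrd but not Ord.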
The algorithm for converting a floating-point number to a floating-bar number is described by John D. Cook’s Best Rational Approximation post, with some minor tweaks to improve accuracy and performance. The algorithm splits the space provided for the fraction into two for the numerator and denominator, and then repeatedly calculates an upper and lower bound for the number until it finds the closest approximation that will fit in that space.
In practice, converting from floats has been shown to be accurate to about 7 decimal digits.
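The bound-narrowing search described above can be sketched with the plain mediant (Farey sequence) method. This is a simplified illustration, not the library's conversion routine: it omits the library's tweaks, only handles inputs between 0 and 1, and the name `farey_approx` and the parameter `max_den` (the largest denominator that would fit in the available space) are assumptions made for the example.

```rust
/// Approximate `x` in [0, 1] by a fraction whose denominator is at most `max_den`.
/// Hypothetical helper illustrating the mediant search only.
fn farey_approx(x: f64, max_den: u64) -> (u64, u64) {
    // Lower bound a/b and upper bound c/d, starting at 0/1 and 1/1.
    let (mut a, mut b) = (0u64, 1u64);
    let (mut c, mut d) = (1u64, 1u64);

    while b <= max_den && d <= max_den {
        // The mediant always lies strictly between the two bounds.
        let mediant = (a + c) as f64 / (b + d) as f64;
        if x == mediant {
            // Exact hit: use the mediant if it fits, otherwise fall back
            // to the bound with the larger denominator.
            return if b + d <= max_den {
                (a + c, b + d)
            } else if d > b {
                (c, d)
            } else {
                (a, b)
            };
        } else if x > mediant {
            // Raise the lower bound to the mediant.
            a += c;
            b += d;
        } else {
            // Lower the upper bound to the mediant.
            c += a;
            d += b;
        }
    }

    // One bound has outgrown the limit; return the one that still fits.
    if b > max_den { (c, d) } else { (a, b) }
}
```

For example, with the fractional part of pi and a denominator limit of 10, this search settles on 1/7, which corresponds to the familiar approximation 22/7.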
Convenience macro for
Convenience macro for
The 32-bit floating bar type.
The 64-bit floating bar type.
An error which can be returned when parsing a ratio.