Crate vax_floating

§vax-floating - VAX Floating-Point Types

This is a Rust implementation of the VAX floating-point types documented in the VAX Architecture Reference Manual.

  • Supports conversion from Rust data types.
  • Supports conversion from Rust data types to constants.
  • Supports conversion between VAX floating-point types (both constant and runtime).
  • Supports standard mathematical operators.
  • Supports constant mathematical operators.
  • Supports display, and lowercase and uppercase exponential output.

§Features

proptest - Include support for the proptest testing crate.

§Supported VAX Floating-Point Types

VAX Type     Size      Exponent size  Exponent range
F_floating   32-bits   8-bits         2^127 to 2^-127
D_floating   64-bits   8-bits         2^127 to 2^-127
G_floating   64-bits   11-bits        2^1,023 to 2^-1,023
H_floating   128-bits  15-bits        2^16,383 to 2^-16,383

§Examples

use vax_floating::{FFloating, DFloating, GFloating, HFloating};
use std::str::FromStr;

// Supports conversion from rust data types.
let ten = FFloating::from(10_u8);
let three_hundred = DFloating::from(300_u16);
let twelve_point_five = GFloating::from(12.5_f32);
let very_small = HFloating::from_str("1e-1000").unwrap();

assert_eq!(ten, FFloating::from(10_u64));
assert_eq!(three_hundred, DFloating::from(300_u32));
assert_eq!(twelve_point_five, GFloating::from_str("12.5").unwrap());
assert_eq!(very_small, HFloating::from_u8(1) / HFloating::from_str("1e1000").unwrap());

// Supports conversion from rust data types to constants.
const TEN: FFloating = FFloating::from_u8(10);
const ONE_FIFTY: DFloating = DFloating::from_u16(150);
const PI: GFloating = GFloating::from_ascii("3.1415926535897932384626433832");
const MANY_ZEROES: HFloating = HFloating::from_u128(
    100_000_000_000_000_000_000_000_000_000_000u128);

assert_eq!(ten, TEN);
assert_eq!(ONE_FIFTY, DFloating::from_i32(150));
assert_eq!(PI, GFloating::from_f64(std::f64::consts::PI));
assert_eq!(MANY_ZEROES, HFloating::from_str("1.0e32").unwrap());

// Supports conversion between VAX floating point types
let ten_h = HFloating::from(ten);
let three_hundred_g = GFloating::from(three_hundred);
let twelve_point_five_f = FFloating::from(twelve_point_five);
let pi_d = DFloating::from(PI);

assert_eq!(ten_h, HFloating::from(10_u64));
assert_eq!(three_hundred_g, GFloating::from(300_u32));
assert_eq!(twelve_point_five_f, FFloating::from_str("12.5").unwrap());
assert_eq!(pi_d, DFloating::from_f64(std::f64::consts::PI));

// Supports conversion between VAX floating point types to constants
const TEN_G: GFloating = TEN.to_g_floating();
const ONE_FIFTY_H: HFloating = ONE_FIFTY.to_h_floating();
const PI_F: FFloating = PI.to_f_floating();
const MANY_ZEROES_D: DFloating = MANY_ZEROES.to_d_floating();

assert_eq!(TEN_G, GFloating::from_u8(10));
assert_eq!(ONE_FIFTY_H, HFloating::from_i32(150));
assert_eq!(PI_F, FFloating::from_f32(std::f32::consts::PI));
assert_eq!(MANY_ZEROES_D, DFloating::from_str("1.0e32").unwrap());

// Supports standard mathematical operators.
let one = TEN / ten;
let four_fifty = three_hundred + ONE_FIFTY;
let two_pi = PI * GFloating::from(2_i8);
let many_zeroes = MANY_ZEROES - very_small;

assert_eq!(one, FFloating::from_i128(1));
assert_eq!(four_fifty, DFloating::from(450_u64));
assert_eq!(two_pi, GFloating::from_f64(std::f64::consts::PI * 2.0));
assert_eq!(many_zeroes, MANY_ZEROES);

// Supports constant mathematical operators.
const TENTH: FFloating = FFloating::from_u8(1).divide_by(FFloating::from_u8(10));
const NEG_ONE_FIFTY: DFloating = DFloating::from_bits(0).subtract_by(ONE_FIFTY);
const TWO_PI: GFloating = PI.multiply_by(GFloating::from_i64(2));
const TWO_HUNDRED_NONILLION: HFloating = MANY_ZEROES.add_to(MANY_ZEROES);

assert_eq!(TENTH, FFloating::from_str("0.1").unwrap());
assert_eq!(NEG_ONE_FIFTY, -ONE_FIFTY);
assert_eq!(TWO_PI, two_pi);
assert_eq!(TWO_HUNDRED_NONILLION, HFloating::from_str("200,000,000,000,000,000,000,000,000,000,000").unwrap());

// Supports display, and lowercase and uppercase exponential output.
assert_eq!(&format!("{:.4}", TENTH), "0.1000");
assert_eq!(&format!("{}", PI), "3.141592653589793");
assert_eq!(&format!("{:e}", very_small), "1e-1000");
assert_eq!(&format!("{:.1E}", MANY_ZEROES), "1.0E32");
assert_eq!(&format!("{:.3e}", four_fifty), "4.500e2");
assert_eq!(&format!("{:.3}", four_fifty), "450.000");

§Error-Encoded Reserved

All VAX floating-point types have a reserved value that has an exponent of zero and the sign-bit set, and would trigger a reserved operand fault.

Whenever any operation (that doesn’t return a Result) creates a value that cannot be represented as the VAX floating-point type, it will be set to a reserved value, and the error type will be encoded into the fractional portion of the type.

The two most-significant bits in the fraction indicate the error type: 00 is a divide-by-zero error, 01 is an underflow error, 10 is an overflow error, and 11 is any other error.

For overflow and underflow errors, the value of the exponent that caused the overflow or underflow is placed in the most significant 16-bits that don’t contain the sign, exponent, or error bits. For FFloating, DFloating, and GFloating types, the exponent bits are bits 16-31, and for the HFloating type, the exponent bits are bits 32-47. If the exponent is out of range, then the exponent bits are set to 0. Because the error uses i32 as its type, and the high 16-bits are assumed by the error type, the range stored in the encoded reserved value is 1 through 65535 for overflow, and -1 through -65535 for underflow.

Due to this encoding, any VAX floating-point type can be converted into a Result.

use vax_floating::{FFloating, Error, Result};

let overflow = FFloating::from_f32(f32::MAX);
assert!(overflow.is_reserved());
assert_eq!(<Result<FFloating>>::from(overflow), Err(Error::Overflow(Some(128))));
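
The bit-level encoding described in this section can be illustrated without the crate. The following is a minimal sketch, not the crate's implementation: `ReservedError` and `decode_reserved_f` are hypothetical names, and the bit positions assume the standard VAX F_floating register layout (sign in bit 15, exponent in bits 7-14, the most significant fraction bits in bits 0-6, and the remaining fraction in bits 16-31). Whether the crate's own bit accessors use this exact numbering is an assumption.

```rust
/// Hypothetical error type for this sketch; it only mirrors the four
/// categories described in the text above.
#[derive(Debug, PartialEq)]
enum ReservedError {
    DivByZero,
    Underflow(Option<i32>),
    Overflow(Option<i32>),
    Other,
}

/// Decode an error-encoded reserved F_floating value.
/// Assumes the VAX register layout: bit 15 = sign, bits 7-14 = exponent,
/// bits 5-6 = the two most-significant fraction bits (the error type),
/// bits 16-31 = the stored exponent magnitude.
/// Returns None if the value is not a reserved operand.
fn decode_reserved_f(bits: u32) -> Option<ReservedError> {
    let sign = (bits >> 15) & 1;
    let exponent = (bits >> 7) & 0xFF;
    if sign != 1 || exponent != 0 {
        return None; // Not a reserved value.
    }
    let error_bits = (bits >> 5) & 0b11;
    // A stored value of 0 means the exponent was out of range.
    let stored = (bits >> 16) as i32;
    let magnitude = if stored == 0 { None } else { Some(stored) };
    Some(match error_bits {
        0b00 => ReservedError::DivByZero,
        // The sign of the exponent is implied by the error type.
        0b01 => ReservedError::Underflow(magnitude.map(|m| -m)),
        0b10 => ReservedError::Overflow(magnitude),
        _ => ReservedError::Other,
    })
}
```

Under these assumptions, a bit pattern with the sign bit set, a zero exponent, error bits 10, and 128 in bits 16-31 (0x0080_8040) decodes to `Overflow(Some(128))`, which corresponds to the error shown in the example above.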

§Floating-point Type Differences between VAX and IEEE 754

  • The VAX uses a unique byte ordering for the floating-point values. Each set of 16-bit values is in little endian order, but the 16-bit byte-pairs are in big-endian order. The first (lowest address) 16-bits of the VAX floating-point types contain the sign bit, the exponent, and, usually, the most significant bits of the fraction. The last (highest addressed) 16-bits of the VAX floating-point types contain the least significant bits of the fraction.
  • The VAX doesn’t support negative zero. An exponent value of zero with a sign bit of 1 is a reserved value and would trigger a reserved operand fault.
  • The VAX doesn’t support subnormal numbers. All values with the sign bit clear and an exponent value of zero are considered to be zero.
  • The VAX doesn’t have an Infinity value, which gives it one more exponent value.
  • The VAX exponent bias is 2 more than the ones used in IEEE 754. Since VAX doesn’t support an infinity state, it has symmetrical exponent values. For example, the F_floating type has an exponent range from 127 to -127, whereas the single-precision floating-point type defined in IEEE 754 has an exponent range from 128 to -125. (See the Wikipedia Exponents note for how the exponents in this documentation differ from those used by Wikipedia.)
  • The VAX rounds differently than Rust. The VAX always rounds ties up, whereas, the f32 and f64 types round according to the roundTiesToEven direction defined in IEEE 754-2008.
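
The byte ordering and the zero and reserved cases from the list above can be sketched in plain Rust, independent of this crate. This is an illustrative sketch only: `decode_f_floating` is a hypothetical helper, and it assumes the standard F_floating format (8-bit exponent with a bias of 128, and a 23-bit fraction with an implicit leading bit so that the significand lies in the range 0.5 to 1.0).

```rust
/// Decode four bytes, given in VAX memory order (lowest address first),
/// as an F_floating value. Hypothetical helper for illustration.
/// Returns None for the reserved operand (sign set, exponent zero).
fn decode_f_floating(bytes: [u8; 4]) -> Option<f64> {
    // Each 16-bit word is little-endian, but the first word holds the
    // sign, the exponent, and the most significant fraction bits.
    let word0 = u16::from_le_bytes([bytes[0], bytes[1]]);
    let word1 = u16::from_le_bytes([bytes[2], bytes[3]]);

    let negative = word0 & 0x8000 != 0;
    let exponent = i32::from((word0 >> 7) & 0xFF);
    if exponent == 0 {
        // No negative zero and no subnormals: sign clear is zero,
        // sign set is the reserved operand.
        return if negative { None } else { Some(0.0) };
    }

    // 7 high fraction bits from the first word, 16 low bits from the second.
    let fraction = (u32::from(word0 & 0x7F) << 16) | u32::from(word1);
    // Restore the implicit bit as the most significant fractional bit,
    // giving a significand in the range [0.5, 1.0).
    let significand = f64::from((1u32 << 23) | fraction) / f64::from(1u32 << 24);
    let magnitude = significand * 2f64.powi(exponent - 128);
    Some(if negative { -magnitude } else { magnitude })
}
```

For instance, the bytes 0x80, 0x40, 0x00, 0x00 hold the word 0x4080 followed by 0x0000: the exponent field is 129 and the fraction is zero, so the value is 0.5 × 2^1 = 1.0.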

§Notes

§Wikipedia Exponents

There is a difference between the exponent values in the Wikipedia reference documentation for IEEE 754 and the exponent values in this documentation, in the VAX documentation, and in Rust (the MIN_EXP and MAX_EXP values of f32 and f64).

It comes down to how the implicit bit in the fraction portion of the floating-point value is treated. In Wikipedia, the implicit bit is the least-significant non-fractional bit, and here it is the most-significant fractional bit.

On Wikipedia, the range for values with exponent 0 is ≥ 1.0 and < 2.0. Here, the range for exponent 0 is ≥ 0.5 and < 1.0. Therefore, our exponent 0 is equal to Wikipedia’s exponent -1.
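
The two conventions can be compared directly on an f64. The sketch below is for illustration only: `wikipedia_form` and `fractional_form` are hypothetical helper names, and both functions are only meaningful for positive, normal, finite inputs.

```rust
/// Split a positive, normal f64 into (significand, exponent) using the
/// Wikipedia convention: significand in [1.0, 2.0).
fn wikipedia_form(x: f64) -> (f64, i32) {
    let bits = x.to_bits();
    let exp_field = ((bits >> 52) & 0x7FF) as i32;
    let fraction = bits & ((1u64 << 52) - 1);
    // The implicit bit is the least-significant non-fractional bit.
    let significand = 1.0 + fraction as f64 / (1u64 << 52) as f64;
    (significand, exp_field - 1023)
}

/// Split the same value using the convention of this documentation:
/// significand in [0.5, 1.0), so the exponent is one higher.
fn fractional_form(x: f64) -> (f64, i32) {
    let (significand, exponent) = wikipedia_form(x);
    // The implicit bit becomes the most-significant fractional bit.
    (significand / 2.0, exponent + 1)
}
```

With these helpers, 0.75 is (1.5, -1) in the Wikipedia convention and (0.75, 0) in the convention used here. Applying `fractional_form` to `f64::MAX` and `f64::MIN_POSITIVE` yields exponents equal to `f64::MAX_EXP` and `f64::MIN_EXP`, which is why Rust's constants line up with this documentation rather than with Wikipedia.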
