Module quantize

f64 → f32 quantization utilities for packed export.

When a trained SGBT (f64 precision) is converted to the packed format (f32), thresholds and leaf values are quantized. This module provides helpers for checking that the resulting precision loss is acceptable.
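To make the precision loss concrete, the following minimal sketch measures the absolute error from narrowing a single f64 to f32 and widening it back. The helper name `round_trip_error` is hypothetical and is not part of this module; it only illustrates the kind of quantity the validation is concerned with.

```rust
/// Absolute error introduced by narrowing `value` to f32 and widening it back.
/// Hypothetical helper for illustration only; not part of this module's API.
fn round_trip_error(value: f64) -> f64 {
    // `as f32` rounds to the nearest representable f32 value.
    let narrowed = value as f32;
    // Widening f32 -> f64 is exact, so the difference is the quantization error.
    (value - f64::from(narrowed)).abs()
}
```

Values such as 0.5 that are exactly representable in f32 give an error of zero, while a value such as 0.1 gives a small nonzero error.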

Constants§

DEFAULT_TOLERANCE: Maximum acceptable absolute difference between f64 and f32 representations.

Functions§

max_quantization_error: Compute the maximum absolute quantization error across a slice of f64 values.
quantize_leaf: Quantize a leaf value with learning rate baked in.
quantize_threshold: Quantize an f64 threshold to f32, returning the f32 value.
within_tolerance: Check whether quantizing value to f32 stays within tolerance.
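The one-line summaries above do not show signatures, so the following is only a rough sketch of what such helpers might look like. Every name carries a `sketch_` prefix to make clear that these are stand-ins, not this module's code. The parameter lists, the assumption that "baked in" means multiplying the leaf value by the learning rate, and the caller-supplied tolerance are all guesses.

```rust
/// Stand-in for a threshold quantizer: narrows an f64 threshold to f32.
fn sketch_quantize_threshold(threshold: f64) -> f32 {
    threshold as f32
}

/// Stand-in for a leaf quantizer. Assumption: "baked in" means the leaf
/// value is multiplied by the learning rate before narrowing.
fn sketch_quantize_leaf(leaf_value: f64, learning_rate: f64) -> f32 {
    // Multiply in f64 first so the product is rounded to f32 only once.
    (leaf_value * learning_rate) as f32
}

/// Stand-in for a tolerance check: true if the f64 -> f32 round-trip error
/// of `value` does not exceed `tolerance` (an absolute difference).
fn sketch_within_tolerance(value: f64, tolerance: f64) -> bool {
    (value - f64::from(value as f32)).abs() <= tolerance
}

/// Stand-in for the slice-wide check: largest absolute round-trip error
/// over `values`, or 0.0 for an empty slice.
fn sketch_max_quantization_error(values: &[f64]) -> f64 {
    values
        .iter()
        .map(|&v| (v - f64::from(v as f32)).abs())
        .fold(0.0_f64, f64::max)
}
```

Under these assumptions, an exporter could call `sketch_max_quantization_error` on all thresholds and leaf values and compare the result against a tolerance before writing the packed model.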
