linear-srgb 0.6.3

Fast linear↔sRGB color space conversion with FMA acceleration and LUT support

linear-srgb

Fast, SIMD-accelerated sRGB↔linear conversion for image processing pipelines.

Handles f32, f64, u8, and u16 data. Supports in-place RGBA with alpha preservation, fused premultiply/unpremultiply, custom gamma, and extended range. no_std compatible.

[dependencies]
linear-srgb = "0.6"

# Optional: BT.709, PQ (HDR10), and HLG transfer functions
linear-srgb = { version = "0.6", features = ["transfer"] }

Quick Start

Use the slice functions. They're SIMD-accelerated (AVX-512, AVX2, SSE4.1, NEON, WASM SIMD128) with automatic runtime CPU dispatch — typically 4–16x faster than scalar loops.

use linear_srgb::default::*;

// f32 slices (in-place, SIMD-accelerated)
let mut values = vec![0.5f32; 10000];
srgb_to_linear_slice(&mut values);
linear_to_srgb_slice(&mut values);

For RGBA data, use the _rgba_ variants — they convert only the RGB channels and leave alpha untouched. This matters: alpha is linear by definition, so applying the sRGB transfer function to it is a bug.

use linear_srgb::default::*;

// RGBA f32 — alpha channel preserved, only RGB converted
let mut rgba = vec![0.5f32, 0.5, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0];
srgb_to_linear_rgba_slice(&mut rgba);
assert_eq!(rgba[3], 0.75); // alpha untouched
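
To make the alpha rule concrete, here is a minimal scalar sketch of what an alpha-preserving RGBA conversion does. The `_ref` function names are hypothetical, and the IEC 61966-2-1 textbook constants are an assumption chosen for brevity; this is not the crate's SIMD implementation.

```rust
/// Scalar sRGB -> linear decode.
/// Assumption: IEC 61966-2-1 textbook constants, used here only for illustration.
fn srgb_to_linear_ref(v: f32) -> f32 {
    if v <= 0.04045 {
        v / 12.92
    } else {
        ((v + 0.055) / 1.055).powf(2.4)
    }
}

/// Hypothetical scalar stand-in for an RGBA slice conversion:
/// decode R, G, B of every 4-float pixel and leave A alone.
fn srgb_to_linear_rgba_ref(rgba: &mut [f32]) {
    for px in rgba.chunks_exact_mut(4) {
        for c in &mut px[..3] {
            *c = srgb_to_linear_ref(*c);
        }
        // px[3] (alpha) is already a linear quantity, so it is not touched.
    }
}
```

Running the sketch on the pixel data from the example above decodes each 0.5 colour channel to roughly 0.214 while the 0.75 alpha comes back bit-for-bit unchanged.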

Type conversions

Convert directly between integer sRGB and linear f32 without intermediate steps.

use linear_srgb::default::*;

// u8 sRGB → linear f32 (LUT-based, extremely fast)
let srgb_bytes: Vec<u8> = vec![128u8; 1024];
let mut linear = vec![0.0f32; 1024];
srgb_u8_to_linear_slice(&srgb_bytes, &mut linear);

// linear f32 → sRGB u8 (SIMD-accelerated)
let mut srgb_out = vec![0u8; 1024];
linear_to_srgb_u8_slice(&linear, &mut srgb_out);

// RGBA u8 → linear f32 (alpha passed through as a/255, not sRGB-decoded)
let rgba_bytes = vec![128u8, 128, 128, 200, 64, 64, 64, 128];
let mut rgba_linear = vec![0.0f32; 8];
srgb_u8_to_linear_rgba_slice(&rgba_bytes, &mut rgba_linear);

// u16 support too
let mut u16_linear = vec![0.0f32; 256];
let srgb_u16: Vec<u16> = (0..256).map(|i| (i * 256) as u16).collect();
srgb_u16_to_linear_slice(&srgb_u16, &mut u16_linear);
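
The reason the u8 path can be table-driven is that there are only 256 possible inputs. The following sketch builds such a table and quantizes back to u8; all names are hypothetical and the IEC textbook constants are an assumption, so treat it as an illustration of the idea rather than the crate's code.

```rust
/// Build a 256-entry decode table once; afterwards each u8 -> f32 decode is one index.
/// Assumption: IEC 61966-2-1 textbook constants.
fn build_u8_lut() -> [f32; 256] {
    let mut lut = [0.0f32; 256];
    for (i, slot) in lut.iter_mut().enumerate() {
        let v = i as f64 / 255.0;
        let lin = if v <= 0.04045 { v / 12.92 } else { ((v + 0.055) / 1.055).powf(2.4) };
        *slot = lin as f32;
    }
    lut
}

/// Decode a slice of sRGB bytes into linear f32 values via the table.
fn decode_u8_slice(lut: &[f32; 256], src: &[u8], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len());
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = lut[s as usize];
    }
}

/// Encode one linear value to an sRGB byte: clamp, apply the transfer function, round.
fn encode_u8(linear: f32) -> u8 {
    let l = linear.clamp(0.0, 1.0) as f64;
    let v = if l <= 0.0031308 { l * 12.92 } else { 1.055 * l.powf(1.0 / 2.4) - 0.055 };
    (v * 255.0 + 0.5) as u8
}
```

With these definitions, decoding a byte and encoding the result returns the original byte for all 256 inputs.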

Premultiplied alpha (fused, single-pass)

Convert between sRGB straight-alpha and linear premultiplied alpha in one SIMD pass — no intermediate buffer, no second memory traversal.

use linear_srgb::default::*;

// sRGB straight → linear premultiplied (f32 in-place)
let mut rgba = vec![0.8f32, 0.5, 0.2, 0.75, 1.0, 1.0, 1.0, 1.0];
srgb_to_linear_premultiply_rgba_slice(&mut rgba);

// linear premultiplied → sRGB straight (f32 in-place)
unpremultiply_linear_to_srgb_rgba_slice(&mut rgba);

// Also available as u8→f32 and f32→u8:
// srgb_u8_to_linear_premultiply_rgba_slice(&srgb_bytes, &mut linear_premul);
// unpremultiply_linear_to_srgb_u8_rgba_slice(&linear_premul, &mut srgb_out);
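
As a rough scalar picture of what "fused" means here, the sketch below decodes and premultiplies each pixel in the same loop iteration, and reverses both steps in one pass on the way back. The `_ref` names, the IEC textbook constants, and the choice to map fully transparent pixels to zero are all assumptions of this sketch, not documented behaviour of the crate.

```rust
// Assumption: IEC 61966-2-1 textbook constants in both directions.
fn srgb_to_linear_ref(v: f32) -> f32 {
    if v <= 0.04045 { v / 12.92 } else { ((v + 0.055) / 1.055).powf(2.4) }
}

fn linear_to_srgb_ref(l: f32) -> f32 {
    if l <= 0.0031308 { l * 12.92 } else { 1.055 * l.powf(1.0 / 2.4) - 0.055 }
}

/// sRGB straight alpha -> linear premultiplied, one pass over the buffer.
fn srgb_to_linear_premultiply_ref(rgba: &mut [f32]) {
    for px in rgba.chunks_exact_mut(4) {
        let a = px[3];
        for c in &mut px[..3] {
            *c = srgb_to_linear_ref(*c) * a; // decode and premultiply together
        }
    }
}

/// Linear premultiplied -> sRGB straight alpha, one pass over the buffer.
fn unpremultiply_linear_to_srgb_ref(rgba: &mut [f32]) {
    for px in rgba.chunks_exact_mut(4) {
        let a = px[3];
        for c in &mut px[..3] {
            // Assumption: alpha == 0 yields colour 0 instead of dividing by zero.
            let straight = if a > 0.0 { *c / a } else { 0.0 };
            *c = linear_to_srgb_ref(straight);
        }
    }
}
```

Each pixel is read and written exactly once per direction, which is the property the single-pass functions above are advertising.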

Single values

When you only need one value at a time (not a batch):

use linear_srgb::default::*;

// f32 — rational polynomial (≤14 ULP max, perfectly monotonic)
let linear = srgb_to_linear(0.5f32);
let srgb = linear_to_srgb(linear);

// u8 — LUT-based, zero math
let linear = srgb_u8_to_linear(128u8);
let srgb_byte = linear_to_srgb_u8(linear);

// u16 — LUT-based
let linear = srgb_u16_to_linear(32768u16);
let srgb_u16 = linear_to_srgb_u16(linear);

Precise (powf) conversions

For maximum accuracy or extended-range (HDR/ICC) workflows:

use linear_srgb::precise::*;

// f32 — exact powf, C0-continuous (6 ULP max)
let linear = srgb_to_linear(0.5f32);
let srgb = linear_to_srgb(0.214f32);

// f64 high-precision
let linear = srgb_to_linear_f64(0.5f64);

// Extended range — no clamping, for cross-gamut / scRGB pipelines
let linear = srgb_to_linear_extended(-0.1);
let srgb = linear_to_srgb_extended(1.5);
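
For readers unfamiliar with extended-range transfer functions, one common convention (used by scRGB-style pipelines) is to apply the curve to the magnitude and restore the sign, with no clamp at 1.0. The sketch below shows that convention only; whether the crate's `_extended` functions treat negative inputs this way is an assumption, and the `_ref` names and IEC constants are hypothetical choices for illustration.

```rust
/// Extended-range decode: odd-symmetric around zero, no upper clamp.
/// Assumptions: sign-mirroring convention and IEC 61966-2-1 textbook constants.
fn srgb_to_linear_extended_ref(v: f32) -> f32 {
    let a = v.abs();
    let lin = if a <= 0.04045 { a / 12.92 } else { ((a + 0.055) / 1.055).powf(2.4) };
    lin.copysign(v)
}

/// Extended-range encode: the inverse of the function above, same assumptions.
fn linear_to_srgb_extended_ref(l: f32) -> f32 {
    let a = l.abs();
    let enc = if a <= 0.0031308 { a * 12.92 } else { 1.055 * a.powf(1.0 / 2.4) - 0.055 };
    enc.copysign(l)
}
```

Under this convention a linear value of 1.5 encodes to a value above 1.0, and a negative input decodes to the negation of its positive counterpart.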

Custom gamma

For pure power-law gamma (no linear toe segment) — gamma 2.2, 1.8, etc.:

use linear_srgb::default::*;

let linear = gamma_to_linear(0.5f32, 2.2);
let encoded = linear_to_gamma(linear, 2.2);

// SIMD-accelerated slices
let mut values = vec![0.5f32; 1000];
gamma_to_linear_slice(&mut values, 2.2);

// Fused premultiply/unpremultiply also available:
// gamma_to_linear_premultiply_rgba_slice(&mut rgba, 2.2);
// unpremultiply_linear_to_gamma_rgba_slice(&mut rgba, 2.2);
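
To show what "no linear toe segment" changes, this sketch puts a pure power-law decode next to a piecewise sRGB decode. The `_ref` names are hypothetical, clamping negative inputs to zero is an assumption of the sketch, and the sRGB side uses the IEC textbook constants for illustration.

```rust
/// Pure power-law decode: encoded^gamma.
/// Assumption: negative inputs are clamped to 0 so powf never sees a negative base.
fn gamma_to_linear_ref(v: f32, gamma: f32) -> f32 {
    v.max(0.0).powf(gamma)
}

/// Pure power-law encode: linear^(1/gamma), same clamping assumption.
fn linear_to_gamma_ref(l: f32, gamma: f32) -> f32 {
    l.max(0.0).powf(1.0 / gamma)
}

/// Piecewise sRGB decode for comparison (IEC 61966-2-1 textbook constants, assumed).
fn srgb_to_linear_ref(v: f32) -> f32 {
    if v <= 0.04045 { v / 12.92 } else { ((v + 0.055) / 1.055).powf(2.4) }
}
```

At mid-grey the two curves are close (0.5 decodes to about 0.218 with gamma 2.2 and about 0.214 with sRGB), but near black they diverge sharply: at 0.01 the sRGB linear segment gives a value more than ten times larger than the pure power law.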

HDR transfer functions (transfer feature)

BT.709, PQ (ST 2084 / HDR10), and HLG (ARIB STD-B67) — scalar and SIMD.

linear-srgb = { version = "0.6", features = ["transfer"] }

use linear_srgb::default::*;

let linear = pq_to_linear(0.5);       // PQ (HDR10) → linear
let pq = linear_to_pq(linear);

let linear = hlg_to_linear(0.5);      // HLG → linear
let linear = bt709_to_linear(0.5);    // BT.709 → linear
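
For reference, the PQ curve itself is short enough to write out. The sketch below uses the constants published in SMPTE ST 2084 and works on normalized values, where linear 1.0 is assumed to mean the 10,000 cd/m² PQ peak. The `_ref` names are hypothetical, and the crate's own output scaling is not stated here, so it may differ.

```rust
// SMPTE ST 2084 (PQ) constants.
const M1: f64 = 2610.0 / 16384.0;
const M2: f64 = 2523.0 / 4096.0 * 128.0;
const C1: f64 = 3424.0 / 4096.0;
const C2: f64 = 2413.0 / 4096.0 * 32.0;
const C3: f64 = 2392.0 / 4096.0 * 32.0;

/// PQ signal in [0, 1] -> normalized linear light in [0, 1].
/// Assumption: 1.0 linear corresponds to 10,000 cd/m².
fn pq_to_linear_ref(e: f64) -> f64 {
    let p = e.clamp(0.0, 1.0).powf(1.0 / M2);
    let num = (p - C1).max(0.0);
    let den = C2 - C3 * p;
    (num / den).powf(1.0 / M1)
}

/// Normalized linear light in [0, 1] -> PQ signal in [0, 1].
fn linear_to_pq_ref(y: f64) -> f64 {
    let p = y.clamp(0.0, 1.0).powf(M1);
    ((C1 + C2 * p) / (1.0 + C3 * p)).powf(M2)
}
```

Under this normalization a PQ signal of 0.5 decodes to roughly 0.0092, which is about 92 cd/m², illustrating how much of the PQ code range is spent on dark and mid tones.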

LUT for custom bit depths

use linear_srgb::lut::{LinearTable16, EncodingTable16, lut_interp_linear_float};

// 16-bit linearization (65536 entries)
let lut = LinearTable16::new();
let linear = lut.lookup(32768);

// Interpolated encoding
let encode_lut = EncodingTable16::new();
let srgb = lut_interp_linear_float(0.5, encode_lut.as_slice());
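
The idea behind an interpolated lookup can be sketched in a few lines: find the two neighbouring table entries and blend between them. The function and table names below are hypothetical, the table is built with the IEC textbook encode formula as an assumption, and the sketch makes no claim about how `lut_interp_linear_float` is implemented.

```rust
/// sRGB encode used to fill the table (IEC 61966-2-1 textbook constants, assumed).
fn linear_to_srgb_ref(l: f64) -> f64 {
    if l <= 0.0031308 { l * 12.92 } else { 1.055 * l.powf(1.0 / 2.4) - 0.055 }
}

/// Table of `n` encoded samples at evenly spaced linear inputs in [0, 1]. Requires n >= 2.
fn build_encode_table(n: usize) -> Vec<f32> {
    (0..n)
        .map(|i| linear_to_srgb_ref(i as f64 / (n - 1) as f64) as f32)
        .collect()
}

/// Look up `x` in `table`, interpolating linearly between the two nearest entries.
fn interp_lookup(x: f32, table: &[f32]) -> f32 {
    let last = table.len() - 1;
    let pos = x.clamp(0.0, 1.0) * last as f32;
    // Clamp the index so that i + 1 is always in bounds, including at x == 1.0.
    let i = (pos as usize).min(last - 1);
    let frac = pos - i as f32;
    table[i] + (table[i + 1] - table[i]) * frac
}
```

Interpolation lets a modest table (4096 entries in the test for this sketch) stand in for the full curve at inputs that fall between table positions.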

API Summary

| Data | Function |
|------|----------|
| &mut [f32] | srgb_to_linear_slice / linear_to_srgb_slice |
| RGBA &mut [f32] | srgb_to_linear_rgba_slice / linear_to_srgb_rgba_slice |
| RGBA f32 premultiply | srgb_to_linear_premultiply_rgba_slice / unpremultiply_linear_to_srgb_rgba_slice |
| &[u8] / &mut [f32] | srgb_u8_to_linear_slice / linear_to_srgb_u8_slice |
| RGBA &[u8] / &mut [f32] | srgb_u8_to_linear_rgba_slice / linear_to_srgb_u8_rgba_slice |
| RGBA u8↔f32 premultiply | srgb_u8_to_linear_premultiply_rgba_slice / unpremultiply_linear_to_srgb_u8_rgba_slice |
| &[u16] / &mut [f32] | srgb_u16_to_linear_slice / linear_to_srgb_u16_slice |
| Custom gamma &mut [f32] | gamma_to_linear_slice / linear_to_gamma_slice |
| Custom gamma RGBA premul | gamma_to_linear_premultiply_rgba_slice / unpremultiply_linear_to_gamma_rgba_slice |
| Single f32 | srgb_to_linear / linear_to_srgb |
| Single u8 | srgb_u8_to_linear / linear_to_srgb_u8 |
| Single u16 | srgb_u16_to_linear / linear_to_srgb_u16 |
| Exact powf f32/f64 | precise::srgb_to_linear / precise::linear_to_srgb |
| Extended range f32 | precise::srgb_to_linear_extended / precise::linear_to_srgb_extended |

All functions live in linear_srgb::default unless noted.

Accuracy

Transfer function constants

All code paths use C0-continuous constants derived from the moxcms reference implementation. These adjust the IEC 61966-2-1 offset from 0.055 to 0.055011 and the threshold from 0.04045 to 0.03929, making the piecewise transfer function mathematically continuous (~2.3e-9 gap eliminated).
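
The size of the jump in the textbook IEC curve can be checked directly by evaluating both segments at the threshold. This is a small sketch with a hypothetical function name; it only measures the IEC gap and does not reproduce the adjusted constants.

```rust
/// Jump in the IEC 61966-2-1 piecewise decode at its threshold:
/// power-segment value minus linear-segment value at 0.04045.
fn iec_threshold_gap() -> f64 {
    let t = 0.04045_f64;
    let linear_side = t / 12.92;
    let power_side = ((t + 0.055) / 1.055).powf(2.4);
    power_side - linear_side
}
```

Evaluated in f64, the two segments disagree by about 2.3e-9 at the threshold, which is the gap quoted above.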

At u8 precision the two constant sets produce identical values. At u16, the max difference is ~1 LSB near the threshold. See docs/iec.md for a detailed comparison.

For interop with software that uses the original IEC textbook constants, enable the iec feature for linear_srgb::iec::srgb_to_linear / linear_srgb::iec::linear_to_srgb.

Accuracy summary (exhaustive f32 sweep)

| Path | Max ULP | Avg ULP | Monotonic |
|------|---------|---------|-----------|
| default s→l (rational poly) | 11 | ~0.5 | yes |
| default l→s (rational poly) | 14 | ~0.4 | yes |
| precise s→l (powf) | 6 | ~0.1 | yes |
| precise l→s (powf) | 3 | ~0.1 | yes |

What does 14 ULP mean in practice? 1 ULP (unit in the last place) is the spacing between adjacent f32 values at a given magnitude. At 0.5, 1 ULP is ~6e-8, so 14 ULP ≈ 8e-7, or about 6 decimal digits of precision. At 0.01, 1 ULP is ~9e-10, so 14 ULP ≈ 1.3e-8. For 8-bit output this error is invisible: 8e-7 is thousands of times smaller than one output level (1/255 ≈ 3.9e-3). For 16-bit output it is still more than ten times smaller than one level (1/65535 ≈ 1.5e-5).
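
The ULP figures can be reproduced from the f32 bit representation. The helper names below are hypothetical; the sketch assumes finite, non-negative inputs, for which adjacent floats have adjacent bit patterns.

```rust
/// Spacing between `x` and the next larger f32, i.e. 1 ULP at `x`.
/// Assumes `x` is finite and non-negative.
fn ulp_at(x: f32) -> f32 {
    f32::from_bits(x.to_bits() + 1) - x
}

/// Distance between two finite, non-negative f32 values, counted in ULPs.
fn ulp_distance(a: f32, b: f32) -> u32 {
    a.to_bits().abs_diff(b.to_bits())
}
```

`ulp_at(0.5)` is 2^-24 (about 5.96e-8) and `ulp_at(0.01)` is 2^-30 (about 9.3e-10). A value 14 ULP above 0.5 differs from it by about 8.3e-7.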

Reference: C0-continuous f64 powf. The scalar rational polynomial evaluates in f64 intermediate precision, guaranteeing perfect monotonicity (zero reversals across all ~1B f32 values in [0, 1]). SIMD paths use f32 evaluation for throughput and are also monotonic within each segment.
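
An exhaustive monotonicity check of this kind amounts to walking consecutive f32 bit patterns and counting every place where the output decreases. The sketch below is a generic version with a hypothetical name, not the crate's test harness; sweeping bits `0..=0x3F80_0000` covers every f32 in [0, 1].

```rust
/// Count how often `f` decreases between adjacent f32 inputs whose bit patterns
/// run from `lo_bits` to `hi_bits` inclusive (non-negative finite floats only).
fn count_reversals(f: impl Fn(f32) -> f32, lo_bits: u32, hi_bits: u32) -> u64 {
    let mut reversals = 0u64;
    let mut prev = f(f32::from_bits(lo_bits));
    for bits in (lo_bits + 1)..=hi_bits {
        let cur = f(f32::from_bits(bits));
        if cur < prev {
            reversals += 1; // output went down while input went up
        }
        prev = cur;
    }
    reversals
}
```

A result of zero over the full range is what "perfectly monotonic" means in the table above. The full sweep visits roughly 1.07 billion inputs, so in practice it is run as a release-mode test.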

Feature Flags

  • std (default): Required for runtime SIMD dispatch
  • avx512 (default): AVX-512 code paths (16-wide f32)
  • transfer: BT.709, PQ, HLG transfer functions (scalar + SIMD)
  • iec: IEC 61966-2-1 textbook sRGB functions for legacy interop
  • alt: Alternative/experimental implementations for benchmarking

# no_std (requires alloc for LUT generation)
linear-srgb = { version = "0.6", default-features = false }

Module Organization

  • default — Recommended API. Rational polynomial for f32, LUT for integers, SIMD for slices.
  • precise — Exact powf() conversions with C0-continuous constants (not IEC textbook). f32/f64, extended range.
  • lut — Lookup tables for custom bit depths (10-bit, 12-bit, 16-bit).
  • tf — Transfer functions: BT.709, PQ, HLG. Requires transfer feature.
  • iec — IEC 61966-2-1 textbook constants for legacy interop. Requires iec feature.
  • tokens — Inlineable #[rite] functions for embedding in SIMD pipelines (see below).

Embedding in SIMD Pipelines (tokens module)

If you're writing your own SIMD code with archmage, the tokens module provides #[rite] functions that inline directly into your #[arcane] functions — zero dispatch overhead.

use linear_srgb::tokens::x8;
use archmage::{arcane, X64V3Token}; // assumes X64V3Token is exported at the archmage crate root

#[arcane]
fn my_pipeline(token: X64V3Token, data: &mut [f32]) {
    // x8::srgb_to_linear_v3 is #[rite] — inlines into your function
    // Available widths: x4 (SSE/NEON/WASM), x8 (AVX2), x16 (AVX-512)
}

License

MIT OR Apache-2.0

AI-Generated Code Notice

Developed with Claude (Anthropic). All code has been reviewed and benchmarked, but verify critical paths for your use case.