Module determinism

Module determinism 

Source
Expand description

§Determinism Configuration and Enforcement

This module provides the core determinism guarantees that make DDEX Builder unique in the market. By ensuring consistent reproducible output, we enable supply chain integrity, reproducible builds, and cryptographic signing.

§Core Principle

Same Input = Identical Output, Always

DDEX Builder guarantees that identical logical input will always produce byte-identical XML output, regardless of:

  • Build environment (dev, CI, production)
  • Operating system (Windows, macOS, Linux)
  • Hardware architecture (x86, ARM, M1/M2)
  • Rust version or compiler flags
  • Time of day or system locale

§Why Determinism Matters

Deterministic Benefits
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ Supply Chain    │    │ Reproducible     │    │ Digital         │
│ Integrity       │    │ Builds           │    │ Signatures      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
  ┌─────────────┐       ┌─────────────────┐    ┌─────────────────┐
  │ • Audit     │       │ • CI/CD Cache   │    │ • Crypto Valid  │
  │ • Verify    │       │ • Artifact      │    │ • Non-repudiat  │
  │ • Trust     │       │   Dedup         │    │ • Compliance    │
  │ • Detect    │       │ • Build Reprod  │    │ • Legal Proof   │
  └─────────────┘       └─────────────────┘    └─────────────────┘

§Implementation Strategy

CRITICAL: This module ensures deterministic output by using IndexMap everywhere instead of HashMap/HashSet. The clippy configuration enforces this.

§Key Components

  1. DB-C14N/1.0 Canonicalization: Our custom canonicalization spec
  2. Deterministic Data Structures: IndexMap for stable iteration order
  3. Fixed Randomness Sources: Locked namespace prefixes and IDs
  4. Normalized Formatting: Consistent whitespace, encoding, line endings
  5. Time Zone Handling: UTC normalization for timestamps

§Configuration Example

use ddex_builder::determinism::*;
use indexmap::IndexMap;

let mut config = DeterminismConfig::default();

// Enable strict determinism verification
config.verify_determinism = Some(5); // Test with 5 iterations

// Lock namespace prefixes
config.locked_prefixes.insert(
    "http://ddex.net/xml/ern/43".to_string(),
    "ern".to_string()
);

// Use custom element ordering
let mut release_order = IndexMap::new();
release_order.insert("Release".to_string(), vec![
    "ReleaseReference".to_string(),
    "ReleaseId".to_string(),
    "ReferenceTitle".to_string(),
]);
config.custom_sort_order = Some(release_order);

// Apply configuration to builder
let mut builder = Builder::new();
builder.set_determinism_config(config);

§Verification Process

The determinism verification process works by:

  1. Build XML using the same input multiple times
  2. Compare Bytes - every byte must be identical
  3. Hash Verification - SHA-256 hashes must match
  4. Failure Detection - any variance triggers detailed diff analysis
// Automatic verification during build
let config = DeterminismConfig {
    verify_determinism: Some(3), // 3 verification rounds
    ..Default::default()
};

let result = builder.build_with_verification(&request, &config)?;
// If determinism fails, build returns detailed error with diff

§Performance Impact

Determinism adds minimal overhead:

  • +0.1-0.5ms for IndexMap vs HashMap
  • +1-3ms for verification when enabled
  • +5-10% memory for deterministic data structures
  • Zero impact on functionality or correctness

The performance cost is negligible compared to the benefits of supply chain integrity and reproducible builds.

Structs§

DeterminismConfig
Determinism configuration for XML generation
DeterminismDifference
Information about a determinism difference
DeterminismResult
Determinism verification result
DeterminismStats
Runtime statistics for determinism verification
DeterminismVerifier
Determinism verifier with comprehensive analysis
DifferenceContext
Context around a difference
HashDifference
Hash comparison details
LengthDifference
Length comparison details

Enums§

CanonMode
Canonicalization mode
DateTimeFormat
Date/time format
IndentChar
Indentation character
LineEnding
Line ending style
NamespaceStrategy
Namespace handling strategy
OutputMode
Output formatting mode
QuoteStyle
Quote style for attributes
SortStrategy
Element ordering strategy
TimeZonePolicy
Time zone policy
UnicodeNormalization
Unicode normalization form
XmlCharacterPolicy
XML character handling policy