Crate hedl_c14n


HEDL Canonicalization

Provides deterministic output generation for HEDL documents. Canonical output ensures stable hashing, diffing, and round-trips.

§Overview

This crate implements the canonical serialization format for HEDL documents, as specified in SPEC.md Section 13.2. Canonicalization ensures:

  • Deterministic output: The same document always produces the same output
  • Idempotency: canonicalize(canonicalize(x)) == canonicalize(x)
  • Round-trip preservation: Parsing canonical output yields a semantically equivalent document
  • Stable hashing: Enables content-addressable storage and diffing
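The properties above can be illustrated with a toy canonicalizer. This is only a sketch under stated assumptions: `toy_canonicalize` and `content_hash` are hypothetical helpers, and the `key: value` line format is a stand-in, not HEDL syntax or this crate's implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Toy canonical form (hypothetical, not HEDL): one `key: value` line per
/// entry, keys sorted, surrounding whitespace trimmed.
pub fn toy_canonicalize(input: &str) -> String {
    // A BTreeMap keeps keys in sorted order, so iteration order is deterministic.
    let mut entries: BTreeMap<&str, &str> = BTreeMap::new();
    for line in input.lines() {
        // Lines without a ':' separator are ignored in this sketch.
        if let Some((key, value)) = line.split_once(':') {
            entries.insert(key.trim(), value.trim());
        }
    }
    let mut out = String::new();
    for (key, value) in &entries {
        out.push_str(key);
        out.push_str(": ");
        out.push_str(value);
        out.push('\n');
    }
    out
}

/// Hash of the canonical text; equal canonical text gives equal hashes.
pub fn content_hash(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}
```

Two inputs that differ only in key order or whitespace canonicalize to the same text, so they hash identically, and canonicalizing the result a second time changes nothing.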

§Features

  • Minimal or always-quote string formatting strategies
  • Legacy ditto support for pre-v2.0 documents
  • Proper escaping of quotes and control characters
  • Alphabetically sorted keys, aliases, and struct declarations
  • Count hints in STRUCT directives for performance optimization
  • Security: Recursion depth limits prevent stack overflow DoS attacks
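To make the difference between the two quoting strategies concrete, here is a minimal sketch. The names `ToyQuoting`, `escape`, and `format_string` are hypothetical, and the set of characters that trigger quoting is an assumption for illustration; the actual rules are defined by the HEDL specification and are not reproduced here.

```rust
/// Hypothetical stand-in for a quoting strategy; not the crate's own enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ToyQuoting {
    /// Quote only when the value would otherwise be ambiguous.
    Minimal,
    /// Quote every string value.
    Always,
}

/// Escape quotes, backslashes, and a few common control characters.
/// Other control characters are passed through unchanged in this sketch.
pub fn escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            other => out.push(other),
        }
    }
    out
}

/// Assumed trigger set: empty values, leading or trailing spaces,
/// quotes, backslashes, commas, and control characters.
fn needs_quotes(value: &str) -> bool {
    value.is_empty()
        || value.starts_with(' ')
        || value.ends_with(' ')
        || value
            .chars()
            .any(|c| c == '"' || c == '\\' || c == ',' || c.is_control())
}

/// Format a string value according to the chosen strategy.
pub fn format_string(value: &str, strategy: ToyQuoting) -> String {
    match strategy {
        ToyQuoting::Minimal if !needs_quotes(value) => value.to_string(),
        _ => format!("\"{}\"", escape(value)),
    }
}
```

Under these assumptions, `format_string("abc", ToyQuoting::Minimal)` leaves the value bare, while `ToyQuoting::Always` wraps it in quotes; a value containing a comma or a quote is quoted and escaped under either strategy.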

§Examples

use hedl_c14n::{canonicalize, CanonicalConfig, CanonicalConfigBuilder, QuotingStrategy};
use hedl_core::Document;

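// Simple canonicalization with defaults.
// `doc` is assumed to be a `hedl_core::Document` obtained elsewhere,
// and this snippet is assumed to run inside a function that returns a `Result`.
let output = canonicalize(&doc)?;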

// Custom configuration using fluent API
let config = CanonicalConfig::new()
    .with_quoting(QuotingStrategy::Always)
    .with_ditto(false);
let output = hedl_c14n::canonicalize_with_config(&doc, &config)?;

// Custom configuration using builder pattern
let config = CanonicalConfig::builder()
    .quoting(QuotingStrategy::Always)
    .use_ditto(false)
    .sort_keys(true)
    .build();
let output = hedl_c14n::canonicalize_with_config(&doc, &config)?;

§Security

This crate implements the following safeguards, including protection against denial-of-service attacks:

  • Recursion depth limit: Maximum nesting depth of 1000 levels prevents stack overflow
  • Proper escaping: All special characters are escaped to prevent injection attacks
  • Type safety: Rust’s type system prevents memory safety issues
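A recursion depth limit of the kind described above can be sketched as follows. This is an illustrative outline, not the crate's code: `Node`, `WriteError`, `write_node`, and `nested` are hypothetical names, and only the limit of 1000 levels is taken from the text.

```rust
/// Maximum nesting depth, matching the limit stated in the documentation.
pub const MAX_DEPTH: usize = 1000;

/// Hypothetical tree type standing in for a nested document value.
#[derive(Debug)]
pub enum Node {
    Leaf(String),
    List(Vec<Node>),
}

#[derive(Debug, PartialEq, Eq)]
pub enum WriteError {
    DepthExceeded { limit: usize },
}

/// Write a node recursively, returning an error instead of recursing
/// past MAX_DEPTH. The root is written at depth 0.
pub fn write_node(node: &Node, depth: usize, out: &mut String) -> Result<(), WriteError> {
    if depth > MAX_DEPTH {
        return Err(WriteError::DepthExceeded { limit: MAX_DEPTH });
    }
    match node {
        Node::Leaf(text) => out.push_str(text),
        Node::List(children) => {
            out.push('[');
            for (i, child) in children.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                // Each level of nesting increases the tracked depth by one.
                write_node(child, depth + 1, out)?;
            }
            out.push(']');
        }
    }
    Ok(())
}

/// Build a leaf wrapped in `levels` nested lists (helper for experiments).
pub fn nested(levels: usize) -> Node {
    let mut node = Node::Leaf("x".to_string());
    for _ in 0..levels {
        node = Node::List(vec![node]);
    }
    node
}
```

Because the check happens before each descent, input nested more deeply than the limit yields an ordinary error value that the caller can handle, rather than exhausting the stack.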

§Performance

Several optimizations are implemented:

  • P0: Direct BTreeMap iteration eliminates key cloning (1.15x speedup, 10-15% fewer allocations)
  • P1: Pre-allocated output buffer (1.2-1.3x speedup)
  • P1: Cell buffer reuse across rows (1.05-1.1x speedup for large matrices)
  • Count hints: add_count_hints() recursively adds count hints to all matrix lists in a document
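The first three optimizations follow general Rust patterns, sketched below. The functions `write_sorted` and `write_rows` are hypothetical and use plain strings and integers rather than HEDL types; the speedup figures quoted above are the crate's own and are not reproduced by this sketch.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Iterate a BTreeMap by reference: keys come out already sorted,
/// so no keys are cloned or collected into a separate sorted list.
pub fn write_sorted(entries: &BTreeMap<String, String>) -> String {
    // Pre-allocate the output buffer: each entry needs key + ": " + value + '\n'.
    let estimated: usize = entries.iter().map(|(k, v)| k.len() + v.len() + 3).sum();
    let mut out = String::with_capacity(estimated);
    for (key, value) in entries {
        out.push_str(key);
        out.push_str(": ");
        out.push_str(value);
        out.push('\n');
    }
    out
}

/// Reuse one cell buffer across all rows instead of allocating a new
/// String for every cell.
pub fn write_rows(rows: &[Vec<i64>]) -> String {
    // Rough capacity guess; the String still grows if the guess is too small.
    let mut out = String::with_capacity(rows.len() * 16);
    let mut cell = String::new();
    for row in rows {
        for (i, value) in row.iter().enumerate() {
            if i > 0 {
                out.push_str(", ");
            }
            cell.clear(); // keeps the allocation, drops the contents
            write!(cell, "{}", value).expect("writing to a String cannot fail");
            out.push_str(&cell);
        }
        out.push('\n');
    }
    out
}
```

`String::clear` retains the buffer's capacity, which is what makes the reuse worthwhile for large matrices with many cells.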

Structs§

  • CanonicalConfig: Configuration for canonical output format.
  • CanonicalConfigBuilder: Builder for constructing a CanonicalConfig with a chainable API.
  • CanonicalWriter: Writer for canonical HEDL output.

Enums§

  • QuotingStrategy: Quoting strategy for string values.

Functions§

  • add_count_hints: Recursively add count hints to all matrix lists in the document.
  • can_use_ditto: Check whether a value can use the ditto marker from the previous row.
  • canonicalize: Canonicalize a HEDL document to a string.
  • canonicalize_with_config: Canonicalize a HEDL document with custom configuration.