Function preprocess 

pub fn preprocess(
    input: &[u8],
    limits: &Limits,
) -> HedlResult<PreprocessedInput>

Preprocess raw input bytes into lines.

This handles:

  • UTF-8 validation
  • BOM skipping
  • CRLF normalization
  • Bare CR rejection
  • Control character validation (SIMD-optimized)
  • Size and line length limits
  • Line boundary detection (SIMD-accelerated with memchr)
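
The steps listed above can be illustrated with a minimal, standard-library-only sketch. It is not the crate's implementation: `SketchError`, `SketchLimits`, and `preprocess_sketch` are hypothetical names, the order of checks is a guess, and treating tab as the only permitted control character is an assumption made for the example.

```rust
// Hypothetical sketch of the listed steps; not the crate's actual code or API.
#[derive(Debug, PartialEq)]
enum SketchError {
    TooLarge,
    InvalidUtf8,
    BareCr,
    ControlChar,
    LineTooLong,
}

// Illustrative limits; the real `Limits` type may have different fields.
struct SketchLimits {
    max_bytes: usize,
    max_line_len: usize,
}

fn preprocess_sketch(input: &[u8], limits: &SketchLimits) -> Result<Vec<String>, SketchError> {
    // Size limit: reject oversized input before doing any other work.
    if input.len() > limits.max_bytes {
        return Err(SketchError::TooLarge);
    }

    // BOM skipping: drop a leading UTF-8 byte order mark (EF BB BF).
    let bytes = if input.starts_with(&[0xEF, 0xBB, 0xBF]) {
        &input[3..]
    } else {
        input
    };

    // UTF-8 validation.
    let text = std::str::from_utf8(bytes).map_err(|_| SketchError::InvalidUtf8)?;

    // Bare CR rejection: every '\r' must be immediately followed by '\n'.
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'\r' && bytes.get(i + 1) != Some(&b'\n') {
            return Err(SketchError::BareCr);
        }
    }

    let mut lines = Vec::new();
    // `str::lines` splits on '\n' and strips the '\r' of a "\r\n" pair,
    // which gives CRLF normalization and line boundary detection.
    for line in text.lines() {
        // Control character validation (assumption: only tab is allowed).
        if line.chars().any(|c| c.is_control() && c != '\t') {
            return Err(SketchError::ControlChar);
        }
        // Line length limit, measured in bytes.
        if line.len() > limits.max_line_len {
            return Err(SketchError::LineTooLong);
        }
        lines.push(line.to_string());
    }
    Ok(lines)
}
```

Under these assumptions, input consisting of a BOM followed by `a\r\nb\n` yields the two lines `a` and `b`, while `a\rb` is rejected as containing a bare CR.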

Performance

Uses SIMD-accelerated newline scanning via memchr for 4-20x faster preprocessing on large files (> 1 MB).
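
As a rough illustration of newline scanning, the sketch below finds line boundaries by repeatedly searching for the next `\n` byte. To stay dependency-free it uses a hypothetical helper, `find_byte`, built on the standard library; the `memchr` crate's `memchr::memchr(needle, haystack)` returns the same `Option<usize>` and could be substituted to get the SIMD-accelerated search. `line_ranges` is likewise an illustrative name, not part of this crate.

```rust
use std::ops::Range;

// Stand-in for `memchr::memchr`: index of the first `needle` in `haystack`.
// This scalar version is for illustration only and is not SIMD-accelerated.
fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

// Hypothetical helper: byte ranges of each line, excluding the '\n' terminator.
fn line_ranges(input: &[u8]) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start = 0;
    // Jump from newline to newline instead of inspecting lines one by one.
    while let Some(offset) = find_byte(b'\n', &input[start..]) {
        let end = start + offset;
        ranges.push(start..end);
        start = end + 1; // continue just past the newline
    }
    // A final line without a trailing newline still counts as a line.
    if start < input.len() {
        ranges.push(start..input.len());
    }
    ranges
}
```

Returning ranges rather than copied strings keeps the scan allocation-light; whether `PreprocessedInput` stores lines this way is not stated in the documentation above.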