Encoding detection and conversion for the HTML parser.
Detects the character encoding of raw HTML bytes and converts them to UTF-8. The detection pipeline follows the HTML specification's encoding sniffing algorithm:
- BOM (Byte Order Mark) detection
- <meta charset="..."> prescan (first 1 KB)
- <meta http-equiv="Content-Type" content="...charset=..."> prescan
- Fallback to UTF-8
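
The pipeline above can be sketched with the standard library alone. This is a simplified illustration, not this module's implementation. The names `sniff_encoding`, `prescan_meta`, and `find` are hypothetical, and the prescan is much looser than the one in the HTML specification: it does not skip comments, does not parse attributes properly, and does not map a declared UTF-16 label to UTF-8. It returns the declared label as a lowercase string and does not validate it.

```rust
/// Hypothetical helper: picks an encoding label using a simplified sniffing
/// pass (BOM first, then a <meta> prescan of the first 1024 bytes, then UTF-8).
fn sniff_encoding(bytes: &[u8]) -> String {
    // Step 1: a BOM, when present, decides the encoding outright.
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        return "utf-8".to_string();
    }
    if bytes.starts_with(&[0xFE, 0xFF]) {
        return "utf-16be".to_string();
    }
    if bytes.starts_with(&[0xFF, 0xFE]) {
        return "utf-16le".to_string();
    }

    // Step 2: look for `charset=` inside a <meta> tag in the first 1 KB.
    // Lowercasing the window makes the search ASCII case-insensitive.
    let window = &bytes[..bytes.len().min(1024)];
    let lower: Vec<u8> = window.iter().map(|b| b.to_ascii_lowercase()).collect();
    if let Some(label) = prescan_meta(&lower) {
        return label;
    }

    // Step 3: nothing declared, so fall back to UTF-8.
    "utf-8".to_string()
}

/// Hypothetical helper: index of the first `needle` in `haystack` at or after `from`.
/// `needle` must be non-empty.
fn find(haystack: &[u8], needle: &[u8], from: usize) -> Option<usize> {
    if from > haystack.len() {
        return None;
    }
    haystack[from..]
        .windows(needle.len())
        .position(|w| w == needle)
        .map(|p| p + from)
}

/// Hypothetical helper: scans lowercase bytes for a <meta> tag carrying `charset=`.
/// Covers both `<meta charset="...">` and the `content="...charset=..."` form.
fn prescan_meta(lower: &[u8]) -> Option<String> {
    let mut pos = 0;
    while let Some(start) = find(lower, b"<meta", pos) {
        // The tag runs to the next '>' or, if truncated, to the end of the window.
        let end = find(lower, b">", start).unwrap_or(lower.len());
        let tag = &lower[start..end];
        if let Some(cs) = find(tag, b"charset=", 0) {
            let value = &tag[cs + b"charset=".len()..];
            let label: String = value
                .iter()
                .skip_while(|&&b| b == b'"' || b == b'\'' || b == b' ')
                .take_while(|&&b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
                .map(|&b| b as char)
                .collect();
            if !label.is_empty() {
                return Some(label);
            }
        }
        pos = end.max(start + 1);
    }
    None
}
```

Because the BOM check comes first, a BOM takes precedence over any <meta> declaration, which matches the order of the list above. A caller would pass the raw bytes, for example `sniff_encoding(b"<meta charset=\"utf-8\">")`, and hand the resulting label to the decoder.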
Decoding itself is delegated to [encoding_rs], the encoding library from the
Mozilla/Servo ecosystem, which provides SIMD-optimized conversion.
Quick Start
use ;
let html = b"<html><head><meta charset=\"utf-8\"></head><body>Hello</body></html>";
let encoding = detect;
assert_eq!;
let = decode_or_detect.unwrap;
assert!;