DVB-SI text decoding — ETSI EN 300 468 Annex A.
Covers the full Annex A Table A.3 selector set: the default Latin table
(Figure A.1, an ISO 6937 superset — see iso_6937_single), ISO 8859-n
(single-byte 0x01–0x0B and extended 0x10 forms), UCS-2 BE (0x11),
KS X 1001 Korean (0x12, decoded as EUC-KR), GB-2312 Simplified Chinese
(0x13, decoded via GBK which is a GB-2312 superset), Big5 Traditional
Chinese (0x14), UTF-8 (0x15), and the 0x1F encoding_type_id escape
(no ids are registered for broadcast use — yields U+FFFD). Reserved
selectors (0x08, 0x0C–0x0F, 0x16–0x1E) yield U+FFFD per byte.
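The selector dispatch described above can be sketched as a single match on the leading bytes. This is only an illustration, not the crate's implementation: the `Charset` enum and `classify` function are hypothetical names, and the sketch assumes the single-byte selectors 0x01–0x0B map to ISO 8859-5 through 8859-15 (selector + 4) and that the 0x10 form is followed by 0x00 and a table number n.

```rust
/// Hypothetical charset tag for illustration; not a type exported by the crate.
#[derive(Debug, PartialEq, Eq)]
enum Charset {
    DefaultLatin,       // no selector byte: Figure A.1 table
    Iso8859(u8),        // ISO 8859-n
    Ucs2Be,             // 0x11
    KsX1001,            // 0x12
    Gb2312,             // 0x13
    Big5,               // 0x14
    Utf8,               // 0x15
    EncodingTypeId(u8), // 0x1F followed by an id byte
    Reserved,           // reserved or malformed selector
}

/// Returns the charset and how many leading selector bytes to skip.
fn classify(bytes: &[u8]) -> (Charset, usize) {
    match bytes {
        [] => (Charset::DefaultLatin, 0),
        // Any first byte >= 0x20 is text, so the default table applies.
        [b, ..] if *b >= 0x20 => (Charset::DefaultLatin, 0),
        [0x08, ..] => (Charset::Reserved, 1),
        // Assumed mapping: 0x01 -> ISO 8859-5 ... 0x0B -> ISO 8859-15.
        [b @ 0x01..=0x0B, ..] => (Charset::Iso8859(*b + 4), 1),
        // Extended form: 0x10 0x00 n selects ISO 8859-n (n = 0x0C is reserved).
        [0x10, 0x00, n @ 0x01..=0x0F, ..] if *n != 0x0C => (Charset::Iso8859(*n), 3),
        [0x11, ..] => (Charset::Ucs2Be, 1),
        [0x12, ..] => (Charset::KsX1001, 1),
        [0x13, ..] => (Charset::Gb2312, 1),
        [0x14, ..] => (Charset::Big5, 1),
        [0x15, ..] => (Charset::Utf8, 1),
        [0x1F, id, ..] => (Charset::EncodingTypeId(*id), 2),
        // Everything else (0x0C–0x0F, 0x16–0x1E, truncated forms) is reserved here.
        _ => (Charset::Reserved, 1),
    }
}
```

Returning the skip count alongside the charset lets a caller slice off the selector without re-inspecting it, which fits the borrow-then-decode approach used by this module.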
Glyph mappings are pinned to EN 300 468 V1.19.1 (2025-02) Figure A.1
“Character code table 00 - Latin alphabet with Unicode equivalents”
(PDF p. 159, vendored at specs/etsi_en_300_468_v01.19.01_dvb_si.pdf;
transcription in dvb-si/docs/en_300_468.md).
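One detail of the default Latin table worth illustrating is the ordering of diacritics: as in ISO 6937, the non-spacing mark precedes its base letter on the wire, whereas a Unicode combining mark follows the base. The sketch below handles only the acute accent (0xC2) and is not the crate's decoder; `reorder_acute` is a hypothetical helper, and a real implementation would cover the full Figure A.1 table and compose the result (e + U+0301 becomes é).

```rust
/// Hypothetical helper: moves an ISO 6937-style acute (0xC2) behind its base
/// letter as U+0301. All other non-ASCII bytes become U+FFFD in this sketch.
fn reorder_acute(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut pending_acute = false;
    for &b in bytes {
        match b {
            // Non-spacing acute: remember it until the base letter arrives.
            0xC2 => pending_acute = true,
            // Printable ASCII: emit the base, then any pending combining mark.
            0x20..=0x7E => {
                out.push(b as char);
                if pending_acute {
                    out.push('\u{0301}');
                    pending_acute = false;
                }
            }
            // Not modelled here; a full decoder would consult the table.
            _ => {
                out.push('\u{FFFD}');
                pending_acute = false;
            }
        }
    }
    // A trailing 0xC2 with no base letter is silently dropped in this sketch.
    out
}
```

The output is in decomposed form; producing the precomposed character would need a composition table or a normalization step, which is left out here.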
DvbText wraps the raw wire bytes and decodes only on demand. Parsing stays
zero-copy; decoding happens only when you call DvbText::decode, format the
value via Display, or serialize it via serde:
use dvb_si::text::{DvbText, LangCode};
// Leading 0x15 is the Annex A UTF-8 selector; "café" follows.
let name = DvbText::new(&[0x15, b'c', b'a', b'f', 0xC3, 0xA9]);
assert_eq!(name.decode(), "café");
assert_eq!(name.raw(), &[0x15, b'c', b'a', b'f', 0xC3, 0xA9]); // selector kept
// A selector-less default-Latin (ISO 6937) sequence: combining acute + e → é.
assert_eq!(DvbText::new(&[0xC2, b'e']).decode(), "é");
// LangCode is 3 raw bytes (ISO 639-2 / ISO 3166) decoded lossily on demand.
assert_eq!(LangCode(*b"fre").as_str(), "fre");
Structs
- DvbText
- Borrowed DVB-encoded text (EN 300 468 Annex A). Wraps the raw selector +
body bytes; decoding happens only on DvbText::decode, Display, or serde,
never in the parse hot path.
- LangCode
- ISO 639-2 language code or ISO 3166 country code (3 raw bytes).
Functions
- decode
- Convenience wrapper returning Cow::Borrowed for pure-ASCII input,
Cow::Owned otherwise.
- decode_dvb_string
- Decode a DVB text payload (e.g. short_event_descriptor event_name_char)
into an owned UTF-8 String. The first byte may be a charset indicator per
ETSI EN 300 468 Annex A Table A.3.
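
The borrowed-versus-owned split described for decode can be sketched with std alone. This is an assumption-laden illustration rather than the crate's code: `decode_sketch` is a hypothetical name, "pure ASCII" is taken to mean every byte lies in 0x20..=0x7E (so a leading selector byte forces the owned path), and only the UTF-8 selector is actually decoded.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a Cow-returning decoder; not the crate's `decode`.
fn decode_sketch(bytes: &[u8]) -> Cow<'_, str> {
    // Fast path: printable ASCII is already valid UTF-8, so borrow the input.
    if bytes.iter().all(|&b| matches!(b, 0x20..=0x7E)) {
        return Cow::Borrowed(
            std::str::from_utf8(bytes).expect("printable ASCII is valid UTF-8"),
        );
    }
    match bytes {
        // 0x15 selector: the body is UTF-8; invalid sequences become U+FFFD.
        [0x15, body @ ..] => Cow::Owned(String::from_utf8_lossy(body).into_owned()),
        // Placeholder for the table-driven charsets: keep printable ASCII and
        // replace every other byte with U+FFFD.
        _ => Cow::Owned(
            bytes
                .iter()
                .map(|&b| if matches!(b, 0x20..=0x7E) { b as char } else { '\u{FFFD}' })
                .collect(),
        ),
    }
}
```

The fast path avoids an allocation for the common case of plain ASCII service and event names, and callers that always need a String can call into_owned on the result.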