pub struct EdifactTokenizer { /* private fields */ }
Tokenizes raw EDIFACT byte input into segment strings.
Handles release character escaping, whitespace normalization (strips \r and \n), and UNA segment detection.
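To illustrate UNA segment detection, here is a minimal stand-alone sketch. It relies on the standard UNA layout: the three letters `UNA` followed by six service characters (component separator, element separator, decimal mark, release character, a reserved character, and the segment terminator). The `Delims` struct and `detect_una` function are hypothetical names used only for this example; they are not this crate's `EdifactDelimiters` or its detection logic.

```rust
/// Hypothetical container for the service characters read from a UNA segment.
/// This is NOT the crate's `EdifactDelimiters` type.
#[derive(Debug, PartialEq)]
struct Delims {
    component: u8,
    element: u8,
    decimal: u8,
    release: u8,
    segment: u8,
}

/// Returns the delimiters announced by a leading UNA segment, if one is present.
/// Hypothetical helper, assuming the input starts exactly at the UNA segment.
fn detect_una(input: &[u8]) -> Option<Delims> {
    // "UNA" plus six service characters is nine bytes in total.
    if input.len() >= 9 && input.starts_with(b"UNA") {
        Some(Delims {
            component: input[3],
            element: input[4],
            decimal: input[5],
            release: input[6],
            // input[7] is the reserved position (normally a space).
            segment: input[8],
        })
    } else {
        None
    }
}
```

When no UNA segment is present, a caller would typically fall back to the default service characters (`:`, `+`, `.`, `?`, `'`).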
Implementations
impl EdifactTokenizer
pub fn new(delimiters: EdifactDelimiters) -> Self
Creates a new tokenizer with the given delimiters.
pub fn delimiters(&self) -> &EdifactDelimiters
Returns the delimiters used by this tokenizer.
pub fn tokenize_segments<'a>(&self, input: &'a [u8]) -> SegmentIter<'a>
Tokenizes EDIFACT input into segment strings.
Splits on segment terminator, respecting release character escaping.
Strips \r and \n characters from the input (EDIFACT uses them
only for readability).
Each yielded string is a segment WITHOUT its terminator character.
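The splitting rule described above can be sketched as a small stand-alone function. This is an illustration only, not the crate's implementation: `split_segments` is a hypothetical name, it allocates a `Vec<String>` rather than returning a borrowing `SegmentIter`, and it assumes the release character is kept in the output so that later stages can still see the escape.

```rust
/// Hypothetical helper: splits raw EDIFACT bytes into segments.
/// Assumption: release characters are preserved, not unescaped, at this level.
fn split_segments(input: &[u8], terminator: u8, release: u8) -> Vec<String> {
    let mut segments = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    let mut escaped = false;

    for &b in input {
        if escaped {
            // The byte following a release character is always literal.
            current.push(b);
            escaped = false;
        } else if b == release {
            // Keep the release character and mark the next byte as escaped.
            current.push(b);
            escaped = true;
        } else if b == b'\r' || b == b'\n' {
            // Line breaks are only for readability; drop them.
        } else if b == terminator {
            // An unescaped terminator ends the segment; the terminator is not stored.
            segments.push(String::from_utf8_lossy(&current).into_owned());
            current.clear();
        } else {
            current.push(b);
        }
    }

    // Emit trailing data that lacks a final terminator.
    if !current.is_empty() {
        segments.push(String::from_utf8_lossy(&current).into_owned());
    }
    segments
}
```

With `'` as terminator and `?` as release character, the input `UNH+1+ORDERS'` followed by a line break and `FTX+AAI+++It?'s ok'` yields two segments, and the escaped `?'` stays inside the second one.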
pub fn tokenize_elements<'a>(&self, segment: &'a str) -> ElementIter<'a>
Tokenizes a segment string into data elements.
Splits on element separator, preserving release character escaping (unescaping happens at the component level).
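A minimal sketch of splitting that preserves escapes, assuming `+` and `?` are passed as the element separator and release character. `split_elements` is a hypothetical stand-alone function that returns borrowed slices in a `Vec`; the crate's `ElementIter` may work differently.

```rust
/// Hypothetical helper: splits a segment on unescaped element separators.
/// Release characters are left in place; nothing is unescaped here.
fn split_elements(segment: &str, separator: char, release: char) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut start = 0;
    let mut escaped = false;

    for (i, c) in segment.char_indices() {
        if escaped {
            // This character was escaped, so it can never act as a separator.
            escaped = false;
        } else if c == release {
            escaped = true;
        } else if c == separator {
            parts.push(&segment[start..i]);
            start = i + c.len_utf8();
        }
    }

    // The remainder after the last separator is the final element.
    parts.push(&segment[start..]);
    parts
}
```

Because nothing is unescaped, every element is a plain sub-slice of the input and no allocation per element is needed. Empty elements between consecutive separators are kept as empty strings.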
pub fn tokenize_components<'a>(&self, element: &'a str) -> ComponentIter<'a>
Tokenizes a data element into components.
Splits on component separator and unescapes release character sequences.
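The combined split-and-unescape step might look like the following sketch. `split_components` is a hypothetical stand-alone function, not the crate's `ComponentIter`; it returns owned `String`s because removing release characters changes the text.

```rust
/// Hypothetical helper: splits an element on unescaped component separators
/// and removes release characters from the resulting components.
fn split_components(element: &str, separator: char, release: char) -> Vec<String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut escaped = false;

    for c in element.chars() {
        if escaped {
            // Emit the escaped character literally, without its release character.
            current.push(c);
            escaped = false;
        } else if c == release {
            // Drop the release character itself and escape the next character.
            escaped = true;
        } else if c == separator {
            // Move the finished component out and start a fresh one.
            parts.push(std::mem::take(&mut current));
        } else {
            current.push(c);
        }
    }

    parts.push(current);
    parts
}
```

With `:` as separator and `?` as release character, the element `ZZZ:a?:b:??` splits into `ZZZ`, `a:b`, and `?`: the escaped colon stays inside the second component and the doubled release character collapses to a single literal `?`.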