pub struct Parser { /* private fields */ }
Sans-IO tar archive parser.
This parser operates as a state machine on &[u8] input slices.
It does not perform any I/O itself - the caller is responsible for
providing data and handling the parsed events.
§Usage
The caller feeds header bytes to parse(). On Entry, the caller
reads/skips entry.size bytes of content (plus padding to the next
512-byte boundary) from its own I/O source, then calls parse()
again with the next header bytes. The parser does not see or track
content bytes.
    let mut parser = Parser::new(Limits::default());
    let mut buf = vec![0u8; 65536];
    let mut filled = 0;
    loop {
        match parser.parse(&buf[..filled]) {
            Ok(ParseEvent::NeedData { min_bytes }) => {
                let n = read_more(&mut buf[filled..])?;
                filled += n;
                if n == 0 && filled < min_bytes {
                    return Err("unexpected EOF");
                }
            }
            Ok(ParseEvent::Entry { consumed, entry }) => {
                process_entry(&entry);
                // Read/skip entry.size bytes + padding, then clear buf
                skip_content(entry.padded_size())?;
                filled = 0;
            }
            Ok(ParseEvent::End { .. }) => break,
            Err(e) => return Err(e),
        }
    }
Implementations§
impl Parser
pub fn set_allow_empty_path(&mut self, allow: bool)
Allow entries with empty paths instead of rejecting them with
ParseError::EmptyPath.
pub fn set_verify_checksums(&mut self, verify: bool)
Control whether header checksums are verified during parsing.
When set to false, the parser skips Header::verify_checksum
calls, accepting headers regardless of their checksum field. This
is primarily useful for fuzz testing, where random input almost
never produces valid checksums, preventing the fuzzer from reaching
deeper parser code paths.
Default: true.
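To illustrate why random input rarely passes this check, here is a minimal sketch of the conventional tar header checksum: the unsigned sum of all 512 header bytes, with the 8-byte checksum field (offsets 148..156) counted as ASCII spaces and the stored value encoded in octal. The function names are hypothetical, and this is not necessarily how Header::verify_checksum is implemented (some implementations also accept a signed-byte sum, for example).

```rust
use std::ops::Range;

/// Length of a tar header block.
const HEADER_LEN: usize = 512;
/// Byte range of the checksum field inside the header.
const CHKSUM_RANGE: Range<usize> = 148..156;

/// Sum every header byte as unsigned, counting the checksum field as spaces.
fn compute_checksum(header: &[u8; HEADER_LEN]) -> u32 {
    header
        .iter()
        .enumerate()
        .map(|(i, &b)| {
            if CHKSUM_RANGE.contains(&i) {
                u32::from(b' ')
            } else {
                u32::from(b)
            }
        })
        .sum()
}

/// Read the stored checksum: optional leading spaces, then octal digits.
/// Returns `None` when the field holds no octal digits at all.
fn stored_checksum(header: &[u8; HEADER_LEN]) -> Option<u32> {
    let digits: Vec<u8> = header[CHKSUM_RANGE]
        .iter()
        .copied()
        .skip_while(|&b| b == b' ')
        .take_while(|b| (b'0'..=b'7').contains(b))
        .collect();
    if digits.is_empty() {
        return None;
    }
    let text = std::str::from_utf8(&digits).ok()?;
    u32::from_str_radix(text, 8).ok()
}

/// True when the stored checksum equals the computed one.
fn checksum_matches(header: &[u8; HEADER_LEN]) -> bool {
    stored_checksum(header) == Some(compute_checksum(header))
}
```

A header filled with arbitrary bytes would have to contain, in its checksum field, the octal encoding of its own byte sum, which is why a fuzzer seldom gets past this check unless verification is disabled.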
pub fn set_ignore_pax_errors(&mut self, ignore: bool)
Control whether malformed PAX extension values are silently ignored.
When set to true, PAX values that fail to parse (invalid UTF-8,
unparseable integers for uid, gid, size, mtime) are skipped
instead of producing ParseError::InvalidPaxValue errors. This
matches the lenient behavior of many real-world tar implementations.
Default: false (malformed values produce errors).
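The difference between the two modes can be sketched with a standalone helper. Everything here is hypothetical and only illustrates the documented behavior: `parse_pax_u64` and the `InvalidPaxValue` struct are stand-ins, not this crate's internals, and the sketch covers only plain unsigned integer fields such as uid, gid, or size.

```rust
/// Hypothetical stand-in for `ParseError::InvalidPaxValue`.
#[derive(Debug, PartialEq)]
struct InvalidPaxValue {
    key: String,
}

/// Parse a PAX value as an unsigned decimal integer.
///
/// - A well-formed value yields `Ok(Some(n))`.
/// - A malformed value (invalid UTF-8 or not an integer) yields
///   `Ok(None)` when `ignore_errors` is true, so the caller can fall
///   back to another source for the field.
/// - A malformed value yields `Err(..)` when `ignore_errors` is false.
fn parse_pax_u64(
    key: &str,
    raw: &[u8],
    ignore_errors: bool,
) -> Result<Option<u64>, InvalidPaxValue> {
    let parsed = std::str::from_utf8(raw)
        .ok()
        .and_then(|s| s.parse::<u64>().ok());
    match parsed {
        Some(value) => Ok(Some(value)),
        None if ignore_errors => Ok(None),
        None => Err(InvalidPaxValue { key: key.to_owned() }),
    }
}
```

With `ignore_errors` set, a record such as `uid=abc` is simply skipped; with the default strict setting, the same record is reported as an error.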
pub fn with_defaults() -> Self
Create a new parser with default limits.
pub fn parse<'a>(&mut self, input: &'a [u8]) -> Result<ParseEvent<'a>>
Parse the next event from the input buffer.
Returns a ParseEvent on success. Entry and End events include
a consumed field indicating how many bytes were consumed from the
input; the caller should advance past that many bytes in their buffer.
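One way to advance past consumed bytes, for a caller that keeps unread bytes in a fixed buffer with a `filled` counter, is to shift the remainder to the front. This is a minimal sketch; the `advance` helper is hypothetical and not part of this crate.

```rust
/// Drop the first `consumed` bytes of the filled region and move the
/// remaining bytes to the start of the buffer.
///
/// `filled` is the number of valid bytes in `buf` and is updated in place.
fn advance(buf: &mut [u8], filled: &mut usize, consumed: usize) {
    assert!(
        consumed <= *filled,
        "cannot consume more bytes than the buffer holds"
    );
    // Shift the unconsumed tail down to index 0.
    buf.copy_within(consumed..*filled, 0);
    *filled -= consumed;
}
```

A caller would invoke `advance(&mut buf, &mut filled, consumed)` after an Entry or End event and leave the buffer untouched after NeedData, since that event consumes nothing. Resetting the counter to zero is enough when the buffer only ever holds the bytes the parser just consumed.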
§Events
- NeedData { min_bytes }: Need at least min_bytes more data (nothing consumed)
- Entry { consumed, entry }: A complete entry header; caller must handle content
- End { consumed }: Archive is complete
After receiving an Entry event, the caller is responsible for
reading or skipping entry.size bytes of content (plus padding to
the next 512-byte boundary) before calling parse() again.
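The number of bytes to read or skip is the content size rounded up to a multiple of 512. The arithmetic can be sketched as below; `padded_content_len` and `padding_len` are hypothetical helper names, and the entry.padded_size() call in the usage example presumably returns the same quantity.

```rust
/// Size of a tar block in bytes.
const BLOCK_SIZE: u64 = 512;

/// Content size rounded up to the next 512-byte boundary.
///
/// Note: this overflows for sizes within 511 bytes of `u64::MAX`;
/// a production version would use checked arithmetic.
fn padded_content_len(size: u64) -> u64 {
    size.div_ceil(BLOCK_SIZE) * BLOCK_SIZE
}

/// Number of padding bytes that follow `size` bytes of content.
fn padding_len(size: u64) -> u64 {
    padded_content_len(size) - size
}
```

For example, a 100-byte file occupies one full block (412 bytes of padding), an empty file occupies no content blocks, and a file of exactly 512 bytes needs no padding.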