Low level framing of tar streams.
This crate provides two APIs:
- stream is a low-level, lossless per-block framing API.
- logical is a medium-level, assembled member reader API.
stream provides the basic state machine enforcement for a tar
stream, including ensuring that any given stream is either strictly
pax or GNU and not a mix of the two. logical is layered on top
of stream and provides APIs for accessing the “effective” metadata
for each assembled member.
This crate tries to faithfully extract pax or GNU entries without mixing the two. See the sections below for compatibility notes.
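To make the "strictly pax or GNU" rule concrete, here is a minimal, self-contained sketch of how such enforcement could work. It does not use tar-framing's API: `Family`, `MixedFamilies`, and `FamilyTracker` are hypothetical names invented for illustration, and the sketch assumes the family is inferred from extension typeflags alone (`'x'`/`'g'` for pax, `'L'`/`'K'` for GNU).

```rust
/// Hypothetical archive family, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Family {
    Pax,
    Gnu,
}

/// Hypothetical error raised when a stream mixes both families.
#[derive(Debug, PartialEq, Eq)]
pub struct MixedFamilies {
    pub established: Family,
    pub offending_typeflag: u8,
}

/// Tracks which family, if any, the stream has committed to.
#[derive(Debug, Default)]
pub struct FamilyTracker {
    established: Option<Family>,
}

impl FamilyTracker {
    /// Observe one header typeflag and return the family established so far.
    pub fn observe(&mut self, typeflag: u8) -> Result<Option<Family>, MixedFamilies> {
        // Assumption: only extension headers commit the stream to a family.
        let implied = match typeflag {
            b'x' | b'g' => Some(Family::Pax), // pax extended headers
            b'L' | b'K' => Some(Family::Gnu), // GNU longname / longlink
            _ => None,                        // ordinary members fit either family
        };
        match (self.established, implied) {
            (Some(have), Some(want)) if have != want => Err(MixedFamilies {
                established: have,
                offending_typeflag: typeflag,
            }),
            (None, Some(want)) => {
                self.established = Some(want);
                Ok(self.established)
            }
            _ => Ok(self.established),
        }
    }
}
```

Feeding the typeflags `'0'`, `'x'`, `'5'`, `'L'` to one tracker in that order leaves the family undecided after the first header, commits to pax on the second, and fails on the last because a GNU extension appears in a pax stream.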
§pax compatibility
When decoding pax-formatted tar streams, tar-framing attempts to conform to pax as specified in POSIX.1-2024, i.e. “issue 8” of the POSIX specification. See the pax specification for full details.
However, there are a few small deviations from a pedantic reading of POSIX.1-2024 that are worth noting:
- tar-framing permits a ctime pax record, despite not being specified in POSIX.1-2024. The ctime record was removed from pax in POSIX.1-2004 (which is itself a minor edit of POSIX.1-2001). However, many real-world pax archives still contain it, and its presence does not compromise or introduce ambiguity during framing.
- tar-framing rejects directory entries (typeflag '5') that present a nonzero size in their ustar header or pax size record. pax says that this size should be treated as a filesystem allocation hint rather than a physical size, but real-world parsers vary widely in how they handle it (some ignore it, others skip over that number of bytes, etc.).
- tar-framing rejects regular file entries (typeflag '0' or '\0') that include a trailing slash (e.g. foo.txt/). pax is ambiguous about how to handle these cases: it notes that pre-ustar tar had no directory entry typeflag and thus a trailing slash was used to indicate a directory by convention, but does not prescribe that pax implementors honor this legacy behavior. We choose to reject it since it presents the same directory size problem mentioned above.
- tar-framing rejects negative timestamps as well as timestamps that would exceed the range of a u64. pax allows both of these, although it notes that portable timestamps cannot be negative and that tools may reject such timestamps.
- tar-framing silently truncates parsed timestamps to second precision, discarding any fractional component.
- tar-framing rejects typeflags that are not explicitly defined in pax. pax says to handle these as regular files (i.e. assuming their size is a physical size), but this has marginal benefit in practice.
- tar-framing rejects hdrcharset pax records that aren’t UTF-8 or BINARY. pax says that “additional names may be agreed between the originator and the recipient,” but we are the recipient and we don’t accept any other hdrcharset names.
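The timestamp rules above (no negative values, nothing beyond a `u64`, fractions discarded) can be sketched in a few lines. This is an illustrative sketch, not tar-framing's implementation: `parse_pax_seconds` and `TimestampError` are hypothetical names, and the strictness toward malformed input such as an empty fraction is an assumption of the sketch.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum TimestampError {
    Negative,
    Malformed,
    OutOfRange,
}

/// Parse a pax time value such as `1700000000.123456789` into whole seconds.
pub fn parse_pax_seconds(value: &str) -> Result<u64, TimestampError> {
    if value.starts_with('-') {
        return Err(TimestampError::Negative);
    }
    // Split off the optional fractional part; it is validated, then discarded.
    let (whole, fraction) = match value.split_once('.') {
        Some((w, f)) => (w, Some(f)),
        None => (value, None),
    };
    let all_digits = |s: &str| !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    // Assumption: an empty or non-numeric fraction is treated as malformed.
    if !all_digits(whole) || fraction.map_or(false, |f| !all_digits(f)) {
        return Err(TimestampError::Malformed);
    }
    // Only ASCII digits remain, so a parse failure here can only mean overflow.
    whole.parse::<u64>().map_err(|_| TimestampError::OutOfRange)
}
```

With this sketch, `1700000000.987654321` yields `1700000000` whole seconds, `-1` is rejected as negative, and `18446744073709551616` (one past `u64::MAX`) is rejected as out of range.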
§GNU compatibility
When decoding GNU-formatted tar streams, tar-framing attempts to follow the “Basic Tar Format” in the GNU docs. Specifically, tar-framing attempts to follow the rules for the “old GNU” format, i.e. GNU tar’s non-pax format.
tar-framing intentionally only supports a subset of the GNU tar format:
- The GNU “longname” and “longlink” ('L' and 'K') typeflags are supported, with similar path-precedence semantics as their pax record equivalents.
- Other GNU-specific typeflags are not supported whatsoever, and produce a framing error. This includes sparse files ('S') and multivolume headers ('M').
- tar-framing accepts the GNU-specific “base-256” encoding for numbers, but rejects negative encodings as well as any value that would exceed the range of a u64. tar-framing also allows “base-256” encodings where the numeric value would fit into an octal encoding in the allotted buffer/byte span; GNU technically says that this is reserved for future use.
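As an illustration of the base-256 rules above, the following sketch decodes a numeric header field whose first byte has the high bit set, treating the remaining bits as a big-endian two's-complement integer. `decode_base256` and `Base256Error` are hypothetical names, not part of tar-framing, and the sketch makes no claim about how the crate itself structures this check.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Base256Error {
    NotBase256,
    Negative,
    OutOfRange,
}

/// Decode a GNU base-256 numeric field into a `u64`.
pub fn decode_base256(field: &[u8]) -> Result<u64, Base256Error> {
    let (&first, rest) = field.split_first().ok_or(Base256Error::NotBase256)?;
    // The high bit of the first byte marks the field as base-256.
    if first & 0x80 == 0 {
        return Err(Base256Error::NotBase256);
    }
    // Bit 0x40 is the sign bit of the big-endian two's-complement payload.
    if first & 0x40 != 0 {
        return Err(Base256Error::Negative);
    }
    let mut value = u64::from(first & 0x3f);
    for &byte in rest {
        // Shifting left by one byte must not discard any set bits.
        if value >> 56 != 0 {
            return Err(Base256Error::OutOfRange);
        }
        value = (value << 8) | u64::from(byte);
    }
    Ok(value)
}
```

Note that the sketch deliberately accepts small values that would also fit in octal, mirroring the leniency described above: a 12-byte field of `0x80` followed by zeros and ending in `0x02 0x00` decodes to 512.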
§General compatibility
Because pax and GNU both use ustar as their baseline, any compatibility aspect of pax that is derived from ustar also applies during GNU tar decoding.
tar-framing accepts wholly NUL mode, uid, gid, and mtime fields by default for
compatibility with real-world writers in both families. These fields are represented as
missing rather than assigned a value. This can be disabled with
stream::TarStream::set_allow_all_nul_numeric_fields.
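The "missing rather than assigned a value" behavior can be pictured as an octal field parser that returns `Option<u64>`. The sketch below is an assumption-laden illustration: `parse_octal_field`, `NumericFieldError`, and the `allow_all_nul` parameter are hypothetical stand-ins, and the actual signature of `set_allow_all_nul_numeric_fields` is not shown here because the text above does not specify it.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum NumericFieldError {
    AllNulNotAllowed,
    Malformed,
}

/// Parse an octal ustar numeric field; a wholly NUL field maps to `None`.
pub fn parse_octal_field(
    field: &[u8],
    allow_all_nul: bool,
) -> Result<Option<u64>, NumericFieldError> {
    if field.iter().all(|&b| b == 0) {
        return if allow_all_nul {
            Ok(None) // represented as missing, not as zero
        } else {
            Err(NumericFieldError::AllNulNotAllowed)
        };
    }
    // Drop trailing NUL/space terminators and leading space padding.
    let end = field
        .iter()
        .rposition(|&b| b != 0 && b != b' ')
        .map_or(0, |i| i + 1);
    let start = field.iter().position(|&b| b != b' ').unwrap_or(end);
    let digits = &field[start.min(end)..end];
    if digits.is_empty() {
        return Err(NumericFieldError::Malformed);
    }
    let mut value: u64 = 0;
    for &b in digits {
        if !(b'0'..=b'7').contains(&b) {
            return Err(NumericFieldError::Malformed);
        }
        // Checked arithmetic so oversized fields fail instead of wrapping.
        value = value
            .checked_mul(8)
            .and_then(|v| v.checked_add(u64::from(b - b'0')))
            .ok_or(NumericFieldError::Malformed)?;
    }
    Ok(Some(value))
}
```

Returning `Ok(None)` keeps "the writer left this field blank" distinct from "the writer stored zero", which matters for fields like mtime where zero is a meaningful value.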
Separately, higher-level crates (like tar-codec) may choose to apply additional restrictions when processing logical archive members. For example, a consumer of tar-framing may choose to reject vendor-specific pax records, or member names that contain forbidden characters, or any other additional restriction.
Modules§
- header
- logical: Member-oriented reading above the lossless physical frame stream.
- stream: Lossless, block-oriented tar streaming.
- write: Strict POSIX-pax block construction.
Structs§
- FrameError: An error encountered at an absolute position in a tar stream.
- PaxExtension: One positioned parsed pax extended header.
- PaxState: Unified pax metadata state applicable to one ordinary member.
Enums§
- ArchiveFormat: An automatically detected, mutually exclusive tar archive family.
- FrameErrorInner: Specific errors that can occur while processing tar frames.
- GnuKind: The supported GNU metadata extension kinds.
- HdrCharset: A character encoding for PAX pathname and user/group-name values.
- PaxError: An error encountered while parsing pax extended-header records.
- PaxKeyword: An owned, hashable pax extended-header keyword.
- PaxKind: The scope of a pax extended header.
- PaxRecord: A parsed pax extended-header record.
- PaxString: A character value governed by the effective PAX HdrCharset.
- PaxValue: A parsed pax value, including an explicit deletion tombstone.
- UstarKind: A supported ordinary ustar member type.
Constants§
- BLOCK_SIZE: The size of a logical tar record.
- DEFAULT_MAX_GLOBAL_PAX_EXTENSIONS_SIZE: The default maximum cumulative size of global pax extensions before one member.
- DEFAULT_MAX_GNU_EXTENSION_SIZE: The default maximum size in bytes of one GNU metadata extension.
- DEFAULT_MAX_PAX_EXTENSION_SIZE: The default maximum size in bytes of one local or global pax extension.
Type Aliases§
- Block: A single tar block.