pub struct SectionReader<R: ?Sized> { /* private fields */ }
The wrapper type for reading sections from a random access reader.
The inner type should implement positioned_io::ReadAt to support
efficient random access. Typically, std::fs::File should be used.
You do NOT need additional buffering.
Note: Using positioned_io::RandomAccessFile is discouraged on *NIX
platforms because it disables readahead, which can hurt the performance of
sequential reads within a section that is several MiB in size.
On Windows, however, RandomAccessFile is several times faster than File.
Implementations§
impl<R> SectionReader<R>
pub fn new(rdr: R) -> Self
Create a new section reader wrapping an existing random access stream,
typically std::fs::File.
You should NOT use BufReader, because sections
are large enough and high-level abstractions like
Archive already have internal caching.
pub fn new_with_offset(rdr: R, archive_start: u64) -> Self
Same as Self::new, but indicates that the DwarFS archive starts at
archive_start in rdr instead of at the start. This offset is also known as
image_offset.
All read methods of SectionReader add archive_start to their offset
parameter to obtain the real file offset where necessary.
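As a rough illustration of the offset translation described above, the following sketch adds archive_start to an archive-relative offset with overflow checking. The helper name file_offset is hypothetical and is not part of this crate's API.

```rust
/// Translate an archive-relative offset into an absolute offset in the
/// underlying reader. `file_offset` is a hypothetical helper used only for
/// illustration. Returns `None` if the sum does not fit in a `u64`.
pub fn file_offset(archive_start: u64, section_offset: u64) -> Option<u64> {
    archive_start.checked_add(section_offset)
}
```

For example, with an archive embedded 4096 bytes into a file, file_offset(4096, 64) gives Some(4160), while file_offset(u64::MAX, 1) gives None.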
impl<R: ?Sized> SectionReader<R>
impl<R: ReadAt + ?Sized> SectionReader<R>
pub fn archive_start(&self) -> u64
Get the archive_start set on creation.
pub fn read_section_at(&mut self, section_offset: u64, payload_size_limit: usize) -> Result<(Header, Vec<u8>), Error>
Read and decompress a full section at section_offset into memory.
This is a shortcut for calling read_header_at followed by
read_payload_at.
§Errors
See read_header_at and read_payload_at.
pub fn read_header_at(&mut self, section_offset: u64) -> Result<Header, Error>
Read a section header at section_offset.
§Errors
Returns Err if the section offset overflows, the underlying read operation
fails, the header magic is invalid, or the header's DwarFS version is unsupported.
pub fn read_payload_at(&mut self, header: &Header, payload_offset: u64, payload_size_limit: usize) -> Result<Vec<u8>, Error>
Read and decompress the section payload of the given header into an owned Vec<u8>.
Same as read_payload_at_into but returns
a Vec<u8> for convenience.
§Errors
See read_payload_at_into.
pub fn read_payload_at_into(&mut self, header: &Header, payload_offset: u64, out: &mut [u8]) -> Result<usize, Error>
Read and decompress the section payload of the given header into a buffer.
payload_offset is the offset of the body of a section (after the header),
from the start of the archive. Both the compressed and the decompressed size
must not exceed out.len(), or an error is returned.
§Errors
Returns Err if any of the following occurs:
- Payload offset overflows.
- Payload size exceeds the limit.
- The underlying read operation fails.
- Fast checksum (XXH3-64) of payload disagrees with the header.
- Decompression fails. This includes decompressed size exceeding the limit.
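The first two error conditions can be pictured as checks made before any bytes are read. The sketch below is a minimal illustration under assumptions: SketchError and plan_payload_read are hypothetical names, and the real implementation may order or structure these checks differently.

```rust
use std::ops::Range;

/// Hypothetical error type for this sketch only; it is not the crate's `Error`.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    OffsetOverflow,
    SizeExceedsLimit,
}

/// Check that a compressed payload of `compressed_len` bytes located at
/// `payload_offset` (relative to the archive start) fits into a buffer of
/// `out_len` bytes, and compute the absolute byte range to read.
pub fn plan_payload_read(
    archive_start: u64,
    payload_offset: u64,
    compressed_len: u64,
    out_len: usize,
) -> Result<Range<u64>, SketchError> {
    // The compressed payload must fit in the caller's buffer.
    if compressed_len > out_len as u64 {
        return Err(SketchError::SizeExceedsLimit);
    }
    // Both ends of the absolute range must be representable as `u64`.
    let start = archive_start
        .checked_add(payload_offset)
        .ok_or(SketchError::OffsetOverflow)?;
    let end = start
        .checked_add(compressed_len)
        .ok_or(SketchError::OffsetOverflow)?;
    Ok(start..end)
}
```

The checksum and decompression checks are left out here because they depend on the payload bytes themselves.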
pub fn build_section_index(&mut self, stream_len: u64, size_limit: usize) -> Result<Vec<SectionIndexEntry>, Error>
Construct the section index by traversing all sections.
This traverses sections one by one from archive_start to the end
of the stream. All headers are parsed and validated, but their payloads
are not.
Note: This may be very costly for large archives or on HDDs because it performs many seeks on the disk.
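To make the header-only traversal concrete, here is a minimal sketch over an in-memory byte slice. It uses a toy section layout (an 8-byte magic followed by an 8-byte little-endian payload length), which is an assumption made for illustration and is NOT the real DwarFS header format. All names are hypothetical.

```rust
/// Toy header for this sketch: 8-byte magic + 8-byte little-endian payload
/// length. This is NOT the real DwarFS header layout.
const TOY_HEADER_LEN: u64 = 16;
const TOY_MAGIC: &[u8; 8] = b"TOYSECT\0";

/// Walk sections back to back from `archive_start` to the end of `data`,
/// reading only headers, and collect each section's offset relative to
/// `archive_start`. Returns `None` if a header is truncated, has a bad
/// magic, or the last section does not end exactly at the end of `data`.
pub fn toy_section_offsets(data: &[u8], archive_start: u64) -> Option<Vec<u64>> {
    let stream_len = data.len() as u64;
    let mut offsets = Vec::new();
    let mut pos = archive_start;
    while pos < stream_len {
        let header_end = pos.checked_add(TOY_HEADER_LEN)?;
        let header = data.get(pos as usize..header_end as usize)?;
        if header[..8] != TOY_MAGIC[..] {
            return None;
        }
        let mut len_bytes = [0u8; 8];
        len_bytes.copy_from_slice(&header[8..16]);
        let payload_len = u64::from_le_bytes(len_bytes);
        offsets.push(pos - archive_start);
        // Skip over the payload without reading it.
        pos = header_end.checked_add(payload_len)?;
    }
    if pos == stream_len {
        Some(offsets)
    } else {
        None
    }
}

/// Build one toy section, for experimenting with the walker above.
pub fn toy_section(payload: &[u8]) -> Vec<u8> {
    let mut out = TOY_MAGIC.to_vec();
    out.extend_from_slice(&(payload.len() as u64).to_le_bytes());
    out.extend_from_slice(payload);
    out
}
```

Each loop iteration touches a different, possibly distant, region of the input, which is why a traversal like this tends to be seek-bound on rotating disks.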
§Errors
Returns Err if it fails to parse or validate section headers (see
SectionReader::read_header_at), or if a section offset exceeds 48 bits,
which is not representable in the section index.
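The 48-bit limit follows from how an offset is packed into a 64-bit index entry. The sketch below assumes the section type occupies the upper 16 bits and the offset the lower 48 bits; this layout and the helper names are assumptions for illustration and are not confirmed by this page.

```rust
/// Assumed entry layout for this sketch: section type in the upper 16 bits,
/// section offset in the lower 48 bits of a `u64`.
const OFFSET_BITS: u32 = 48;
const OFFSET_MASK: u64 = (1 << OFFSET_BITS) - 1;

/// Pack a section type and offset into one index entry.
/// Returns `None` if the offset needs more than 48 bits.
pub fn pack_entry(section_type: u16, offset: u64) -> Option<u64> {
    if offset > OFFSET_MASK {
        return None;
    }
    Some((u64::from(section_type) << OFFSET_BITS) | offset)
}

/// Split an index entry back into `(section_type, offset)`.
pub fn unpack_entry(entry: u64) -> (u16, u64) {
    ((entry >> OFFSET_BITS) as u16, entry & OFFSET_MASK)
}
```

Under this layout the largest representable offset is 2^48 - 1, so any archive with a section starting at or beyond 256 TiB cannot be indexed.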
pub fn read_section_index(&mut self, stream_len: u64, payload_size_limit: usize) -> Result<Option<(Header, Vec<SectionIndexEntry>)>, Error>
Locate and read the section index, if there is any, with a limited payload size.
stream_len is the total size of the input reader R, which is
typically the whole file size.
§Detection behaviors
Since there is currently no reliable way to know whether a section index exists, the tail of the archive could look like an index purely by chance, or could be crafted intentionally to look like one. Currently we do a best-effort detection as follows, but it may change in the future.
- If the header of the first section indicates a DwarFS version without section index support, there must not be an index, and Ok(None) is returned.
- Otherwise, read 8 bytes at the end. If they do not look like a valid self-pointing SectionIndexEntry, Ok(None) is returned.
- If they seem to be valid, follow the offset and read a section header. The header should look like that of a valid section index whose section covers the trailing 8 bytes; otherwise Ok(None) is returned.
- The content of the section index is read. It should have a matching checksum and sorted entries with valid section types. If all checks pass, Ok(Some((header, section_index))) is returned; otherwise Ok(None) is returned.
  This should rule out a mocked offset pointing to a mocked section header that encloses multiple real sections: if a valid Header were placed inside the section index, its magic-version “DWARFSab” would be interpreted as an invalid section type, causing the index to be rejected.
See more discussion: https://github.com/mhx/dwarfs/issues/264
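The second detection step can be sketched roughly as follows. This is a simplified illustration, not the crate's actual check. It assumes, without confirmation from this page, that entries are little-endian u64 values with the section type in the upper 16 bits, that the section index type is 9, and that a section header is 64 bytes long. The function name is hypothetical.

```rust
/// Assumptions for this sketch (not taken from this page): the section index
/// type is 9 and a section header is 64 bytes long.
const ASSUMED_INDEX_TYPE: u16 = 9;
const ASSUMED_HEADER_LEN: u64 = 64;

/// Inspect the last 8 bytes of `archive` (which starts at the archive start)
/// and return the offset they point to if they could plausibly be the final,
/// self-pointing entry of a section index.
pub fn self_pointing_offset(archive: &[u8]) -> Option<u64> {
    let archive_len = archive.len() as u64;
    let tail_start = archive.len().checked_sub(8)?;
    let mut tail = [0u8; 8];
    tail.copy_from_slice(&archive[tail_start..]);
    let entry = u64::from_le_bytes(tail);

    // Assumed layout: type in the upper 16 bits, offset in the lower 48 bits.
    let section_type = (entry >> 48) as u16;
    let offset = entry & ((1u64 << 48) - 1);
    if section_type != ASSUMED_INDEX_TYPE {
        return None;
    }

    // The pointed-to section must leave room for a header plus a non-empty
    // payload made of whole 8-byte entries that ends at the archive end.
    let payload_start = offset.checked_add(ASSUMED_HEADER_LEN)?;
    let payload_len = archive_len.checked_sub(payload_start)?;
    if payload_len == 0 || payload_len % 8 != 0 {
        return None;
    }
    Some(offset)
}
```

A Some result only means the tail is a candidate. The header and content checks from the later steps are still needed before the index can be trusted.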
§Errors
Returns Err for hard errors from the underlying I/O.
Ok(None) is returned instead for soft errors that occur while
parsing a section index that may not exist.
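One way a caller might act on this three-way result is to propagate hard errors and fall back to rebuilding the index when none is found. The sketch below shows that pattern with std::io::Result and closures standing in for the real calls; index_or_rebuild is a hypothetical helper, and whether to fall back at all is the caller's decision rather than something this page prescribes.

```rust
use std::io;

/// Use the index returned by `lookup` if there is one, otherwise call
/// `rebuild`. Hard errors from either step are propagated unchanged.
/// Both closures are stand-ins for real index-reading and index-building calls.
pub fn index_or_rebuild<T>(
    lookup: impl FnOnce() -> io::Result<Option<T>>,
    rebuild: impl FnOnce() -> io::Result<T>,
) -> io::Result<T> {
    match lookup()? {
        // A usable index was found.
        Some(index) => Ok(index),
        // Soft failure: no trustworthy index, so build one another way.
        None => rebuild(),
    }
}
```

The `?` on lookup() is what keeps hard errors distinct: an Err short-circuits before the fallback is ever attempted.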