Struct SectionReader 

pub struct SectionReader<R: ?Sized> { /* private fields */ }

The wrapper type for reading sections from a random access reader.

The inner type should implement positioned_io::ReadAt to support efficient random access. Typically, std::fs::File should be used. You do NOT need additional buffering.

Note: Using positioned_io::RandomAccessFile is discouraged on *NIX platforms because it disables readahead, which can hurt the performance of sequential reads within a section that spans several MiB. On Windows, however, RandomAccessFile is several times faster than File.
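
To illustrate what positioned reads mean here, the following is a minimal sketch using only the standard library. The local `ReadAt` trait is a stand-in that assumes the general shape of positioned_io::ReadAt (reading at an absolute position through a shared reference), and `read_exact_at` is a hypothetical helper, not part of this crate.

```rust
use std::io;

/// Local stand-in for `positioned_io::ReadAt` (assumed shape: read at an
/// absolute position through `&self`, without moving any cursor).
trait ReadAt {
    fn read_at(&self, pos: u64, buf: &mut [u8]) -> io::Result<usize>;
}

/// In-memory implementation so the sketch runs without touching the disk.
impl ReadAt for [u8] {
    fn read_at(&self, pos: u64, buf: &mut [u8]) -> io::Result<usize> {
        // Clamp the start to the end of the data; reads past the end yield 0 bytes.
        let start = pos.min(self.len() as u64) as usize;
        let n = buf.len().min(self.len() - start);
        buf[..n].copy_from_slice(&self[start..start + n]);
        Ok(n)
    }
}

/// Hypothetical helper: fill `buf` completely or fail with `UnexpectedEof`.
fn read_exact_at<R: ReadAt + ?Sized>(rdr: &R, mut pos: u64, mut buf: &mut [u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match rdr.read_at(pos, buf)? {
            0 => return Err(io::ErrorKind::UnexpectedEof.into()),
            n => {
                pos += n as u64;
                buf = &mut buf[n..];
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let data: &[u8] = b"header..payload";
    let mut buf = [0u8; 7];
    // Two independent reads at arbitrary offsets; neither affects the other.
    read_exact_at(data, 8, &mut buf)?;
    assert_eq!(&buf, b"payload");
    read_exact_at(data, 0, &mut buf[..6])?;
    assert_eq!(&buf[..6], b"header");
    Ok(())
}
```

Because every read carries its own position, no seek state is shared between calls, which is why an extra buffering layer adds little for large section reads.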

Implementations§

impl<R> SectionReader<R>

pub fn new(rdr: R) -> Self

Create a new section reader wrapping an existing random access stream, typically a std::fs::File.

You should NOT use BufReader, because sections are large enough and high-level abstractions like Archive already have internal caching.

pub fn new_with_offset(rdr: R, archive_start: u64) -> Self

Same as Self::new, but indicates that the DwarFS archive is located at archive_start in rdr instead of at the start. This is also known as image_offset.

All read methods of SectionReader add archive_start to their offset parameter, where necessary, to obtain the real file offset.
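
The offset translation can be sketched as follows. `to_file_offset` is a hypothetical helper written for illustration; it only assumes that the file offset is the sum of archive_start and the archive-relative offset, and that an overflowing sum is treated as an error rather than wrapping.

```rust
/// Hypothetical helper: offsets passed to read methods are relative to the
/// archive, so the file offset is `archive_start + offset_in_archive`.
/// Returns `None` when the sum does not fit in a `u64`.
fn to_file_offset(archive_start: u64, offset_in_archive: u64) -> Option<u64> {
    archive_start.checked_add(offset_in_archive)
}

fn main() {
    // An archive embedded 4 KiB into a larger file.
    let archive_start = 4096;
    assert_eq!(to_file_offset(archive_start, 0), Some(4096));
    assert_eq!(to_file_offset(archive_start, 64), Some(4160));
    // Overflow is reported instead of wrapping around.
    assert_eq!(to_file_offset(archive_start, u64::MAX), None);
}
```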

impl<R: ?Sized> SectionReader<R>

pub fn get_ref(&self) -> &R

Get a reference to the underlying reader.

pub fn get_mut(&mut self) -> &mut R

Get a mutable reference to the underlying reader.

pub fn into_inner(self) -> R
where R: Sized,

Consume the SectionReader and return the underlying reader.

impl<R: ReadAt + ?Sized> SectionReader<R>

pub fn archive_start(&self) -> u64

Get the archive_start set on creation.

pub fn read_section_at( &mut self, section_offset: u64, payload_size_limit: usize, ) -> Result<(Header, Vec<u8>), Error>

Read and decompress a full section at section_offset into memory.

This is a shortcut for calling read_header_at and then read_payload_at.

§Errors

See read_header_at and read_payload_at.

pub fn read_header_at(&mut self, section_offset: u64) -> Result<Header, Error>

Read a section header at section_offset.

§Errors

Returns Err if the section offset overflows, the underlying read operation fails, the header magic is invalid, or the header's DwarFS version is unsupported.

pub fn read_payload_at( &mut self, header: &Header, payload_offset: u64, payload_size_limit: usize, ) -> Result<Vec<u8>, Error>

Read and decompress the section payload of the given header into an owned Vec<u8>.

Same as read_payload_at_into, but returns a Vec<u8> for convenience.

§Errors

See read_payload_at_into.

pub fn read_payload_at_into( &mut self, header: &Header, payload_offset: u64, out: &mut [u8], ) -> Result<usize, Error>

Read and decompress the section payload of the given header into a buffer.

payload_offset is the offset of the body of a section (after the header), from the start of the archive. Neither the compressed nor the decompressed size may exceed out.len(); otherwise an error is returned.

§Errors

Returns Err if any of the following occurs:

  • The payload offset overflows.
  • The payload size exceeds the limit.
  • The underlying read operation fails.
  • The fast checksum (XXH3-64) of the payload disagrees with the header.
  • Decompression fails. This includes the decompressed size exceeding the limit.
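
The size conditions above can be sketched as a small check. `check_sizes` and `SizeLimitExceeded` are hypothetical names, and returning the decompressed length is this sketch's choice rather than a statement about the method's return value. The only assumption taken from the description is that both sizes are compared against out.len().

```rust
#[derive(Debug, PartialEq)]
struct SizeLimitExceeded;

/// Hypothetical check: `compressed_len` is the payload length stored in the
/// header, `decompressed_len` is what the decompressor reports, and
/// `out_len` is `out.len()`.
fn check_sizes(
    compressed_len: u64,
    decompressed_len: u64,
    out_len: usize,
) -> Result<usize, SizeLimitExceeded> {
    let limit = out_len as u64;
    if compressed_len > limit || decompressed_len > limit {
        return Err(SizeLimitExceeded);
    }
    // Cannot truncate: `decompressed_len <= out_len`, which is a `usize`.
    Ok(decompressed_len as usize)
}

fn main() {
    // Both sizes fit into a 4 KiB buffer.
    assert_eq!(check_sizes(100, 400, 4096), Ok(400));
    // The compressed payload alone is already too large.
    assert_eq!(check_sizes(8192, 400, 4096), Err(SizeLimitExceeded));
    // The payload would expand beyond the buffer.
    assert_eq!(check_sizes(100, 8192, 4096), Err(SizeLimitExceeded));
}
```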

pub fn build_section_index( &mut self, stream_len: u64, size_limit: usize, ) -> Result<Vec<SectionIndexEntry>, Error>

Construct the section index by traversing all sections.

This traverses sections one by one from archive_start to the end of the stream. All headers are parsed and validated, but their payloads are not.

Note: This may be very costly for large archives or on HDDs, because it performs many disk seeks.

§Errors

Returns Err if it fails to parse or validate a section header (see SectionReader::read_header_at), or if a section offset exceeds 48 bits and is therefore not representable in the section index.
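
A simplified model of the traversal is sketched below. It assumes a fixed header size of 64 bytes (`HEADER_SIZE` is an assumption, not stated above) and takes the payload lengths as input instead of parsing real headers; `section_offsets` is a hypothetical function, not this crate's implementation.

```rust
/// Assumed on-disk size of a section header in bytes (not stated above).
const HEADER_SIZE: u64 = 64;
/// Largest offset representable in 48 bits.
const MAX_INDEX_OFFSET: u64 = (1 << 48) - 1;

/// Hypothetical, simplified traversal: given each section's payload length
/// (as a parsed header would report it), compute every section's offset
/// relative to the start of the archive. Returns `None` if an offset does
/// not fit in 48 bits or the arithmetic overflows.
fn section_offsets(payload_lens: &[u64]) -> Option<Vec<u64>> {
    let mut offsets = Vec::with_capacity(payload_lens.len());
    let mut offset = 0u64;
    for &len in payload_lens {
        if offset > MAX_INDEX_OFFSET {
            return None;
        }
        offsets.push(offset);
        // The next section starts right after this header and payload.
        offset = offset.checked_add(HEADER_SIZE)?.checked_add(len)?;
    }
    Some(offsets)
}

fn main() {
    // Three sections with payloads of 100, 0 and 36 bytes.
    assert_eq!(section_offsets(&[100, 0, 36]), Some(vec![0, 164, 228]));
    // A second section starting beyond 2^48 cannot be indexed.
    assert_eq!(section_offsets(&[1 << 48, 1]), None);
}
```

Each step depends on the length stored in the previous header, so the headers have to be read in order, one read per section.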

pub fn read_section_index( &mut self, stream_len: u64, payload_size_limit: usize, ) -> Result<Option<(Header, Vec<SectionIndexEntry>)>, Error>

Locate and read the section index, if there is any, with a limited payload size.

stream_len is the total size of the input reader R, which is typically the whole file size.

§Detection behaviors

Since there is currently no reliable way to know whether a section index exists, the tail could merely look like an index by chance, or be crafted to look like one intentionally. We currently perform a best-effort detection as follows, but it may change in the future.

  1. If the header of the first section indicates a DwarFS version without section index support, there must not be an index, and Ok(None) is returned.

  2. Otherwise, read 8 bytes at the end. If it does not look like a valid self-pointing SectionIndexEntry, Ok(None) is returned.

  3. If it seems to be valid, follow its offset and read a section header. The header should look like that of a valid section index whose section covers the trailing 8 bytes; otherwise Ok(None) is returned.

  4. The content of the section index is read. It should have a matching checksum and sorted entries with valid section types. If all checks pass, Ok(Some((header, section_index))) is returned; otherwise Ok(None) is returned.

    This should rule out the possibility of a forged offset with a forged section header that encloses multiple real sections. If a valid Header were placed inside the section index, its magic and version bytes “DWARFSab” would be interpreted as an invalid section type, causing the index to be rejected.

See more discussion: https://github.com/mhx/dwarfs/issues/264
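
Step 2 and the closing remark about header bytes can be sketched as below. The entry layout (little-endian u64, section type in the upper 16 bits, offset in the lower 48 bits), the type value 9 for a section index, the 64-byte header size, and the version bytes are all assumptions made for illustration. The check is deliberately weaker than the full procedure described above.

```rust
/// Assumed entry layout: little-endian u64, section type in the upper
/// 16 bits, offset from the start of the archive in the lower 48 bits.
const OFFSET_BITS: u32 = 48;
const OFFSET_MASK: u64 = (1 << OFFSET_BITS) - 1;
/// Assumed numeric value of the section-index section type.
const SECTION_INDEX_TYPE: u16 = 9;
/// Assumed size of a section header in bytes.
const HEADER_SIZE: u64 = 64;

/// Split a raw 8-byte entry into (section type, offset).
fn split_entry(raw: [u8; 8]) -> (u16, u64) {
    let v = u64::from_le_bytes(raw);
    ((v >> OFFSET_BITS) as u16, v & OFFSET_MASK)
}

/// Simplified step 2: do the trailing 8 bytes look like an index entry that
/// points at a section able to contain those very bytes?
fn looks_self_pointing(tail: [u8; 8], archive_start: u64, stream_len: u64) -> bool {
    let (ty, offset) = split_entry(tail);
    // The candidate section needs room for its header and at least one entry.
    let min_end = archive_start
        .checked_add(offset)
        .and_then(|o| o.checked_add(HEADER_SIZE + 8));
    ty == SECTION_INDEX_TYPE && min_end.map_or(false, |end| end <= stream_len)
}

fn main() {
    // An index section at offset 1000 holding exactly one entry: itself.
    let entry = (((SECTION_INDEX_TYPE as u64) << OFFSET_BITS) | 1000).to_le_bytes();
    assert!(looks_self_pointing(entry, 0, 1000 + HEADER_SIZE + 8));
    // Header magic plus two version bytes (values chosen for illustration):
    // read as an entry, the version bytes land in the section type field.
    let (ty, _) = split_entry(*b"DWARFS\x02\x05");
    assert_eq!(ty, 0x0502);
    assert!(!looks_self_pointing(*b"DWARFS\x02\x05", 0, u64::MAX));
}
```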

§Errors

Returns Err for hard errors from the underlying I/O.

Ok(None) is returned instead for soft errors that occur while parsing a section index that may not exist.
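
One way a caller might react to the three outcomes is sketched below. `read_index` and `build_index` are hypothetical stand-ins for read_section_index and build_section_index, `u64` stands in for an index entry, and std::io::Error stands in for the crate's Error type. Falling back to a rebuilt index is a design choice of this sketch, not something the description above prescribes.

```rust
use std::io;

/// Stand-in for `read_section_index`: `Ok(None)` models a soft failure
/// (no index, or something that merely resembles one).
fn read_index(tail_is_valid: bool) -> io::Result<Option<Vec<u64>>> {
    Ok(if tail_is_valid { Some(vec![0, 164]) } else { None })
}

/// Stand-in for `build_section_index`: walks all sections instead.
fn build_index() -> io::Result<Vec<u64>> {
    Ok(vec![0, 164])
}

/// Prefer the stored index, rebuild on a soft failure, and propagate hard
/// I/O errors to the caller via `?`.
fn load_index(tail_is_valid: bool) -> io::Result<(Vec<u64>, &'static str)> {
    match read_index(tail_is_valid)? {
        Some(index) => Ok((index, "stored")),
        None => Ok((build_index()?, "rebuilt")),
    }
}

fn main() -> io::Result<()> {
    let (index, origin) = load_index(false)?;
    println!("{} entries, {}", index.len(), origin);
    Ok(())
}
```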

Trait Implementations§

impl<R: Debug + ?Sized> Debug for SectionReader<R>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<R> Freeze for SectionReader<R>
where R: Freeze + ?Sized,

impl<R> RefUnwindSafe for SectionReader<R>
where R: RefUnwindSafe + ?Sized,

impl<R> Send for SectionReader<R>
where R: Send + ?Sized,

impl<R> Sync for SectionReader<R>
where R: Sync + ?Sized,

impl<R> Unpin for SectionReader<R>
where R: Unpin + ?Sized,

impl<R> UnwindSafe for SectionReader<R>
where R: UnwindSafe + ?Sized,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.