Struct Archive 

pub struct Archive {
    pub header: Header,
    pub metadata: Option<Vec<u8>>,
    /* private fields */
}

Read-only interface for accessing Hexz archive data.

Archive is the primary API for reading compressed, block-indexed archives. It handles:

  • Logical-to-Physical Mapping: Translates byte offsets to blocks via index pages.
  • Compression: Transparent decompression using LZ4 or Zstandard.
  • Encryption: Transparent decryption using AES-256-GCM.
  • Caching: Two-level caching (L1 decompressed blocks, L2 index pages).
  • Thin Archives: Resolves missing blocks from parent archives.
  • Prefetching: Asynchronous background loading of sequential blocks.

§Thread Safety

Archive is Send + Sync. All methods are thread-safe and use sharded locks to minimize contention during concurrent reads.
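
To make the term "sharded locks" concrete, here is a minimal sketch of the general technique. It is not Hexz's cache implementation: ShardedCache, the shard count of 8, and the modulo shard selection are all assumptions made for illustration. The idea is that each shard has its own lock, so threads touching different shards do not block each other.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical sharded cache (not a Hexz type): one lock per shard.
pub struct ShardedCache {
    shards: Vec<Mutex<HashMap<u64, Vec<u8>>>>,
}

impl ShardedCache {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        Self {
            shards: (0..shard_count)
                .map(|_| Mutex::new(HashMap::new()))
                .collect(),
        }
    }

    /// Picks the shard for a block index. Plain modulo is an assumption
    /// of this sketch; a real cache might hash the key instead.
    fn shard(&self, block_idx: u64) -> &Mutex<HashMap<u64, Vec<u8>>> {
        &self.shards[(block_idx % self.shards.len() as u64) as usize]
    }

    pub fn insert(&self, block_idx: u64, data: Vec<u8>) {
        self.shard(block_idx).lock().unwrap().insert(block_idx, data);
    }

    pub fn get(&self, block_idx: u64) -> Option<Vec<u8>> {
        self.shard(block_idx).lock().unwrap().get(&block_idx).cloned()
    }
}

/// Fills one shared cache from several threads and returns how many
/// entries can be read back afterwards.
pub fn concurrent_fill(threads: u64, per_thread: u64) -> usize {
    let cache = Arc::new(ShardedCache::new(8));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                for i in 0..per_thread {
                    let idx = t * per_thread + i;
                    cache.insert(idx, vec![idx as u8]);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    (0..threads * per_thread)
        .filter(|idx| cache.get(*idx).is_some())
        .count()
}
```

Only the lock of the shard that owns a key is held during insert or get, which is why contention drops as the shard count grows.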

Fields§

§header: Header

Archive metadata (sizes, compression, encryption settings)

§metadata: Option<Vec<u8>>

Decoded metadata bytes from the metadata section

Implementations§

impl Archive

pub fn open( backend: Arc<dyn StorageBackend>, encryptor: Option<Box<dyn Encryptor>>, ) -> Result<Arc<Self>>

Opens a Hexz archive with default cache settings.

This is the primary constructor for Archive. It:

  1. Reads and validates the archive header (magic bytes, version)
  2. Deserializes the master index
  3. Recursively loads parent archives (for thin archives)
  4. Initializes block and page caches
§Parameters
  • backend: Implementation of StorageBackend (Local file, S3, etc.)
  • encryptor: Optional decryptor (required if the archive is encrypted)
§Errors
  • Error::Io: Backend I/O failure or file not found.
  • Error::Format: Invalid magic bytes or corrupted header.
  • Error::Encryption: Missing or incorrect encryption key.
§Example
let backend = Arc::new(FileBackend::new("data.hxz".as_ref())?);
let archive = Archive::open(backend, None)?;

println!("Main size: {} bytes", archive.size(ArchiveStream::Main));
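
The header validation in step 1 can be pictured with a small sketch. Everything concrete in it is an assumption: the magic value "DEMO", the 8-byte layout (4 magic bytes followed by a little-endian u32 version), the supported version number, and the HeaderError type are invented for illustration and do not describe the real Hexz header.

```rust
/// Made-up constants for this sketch; the real Hexz values are not shown here.
const MAGIC: [u8; 4] = *b"DEMO";
const SUPPORTED_VERSION: u32 = 1;

/// Hypothetical error type, loosely mirroring a format error.
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u32),
}

/// Checks the magic bytes and version of a header prefix and returns
/// the version on success.
pub fn parse_version(bytes: &[u8]) -> Result<u32, HeaderError> {
    if bytes.len() < 8 {
        return Err(HeaderError::TooShort);
    }
    if !bytes.starts_with(&MAGIC) {
        return Err(HeaderError::BadMagic);
    }
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    if version != SUPPORTED_VERSION {
        return Err(HeaderError::UnsupportedVersion(version));
    }
    Ok(version)
}
```

Rejecting bad input before any index is deserialized keeps a corrupted or foreign file from being interpreted as archive data.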

pub fn open_with_cache( backend: Arc<dyn StorageBackend>, encryptor: Option<Box<dyn Encryptor>>, cache_capacity_bytes: Option<usize>, prefetch_window_size: Option<u32>, ) -> Result<Arc<Self>>

Like open, but with a custom cache capacity and prefetch window size.

pub fn new( backend: Arc<dyn StorageBackend>, compressor: Box<dyn Compressor>, encryptor: Option<Box<dyn Encryptor>>, ) -> Result<Arc<Self>>

Constructs an Archive from an explicitly supplied backend and compressor.

Unlike open, this constructor takes the compressor as an argument. hexz-store uses it to supply a configured compressor and backend.

pub fn with_cache( backend: Arc<dyn StorageBackend>, compressor: Box<dyn Compressor>, encryptor: Option<Box<dyn Encryptor>>, cache_capacity_bytes: Option<usize>, prefetch_window_size: Option<u32>, ) -> Result<Arc<Self>>

Like new, but with a custom cache capacity and prefetch window size.

pub fn with_cache_and_loader( backend: Arc<dyn StorageBackend>, compressor: Box<dyn Compressor>, encryptor: Option<Box<dyn Encryptor>>, cache_capacity_bytes: Option<usize>, prefetch_window_size: Option<u32>, parent_loader: Option<&ParentLoader>, ) -> Result<Arc<Self>>

Like with_cache but accepts an optional parent loader.

The parent_loader resolves the parent archives of a thin archive. If an archive declares parents but no loader is provided, reads of blocks that refer to a parent return zeros.
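
The fallback behaviour described above can be modelled in a few lines. This is a sketch under stated assumptions, not the Hexz implementation: ThinArchive and BlockSource are hypothetical in-memory stand-ins, and an absent parent plays the role of "no loader was provided".

```rust
/// Hypothetical stand-in: where the data of one block lives.
pub enum BlockSource {
    /// Data stored in this archive.
    Local(Vec<u8>),
    /// Data stored in the parent archive at the given block index.
    Parent(usize),
}

/// Hypothetical in-memory model of a thin archive.
pub struct ThinArchive {
    pub block_size: usize,
    pub blocks: Vec<BlockSource>,
    /// None models an archive opened without a parent loader.
    pub parent: Option<Box<ThinArchive>>,
}

impl ThinArchive {
    /// Returns the block data, following parent references when a parent
    /// is available and returning zeros when it is not.
    pub fn read_block(&self, idx: usize) -> Vec<u8> {
        match &self.blocks[idx] {
            BlockSource::Local(data) => data.clone(),
            BlockSource::Parent(parent_idx) => match &self.parent {
                Some(parent) => parent.read_block(*parent_idx),
                None => vec![0; self.block_size],
            },
        }
    }
}
```

Because read_block recurses into the parent, a chain of thin archives resolves naturally. Note that the zero-filled result is indistinguishable from a genuinely all-zero block, so callers that need parent data should make sure a loader is supplied.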

pub const fn size(&self, stream: ArchiveStream) -> u64

Returns the logical size of a stream in bytes.

§Parameters
  • stream: The stream to query (Main or Auxiliary)
§Returns

The uncompressed, logical size of the stream. This is the size you would get if you decompressed all blocks and concatenated them.

§Examples
use hexz_core::{Archive, ArchiveStream};
let disk_bytes = archive.size(ArchiveStream::Main);
let mem_bytes = archive.size(ArchiveStream::Auxiliary);

println!("Main: {} GiB", disk_bytes / (1024 * 1024 * 1024));
println!("Auxiliary: {} MiB", mem_bytes / (1024 * 1024));

pub fn prefetch_spawn_count(&self) -> u64

Returns the total number of prefetch operations spawned since this archive was opened. Returns 0 if prefetching is disabled.

pub fn read_block( &self, stream: ArchiveStream, block_idx: u64, info: &BlockInfo, ) -> Result<Bytes>

Reads a single block from this archive.

pub fn get_block_by_hash( &self, hash: &[u8; 32], ) -> Result<Option<(ArchiveStream, u64, BlockInfo)>>

Finds a block in this archive by its hash.

pub fn iter_block_hashes(&self, stream: ArchiveStream) -> Result<Vec<[u8; 32]>>

Returns the hashes of all non-sparse blocks in the given stream.

Used by hexz-ops to build a ParentIndex for cross-file deduplication without requiring access to private fields.
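
As a rough illustration of how a list of block hashes can drive cross-file deduplication, the sketch below builds a lookup table from a parent's hashes and counts which candidate blocks would still need to be stored. HashIndex and count_new_blocks are hypothetical names; the actual ParentIndex in hexz-ops may be structured quite differently.

```rust
use std::collections::HashMap;

pub type BlockHash = [u8; 32];

/// Hypothetical index over a parent's block hashes (not the real ParentIndex).
pub struct HashIndex {
    by_hash: HashMap<BlockHash, usize>,
}

impl HashIndex {
    /// Builds the index from hashes in block order, keeping the first
    /// position seen for each distinct hash.
    pub fn from_hashes(hashes: &[BlockHash]) -> Self {
        let mut by_hash = HashMap::new();
        for (position, hash) in hashes.iter().enumerate() {
            by_hash.entry(*hash).or_insert(position);
        }
        Self { by_hash }
    }

    /// Returns the position of a block with this hash, if the parent has one.
    pub fn lookup(&self, hash: &BlockHash) -> Option<usize> {
        self.by_hash.get(hash).copied()
    }
}

/// Counts the candidate blocks that are absent from the parent and would
/// therefore have to be written to the new archive.
pub fn count_new_blocks(index: &HashIndex, candidates: &[BlockHash]) -> usize {
    candidates
        .iter()
        .filter(|hash| index.lookup(hash).is_none())
        .count()
}
```

In this model the Vec returned by iter_block_hashes would be the input to from_hashes, and every lookup hit corresponds to a block that can reference the parent instead of being stored again.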

pub fn get_block_info( &self, stream: ArchiveStream, offset: u64, ) -> Result<Option<(u64, BlockInfo)>>

Returns the block metadata for a given logical offset.

pub fn read_at( self: &Arc<Self>, stream: ArchiveStream, offset: u64, len: usize, ) -> Result<Vec<u8>>

Reads data from an archive stream at a given offset.

This is the main read method for random access. It:

  1. Identifies which blocks overlap the requested range
  2. Fetches blocks from cache or decompresses from storage
  3. Handles thin archive fallback to parent
  4. Assembles the final buffer from block slices
§Parameters
  • stream: Which stream to read from (Main or Auxiliary)
  • offset: Logical byte offset in the stream
  • len: Number of bytes to read
§Returns

A Vec<u8> containing the requested data. If the request extends beyond the end of the stream, it is truncated. If it starts beyond the end, an empty vector is returned.

§Example
// Read first 512 bytes of main stream
let data = archive.read_at(ArchiveStream::Main, 0, 512)?;
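
The block-overlap and truncation rules above can be sketched against an in-memory model. This is illustrative only: BlockStream is a hypothetical stand-in, fixed-size blocks are an assumption of the sketch, and caching, decompression, and parent fallback are left out.

```rust
/// Hypothetical in-memory stream split into fixed-size blocks
/// (the last block may be shorter).
pub struct BlockStream {
    pub block_size: usize,
    pub blocks: Vec<Vec<u8>>,
}

impl BlockStream {
    pub fn from_bytes(data: &[u8], block_size: usize) -> Self {
        assert!(block_size > 0, "block size must be non-zero");
        Self {
            block_size,
            blocks: data.chunks(block_size).map(|c| c.to_vec()).collect(),
        }
    }

    /// Logical size: the length of all blocks concatenated.
    pub fn size(&self) -> u64 {
        self.blocks.iter().map(|b| b.len() as u64).sum()
    }

    /// Reads `len` bytes at `offset`, truncating at the end of the stream
    /// and returning an empty vector when `offset` is past the end.
    pub fn read_at(&self, offset: u64, len: usize) -> Vec<u8> {
        let size = self.size();
        if offset >= size || len == 0 {
            return Vec::new();
        }
        let end = size.min(offset.saturating_add(len as u64));
        let bs = self.block_size as u64;
        let first = (offset / bs) as usize;
        let last = ((end - 1) / bs) as usize; // inclusive
        let mut out = Vec::with_capacity((end - offset) as usize);
        for idx in first..=last {
            let block_start = idx as u64 * bs;
            // Clamp the requested range to this block, then make it block-relative.
            let from = offset.max(block_start) - block_start;
            let to = end.min(block_start + bs) - block_start;
            out.extend_from_slice(&self.blocks[idx][from as usize..to as usize]);
        }
        out
    }
}
```

With a 10-byte stream and 4-byte blocks, a read of 5 bytes at offset 2 touches blocks 0 and 1, and a read of 100 bytes at offset 8 returns only the final 2 bytes.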

pub fn read_at_into( self: &Arc<Self>, stream: ArchiveStream, offset: u64, buffer: &mut [u8], ) -> Result<()>

Reads into a caller-provided buffer; the buffer length determines how many bytes are requested. Any unused suffix of the buffer is zero-filled. Uses parallel decompression when the read spans multiple blocks.

pub fn read_at_into_uninit( self: &Arc<Self>, stream: ArchiveStream, offset: u64, buffer: &mut [MaybeUninit<u8>], ) -> Result<()>

Like read_at_into, but writes into uninitialized memory. Any unused suffix of the buffer is zero-filled. Uses parallel decompression when the read spans multiple blocks.

On error: The buffer contents are undefined (possibly partially written).
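
A caller-side pattern for an API of this shape is to hand over a Vec's spare capacity and set the length only after the call succeeds. The sketch below uses a hypothetical fill_uninit function as a stand-in for the archive read (it copies from a slice and zero-fills the rest); only the std calls (spare_capacity_mut, MaybeUninit::write, set_len) are real APIs.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a reader that initializes all of `buf`:
/// copies from `src`, then zero-fills the unused suffix.
pub fn fill_uninit(src: &[u8], buf: &mut [MaybeUninit<u8>]) {
    let n = src.len().min(buf.len());
    for (slot, byte) in buf[..n].iter_mut().zip(&src[..n]) {
        slot.write(*byte);
    }
    for slot in &mut buf[n..] {
        slot.write(0);
    }
}

/// Reads `len` bytes into a fresh Vec without zeroing it first.
pub fn read_to_vec(src: &[u8], len: usize) -> Vec<u8> {
    let mut out: Vec<u8> = Vec::with_capacity(len);
    fill_uninit(src, &mut out.spare_capacity_mut()[..len]);
    // SAFETY: fill_uninit wrote each of the first `len` elements, and
    // `len` does not exceed the capacity reserved above.
    unsafe { out.set_len(len) };
    out
}
```

With a fallible reader, set_len must be reached only on the success path, since after an error the buffer contents are undefined.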

pub fn read_at_into_uninit_bytes( self: &Arc<Self>, stream: ArchiveStream, offset: u64, buf: &mut [u8], ) -> Result<()>

Like read_at_into_uninit, but accepts &mut [u8]. Intended for callers across FFI (e.g. Python).

Trait Implementations§

impl Debug for Archive

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.