Struct File 

pub struct File {
    pub header: Header,
    /* private fields */
}

Read-only interface for accessing Hexz snapshot data.

File is the primary API for reading compressed, block-indexed snapshots. It handles:

  • Block-level decompression with LRU caching
  • Optional AES-256-GCM decryption
  • Thin snapshot parent chaining
  • Dual-stream access (disk and memory)
  • Random access with minimal I/O

§Thread Safety

File is Send + Sync and can be safely shared across threads via Arc. Internal caches use Mutex for synchronization.
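The sharing pattern described above can be sketched with a stand-in type. `SharedReader`, its fields, and `read_from_threads` are hypothetical names used only for illustration; this is not how File is implemented internally, just a minimal example of an Arc-shared reader whose cache sits behind a Mutex.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a reader with a Mutex-guarded block cache.
struct SharedReader {
    data: Vec<u8>,
    block_size: usize,
    cache: Mutex<HashMap<usize, Vec<u8>>>,
}

impl SharedReader {
    fn new(data: Vec<u8>, block_size: usize) -> Arc<Self> {
        Arc::new(Self {
            data,
            block_size,
            cache: Mutex::new(HashMap::new()),
        })
    }

    /// Returns a copy of block `idx`, populating the cache on a miss.
    fn block(&self, idx: usize) -> Vec<u8> {
        let mut cache = self.cache.lock().unwrap();
        cache
            .entry(idx)
            .or_insert_with(|| {
                let start = (idx * self.block_size).min(self.data.len());
                let end = (start + self.block_size).min(self.data.len());
                self.data[start..end].to_vec()
            })
            .clone()
    }
}

/// Reads every block from `workers` threads and returns the total bytes seen.
fn read_from_threads(reader: &Arc<SharedReader>, workers: usize) -> usize {
    let blocks = reader.data.len().div_ceil(reader.block_size);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own Arc handle to the same reader.
            let reader = Arc::clone(reader);
            thread::spawn(move || (0..blocks).map(|i| reader.block(i).len()).sum::<usize>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Every thread holds a cheap Arc clone, and the lock is held only while a single block is looked up or inserted.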

§Performance

  • Cache hit latency: ~80μs (warm cache)
  • Cache miss latency: ~1ms (cold cache, local storage)
  • Sequential throughput: ~2-3 GB/s (NVMe + LZ4)
  • Memory overhead: ~150MB typical (configurable)

§Examples

§Basic Usage

use hexz_core::{File, SnapshotStream};
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;

let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());
let snapshot = File::new(backend, compressor, None)?;

// Read 4KB at offset 1MB
let data = snapshot.read_at(SnapshotStream::Disk, 1024 * 1024, 4096)?;
assert_eq!(data.len(), 4096);

§Thin Snapshots (with parent)

use hexz_core::File;
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;

// Open base snapshot
let base_backend = Arc::new(FileBackend::new("base.hxz".as_ref())?);
let base = File::new(
    base_backend,
    Box::new(Lz4Compressor::new()),
    None
)?;

// The thin snapshot will automatically load its parent based on
// the parent_path field in the header
let thin_backend = Arc::new(FileBackend::new("incremental.hxz".as_ref())?);
let thin = File::new(
    thin_backend,
    Box::new(Lz4Compressor::new()),
    None
)?;

// Reads automatically fall back to base for unchanged blocks
let data = thin.read_at(hexz_core::SnapshotStream::Disk, 0, 4096)?;
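
The fallback behaviour can be pictured as a lookup that walks a parent chain. The sketch below is an assumption-laden model, not the crate's implementation: `Layer`, `blocks`, `parent`, and `resolve` are hypothetical names, and real snapshots resolve blocks through their index rather than a HashMap.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory layer: stores only the blocks it changed.
struct Layer {
    blocks: HashMap<u64, Vec<u8>>,
    parent: Option<Box<Layer>>,
}

impl Layer {
    /// Looks up a block locally, then walks the parent chain.
    /// Returns None if no layer holds the block (a sparse region).
    fn resolve(&self, idx: u64) -> Option<&[u8]> {
        match self.blocks.get(&idx) {
            Some(block) => Some(block.as_slice()),
            None => self.parent.as_ref().and_then(|p| p.resolve(idx)),
        }
    }
}
```

A thin layer that overrides block 1 returns its own copy of that block and the parent's copy of every other block.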

Fields§

§header: Header

Snapshot metadata (sizes, compression, encryption settings)

Implementations§

impl File

pub fn open( backend: Arc<dyn StorageBackend>, encryptor: Option<Box<dyn Encryptor>>, ) -> Result<Arc<Self>>

Opens a snapshot, auto-detecting compression and dictionary from the header.

This eliminates the three-step boilerplate of reading the header, loading the dictionary, and creating the compressor. Equivalent to File::new(backend, auto_compressor, encryptor).

pub fn open_with_cache( backend: Arc<dyn StorageBackend>, encryptor: Option<Box<dyn Encryptor>>, cache_capacity_bytes: Option<usize>, prefetch_window_size: Option<u32>, ) -> Result<Arc<Self>>

Like open but with custom cache and prefetch settings.

pub fn new( backend: Arc<dyn StorageBackend>, compressor: Box<dyn Compressor>, encryptor: Option<Box<dyn Encryptor>>, ) -> Result<Arc<Self>>

Opens a Hexz snapshot with default cache settings.

This is the primary constructor for File. It:

  1. Reads and validates the snapshot header (magic bytes, version)
  2. Deserializes the master index
  3. Recursively loads parent snapshots (for thin snapshots)
  4. Initializes block and page caches
§Parameters
  • backend: Storage backend (local file, HTTP, S3, etc.)
  • compressor: Compression algorithm matching the snapshot format
  • encryptor: Optional decryption handler (pass None for unencrypted snapshots)
§Returns
  • Ok(Arc<File>) on success
  • Err(Error::Format) if magic bytes or version are invalid
  • Err(Error::Io) if storage backend fails
§Examples
use hexz_core::{File, SnapshotStream};
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;

let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());
let snapshot = File::new(backend, compressor, None)?;

println!("Disk size: {} bytes", snapshot.size(SnapshotStream::Disk));

pub fn with_cache( backend: Arc<dyn StorageBackend>, compressor: Box<dyn Compressor>, encryptor: Option<Box<dyn Encryptor>>, cache_capacity_bytes: Option<usize>, prefetch_window_size: Option<u32>, ) -> Result<Arc<Self>>

Opens a Hexz snapshot with custom cache capacity and prefetching.

Identical to new but allows specifying cache size and prefetch window.

§Parameters
  • backend: Storage backend
  • compressor: Compression algorithm
  • encryptor: Optional decryption handler
  • cache_capacity_bytes: Block cache size in bytes (default: ~400MB for 4KB blocks)
  • prefetch_window_size: Number of blocks to prefetch ahead (default: disabled)
§Cache Sizing

The cache stores decompressed blocks. Given a block size of 4KB:

  • Some(100_000_000) → ~24,000 blocks (~96MB effective)
  • None → 1000 blocks (~4MB effective)

Larger caches reduce repeated decompression but increase memory usage.
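
The relationship between a byte budget and a block count is plain integer division. The helper below is a hypothetical illustration (`blocks_in_cache` is not part of the crate) and ignores any per-entry bookkeeping overhead the real cache may have, which is why its result is slightly above the rounded figures quoted above.

```rust
/// Number of whole blocks that fit in a cache of `capacity_bytes`,
/// ignoring per-entry overhead (an assumption of this sketch).
fn blocks_in_cache(capacity_bytes: usize, block_size: usize) -> usize {
    assert!(block_size > 0, "block size must be non-zero");
    capacity_bytes / block_size
}

/// Bytes of decompressed data those whole blocks can actually hold.
fn effective_bytes(capacity_bytes: usize, block_size: usize) -> usize {
    blocks_in_cache(capacity_bytes, block_size) * block_size
}
```

With 4096-byte blocks, a 100_000_000-byte budget holds 24,414 whole blocks, and a 256 MiB budget holds 65,536.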

§Prefetching

When prefetch_window_size is set, the system will automatically fetch the next N blocks in the background after each read, optimizing sequential access patterns:

  • Some(4) → Prefetch 4 blocks ahead
  • None or Some(0) → Disable prefetching
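
As a rough model of the window semantics above, the block indices to prefetch after a read can be computed as follows. `prefetch_range` and `total_blocks` are hypothetical names; the crate's scheduler may choose blocks differently.

```rust
use std::ops::Range;

/// Block indices to fetch in the background after reading `current_block`.
/// `None` and `Some(0)` both yield an empty range (prefetching disabled).
fn prefetch_range(current_block: u64, window: Option<u32>, total_blocks: u64) -> Range<u64> {
    let n = u64::from(window.unwrap_or(0));
    // Start just past the block that was read, never beyond the stream end.
    let start = current_block.saturating_add(1).min(total_blocks);
    let end = start.saturating_add(n).min(total_blocks);
    start..end
}
```

Near the end of a stream the range is truncated, so a window of 4 at block 98 of 100 covers only block 99.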
§Examples
use hexz_core::File;
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;

let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());

// Allocate 256MB for cache, prefetch 4 blocks ahead
let snapshot = File::with_cache(
    backend,
    compressor,
    None,
    Some(256 * 1024 * 1024),
    Some(4)
)?;

pub fn prefetch_spawn_count(&self) -> u64

Returns the total number of prefetch operations spawned since this file was opened. Returns 0 if prefetching is disabled.

pub fn size(&self, stream: SnapshotStream) -> u64

Returns the logical size of a stream in bytes.

§Parameters
  • stream: The stream to query (Disk or Memory)
§Returns

The uncompressed, logical size of the stream. This is the size you would get if you decompressed all blocks and concatenated them.

§Examples
use hexz_core::{File, SnapshotStream};
let disk_bytes = snapshot.size(SnapshotStream::Disk);
let mem_bytes = snapshot.size(SnapshotStream::Memory);

println!("Disk: {} GB", disk_bytes / (1024 * 1024 * 1024));
println!("Memory: {} MB", mem_bytes / (1024 * 1024));

pub fn read_at( self: &Arc<Self>, stream: SnapshotStream, offset: u64, len: usize, ) -> Result<Vec<u8>>

Reads data from a snapshot stream at a given offset.

This is the primary read method for random access. It:

  1. Identifies which blocks overlap the requested range
  2. Fetches blocks from cache or decompresses from storage
  3. Handles thin snapshot fallback to parent
  4. Assembles the final buffer from block slices
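
Steps 1 and 4 come down to offset arithmetic. The sketch below assumes fixed-size blocks and uses a hypothetical helper, `block_slices`, to show how a byte range maps to per-block slices. It does not touch caching, decompression, or parent fallback.

```rust
/// For a read of `len` bytes at `offset`, returns one entry per touched block:
/// (block index, start offset within block, end offset within block).
fn block_slices(offset: u64, len: u64, block_size: u64) -> Vec<(u64, u64, u64)> {
    assert!(block_size > 0, "block size must be non-zero");
    let end = offset.saturating_add(len);
    let mut out = Vec::new();
    let mut pos = offset;
    while pos < end {
        let idx = pos / block_size;
        let start_in_block = pos % block_size;
        // Take up to the end of this block or the end of the request.
        let take = (block_size - start_in_block).min(end - pos);
        out.push((idx, start_in_block, start_in_block + take));
        pos += take;
    }
    out
}
```

A 5000-byte read at offset 4000 with 4096-byte blocks touches three blocks, whereas a block-aligned 4096-byte read touches exactly one.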
§Parameters
  • stream: Which stream to read from (Disk or Memory)
  • offset: Starting byte offset (0-indexed)
  • len: Number of bytes to read
§Returns

A Vec<u8> containing up to len bytes. The returned vector may be shorter if:

  • offset is beyond the stream size (returns empty vector)
  • offset + len exceeds stream size (returns partial data)

Missing data (sparse regions) is zero-filled.
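
The length rules can be modelled against an in-memory byte slice. `read_clamped` is a hypothetical helper that mirrors only the truncation behaviour described above; it does not model zero-filling of sparse regions.

```rust
/// Mimics the documented length rules against an in-memory stream.
fn read_clamped(stream: &[u8], offset: u64, len: usize) -> Vec<u8> {
    let size = stream.len() as u64;
    if offset >= size {
        // Offset at or beyond the end of the stream: empty result.
        return Vec::new();
    }
    // Safe cast: offset < size, and size fits in usize.
    let start = offset as usize;
    let available = stream.len() - start;
    stream[start..start + len.min(available)].to_vec()
}
```

Callers that need exactly `len` bytes should therefore check the length of the returned vector rather than assume it.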

§Errors
  • Error::Io if backend read fails (e.g. truncated file)
  • Error::Corruption(block_idx) if block checksum does not match
  • Error::Decompression if block decompression fails
  • Error::Decryption if block decryption fails
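
A caller will typically match on these variants to decide how to report a failure. The enum below is a hypothetical local mirror of the variants listed above (`ReadError` is not the crate's type, and the real payloads may differ); it only illustrates the matching pattern.

```rust
/// Hypothetical mirror of the documented error variants.
#[derive(Debug)]
enum ReadError {
    Io(std::io::Error),
    Corruption(u64),
    Decompression,
    Decryption,
}

/// Produces a short, user-facing message for each failure class.
fn describe(err: &ReadError) -> String {
    match err {
        ReadError::Io(e) => format!("storage read failed: {e}"),
        ReadError::Corruption(block) => format!("checksum mismatch in block {block}"),
        ReadError::Decompression => "block could not be decompressed".to_string(),
        ReadError::Decryption => "block could not be decrypted".to_string(),
    }
}
```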
§Performance
  • Cache hit: ~80μs latency, no I/O
  • Cache miss: ~1ms latency (local storage), includes decompression
  • Remote storage: Latency depends on network (HTTP: ~50ms, S3: ~100ms)

Aligned reads (offset % block_size == 0) are most efficient.

§Examples
use hexz_core::{File, SnapshotStream};
// Read first 512 bytes of disk stream
let boot_sector = snapshot.read_at(SnapshotStream::Disk, 0, 512)?;

// Read from arbitrary offset
let chunk = snapshot.read_at(SnapshotStream::Disk, 1024 * 1024, 4096)?;

// Reading beyond stream size returns empty vector
let empty = snapshot.read_at(SnapshotStream::Disk, u64::MAX, 100)?;
assert!(empty.is_empty());

When the requested range spans multiple blocks, read_at decompresses those blocks in parallel.

pub fn read_at_into( self: &Arc<Self>, stream: SnapshotStream, offset: u64, buffer: &mut [u8], ) -> Result<()>

Reads into a caller-provided buffer. Any part of the buffer past the end of the available data is zero-filled. Uses parallel decompression when the range spans multiple blocks.

pub fn read_at_into_uninit( self: &Arc<Self>, stream: SnapshotStream, offset: u64, buffer: &mut [MaybeUninit<u8>], ) -> Result<()>

Writes into uninitialized memory. Unused suffix is zero-filled. Uses parallel decompression when spanning multiple blocks.

On error: The buffer contents are undefined (possibly partially written).
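
The contract above (every slot is written on success, including a zero-filled suffix) is what makes it sound to treat the buffer as initialized afterwards. The sketch below uses only the standard library; `fill_uninit` and `read_into_fresh_buffer` are hypothetical stand-ins for the real read path, not crate APIs.

```rust
use std::mem::MaybeUninit;

/// Copies `src[offset..]` into `buffer` and zero-fills whatever is left over,
/// so every slot is initialized on return.
fn fill_uninit(src: &[u8], offset: usize, buffer: &mut [MaybeUninit<u8>]) {
    let tail = src.get(offset..).unwrap_or(&[]);
    let n = tail.len().min(buffer.len());
    for (slot, byte) in buffer[..n].iter_mut().zip(tail) {
        slot.write(*byte);
    }
    for slot in &mut buffer[n..] {
        slot.write(0);
    }
}

/// Allocates without zeroing, fills the spare capacity, then exposes it as Vec<u8>.
fn read_into_fresh_buffer(src: &[u8], offset: usize, len: usize) -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::with_capacity(len);
    fill_uninit(src, offset, &mut buf.spare_capacity_mut()[..len]);
    // SAFETY: fill_uninit wrote all `len` slots of the spare capacity.
    unsafe { buf.set_len(len) };
    buf
}
```

If the fill step could fail partway, `set_len` must be skipped on the error path, which matches the warning above that buffer contents are undefined after an error.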

pub fn read_at_into_uninit_bytes( self: &Arc<Self>, stream: SnapshotStream, offset: u64, buf: &mut [u8], ) -> Result<()>

Like read_at_into_uninit but accepts &mut [u8]. Intended for use from FFI (e.g. Python).

Auto Trait Implementations§

impl !Freeze for File

impl !RefUnwindSafe for File

impl Send for File

impl Sync for File

impl Unpin for File

impl UnsafeUnpin for File

impl !UnwindSafe for File

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.