MmapReader

Struct MmapReader 

Source
pub struct MmapReader { /* private fields */ }

A memory-mapped reader for binary sequence files

This reader provides efficient access to binary sequence files by memory-mapping them instead of performing traditional I/O operations. It supports both sequential access to individual records and parallel processing of records across multiple threads.

The reader is thread-safe: the memory-mapped data is held in an Arc, so it can be shared between threads without copying.
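The Arc-based sharing described here can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the crate's implementation: a `Vec<u8>` stands in for the memory-mapped region, and `sum_in_parallel` is a hypothetical helper that exists only in this example.

```rust
use std::sync::Arc;
use std::thread;

/// Sums the bytes of a shared, read-only buffer using `num_threads` threads.
/// `data` stands in for the memory-mapped file contents (an assumption for
/// illustration; a real reader would map a file instead of owning a Vec).
fn sum_in_parallel(data: Arc<Vec<u8>>, num_threads: usize) -> u64 {
    let num_threads = num_threads.max(1);
    // Ceiling division so that every byte falls into some chunk.
    let chunk = (data.len() + num_threads - 1) / num_threads;
    let mut handles = Vec::with_capacity(num_threads);
    for tid in 0..num_threads {
        // Cloning the Arc copies a pointer, not the underlying bytes.
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let start = (tid * chunk).min(data.len());
            let end = (start + chunk).min(data.len());
            data[start..end].iter().map(|&b| u64::from(b)).sum::<u64>()
        }));
    }
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}
```

Each worker holds its own `Arc` handle to the same read-only bytes, so no locking is needed for reads.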

Records are returned as RefRecord values, which implement the BinseqRecord trait.

§Examples

use binseq::bq::MmapReader;
use binseq::Result;

fn main() -> Result<()> {
    let path = "./data/subset.bq";
    let reader = MmapReader::new(path)?;

    // Calculate the number of records in the file
    let num_records = reader.num_records();
    println!("Number of records: {}", num_records);

    // Get the record at index 20 (0-indexed)
    let record = reader.get(20)?;

    Ok(())
}

Implementations§

Source§

impl MmapReader

Source

pub fn new<P>(path: P) -> Result<MmapReader, Error>
where P: AsRef<Path>,

Creates a new memory-mapped reader for a binary sequence file

This method opens the file, memory-maps its contents, and validates the file structure to ensure it contains valid binary sequence data.

§Arguments
  • path - Path to the binary sequence file
§Returns
  • Ok(MmapReader) - A new reader if the file is valid
  • Err(Error) - If the file is invalid or cannot be opened
§Errors

Returns an error if:

  • The file cannot be opened
  • The file is not a regular file
  • The file header is invalid
  • The file size doesn’t match the expected size based on the header
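The checks listed above might look roughly like the following sketch. It covers only the open, regular-file, and size checks; header parsing is omitted because the header layout is not described here. `OpenError`, `check_file`, `header_size`, and `record_size` are hypothetical names for this example, not part of the binseq API.

```rust
use std::fs;
use std::path::Path;

/// Illustrative error type mirroring some of the failure cases listed above.
#[derive(Debug, PartialEq)]
enum OpenError {
    CannotOpen,
    NotRegularFile,
    SizeMismatch,
}

/// Checks that `path` is a regular file whose size is a header followed by
/// a whole number of fixed-size records. Both sizes are in bytes and are
/// supplied by the caller (assumed values, not read from a real header).
fn check_file(path: &Path, header_size: u64, record_size: u64) -> Result<(), OpenError> {
    let meta = fs::metadata(path).map_err(|_| OpenError::CannotOpen)?;
    if !meta.is_file() {
        return Err(OpenError::NotRegularFile);
    }
    let len = meta.len();
    if record_size == 0 || len < header_size || (len - header_size) % record_size != 0 {
        return Err(OpenError::SizeMismatch);
    }
    Ok(())
}
```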
Source

pub fn num_records(&self) -> usize

Returns the total number of records in the file

This is calculated by subtracting the header size from the total file size and dividing by the size of each record.
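As a worked illustration of that arithmetic, the sketch below computes a record count from three byte sizes. The function name `count_records` and the concrete sizes in the usage note are assumptions made for this example; the actual header and record sizes depend on the file format and header contents.

```rust
/// Number of fixed-size records that follow a fixed-size header.
/// All sizes are in bytes. Returns 0 if the file is shorter than the header
/// or if `record_size` is zero.
fn count_records(file_size: usize, header_size: usize, record_size: usize) -> usize {
    if record_size == 0 {
        return 0;
    }
    file_size.saturating_sub(header_size) / record_size
}
```

For example, with a hypothetical 32-byte header and 24-byte records, a 2432-byte file holds (2432 - 32) / 24 = 100 records.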

Source

pub fn header(&self) -> BinseqHeader

Returns a copy of the binary sequence file header

The header contains format information and sequence length specifications.

Source

pub fn is_paired(&self) -> bool

Checks whether the file contains paired records

Source

pub fn get(&self, idx: usize) -> Result<RefRecord<'_>, Error>

Returns a reference to a specific record

§Arguments
  • idx - The index of the record to retrieve (0-based)
§Returns
  • Ok(RefRecord) - A reference to the requested record
  • Err(Error) - If the index is out of bounds
§Errors

Returns an error if the requested index is out of bounds, that is, not less than the number of records in the file.
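A bounds-checked lookup over fixed-size records can be sketched as follows. This is an illustration only: `get_record` is a hypothetical function that works on a plain byte slice and returns raw bytes, whereas the real method returns a RefRecord.

```rust
/// Returns the bytes of record `idx` from a buffer of fixed-size records,
/// or an error message if `idx` is out of bounds.
fn get_record(buf: &[u8], record_size: usize, idx: usize) -> Result<&[u8], String> {
    if record_size == 0 {
        return Err("record size must be non-zero".to_string());
    }
    let num_records = buf.len() / record_size;
    // Valid indices are 0..num_records, so idx == num_records is rejected.
    if idx >= num_records {
        return Err(format!("index {idx} out of bounds for {num_records} records"));
    }
    let start = idx * record_size;
    Ok(&buf[start..start + record_size])
}
```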

Source

pub fn get_buffer_slice(&self, range: Range<usize>) -> Result<&[u64], Error>

Returns a slice of the underlying buffer containing the u64 values for the given range of records.

Note: a range of 10..40 returns all u64 values in the memory map between record index 10 and record index 40.
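One way to picture the mapping from record indices to u64 values is the sketch below. It assumes that every record occupies the same number of consecutive u64 words and that the range end is exclusive, following the usual `Range` convention; neither assumption is confirmed by this page. `words_for_records` and `words_per_record` are hypothetical names.

```rust
use std::ops::Range;

/// Maps a range of record indices to the matching slice of u64 words,
/// assuming each record occupies `words_per_record` consecutive words.
/// Returns None if the range falls outside the buffer.
fn words_for_records(
    words: &[u64],
    words_per_record: usize,
    range: Range<usize>,
) -> Option<&[u64]> {
    let start = range.start.checked_mul(words_per_record)?;
    let end = range.end.checked_mul(words_per_record)?;
    // `get` returns None for an inverted or out-of-bounds word range.
    words.get(start..end)
}
```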

Trait Implementations§

Source§

impl ParallelReader for MmapReader

Parallel processing implementation for memory-mapped readers

Source§

fn process_parallel<P>(self, processor: P, num_threads: usize) -> Result<(), Error>
where P: ParallelProcessor + Clone + 'static,

Processes all records in parallel using multiple threads

This method distributes the records across the specified number of threads and processes them using the provided processor. Each thread receives its own clone of the processor and processes a contiguous chunk of records.

§Arguments
  • processor - The processor to use for handling records
  • num_threads - The number of threads to use for processing
§Type Parameters
  • P - A type that implements ParallelProcessor and can be cloned
§Returns
  • Ok(()) - If all records were processed successfully
  • Err(Error) - If an error occurred during processing
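The clone-per-thread, contiguous-chunk strategy described for process_parallel can be sketched with standard threads. Everything named here is hypothetical: `Processor` is a simplified stand-in for the crate's ParallelProcessor trait (whose methods are not listed on this page), `Counter` is a toy processor, and `run_chunked` is not a binseq function. The explicit `Send` bound is also an addition needed by this sketch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a per-record processor; not the crate's trait.
trait Processor {
    fn process(&mut self, record_idx: usize);
}

/// Counts processed records; all clones share one total through an Arc.
#[derive(Clone, Default)]
struct Counter {
    total: Arc<AtomicUsize>,
}

impl Processor for Counter {
    fn process(&mut self, _record_idx: usize) {
        self.total.fetch_add(1, Ordering::Relaxed);
    }
}

/// Gives each thread its own clone of `processor` and one contiguous chunk
/// of the record indices `0..num_records`.
fn run_chunked<P>(processor: P, num_records: usize, num_threads: usize)
where
    P: Processor + Clone + Send + 'static,
{
    let num_threads = num_threads.max(1);
    let chunk = (num_records + num_threads - 1) / num_threads;
    let handles: Vec<_> = (0..num_threads)
        .map(|tid| {
            let mut local = processor.clone();
            thread::spawn(move || {
                let start = (tid * chunk).min(num_records);
                let end = (start + chunk).min(num_records);
                for idx in start..end {
                    local.process(idx);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}
```

Because each clone of `Counter` shares the same atomic total, the caller can read the combined count from its own copy once all threads have been joined.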
Source§

fn process_parallel_range<P>(self, processor: P, num_threads: usize, range: Range<usize>) -> Result<(), Error>
where P: ParallelProcessor + Clone + 'static,

Processes records in parallel within a specified range

This method allows parallel processing of a subset of records within the file, defined by a start and end index. The range is distributed across the specified number of threads.

§Arguments
  • processor - The processor to use for each record
  • num_threads - The number of threads to spawn
  • range - The range of record indices to process
§Type Parameters
  • P - A type that implements ParallelProcessor and can be cloned
§Returns
  • Ok(()) - If all records were processed successfully
  • Err(Error) - If an error occurred during processing
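To illustrate how a range of record indices could be divided among threads, the sketch below splits a range into contiguous sub-ranges of nearly equal length. The balancing scheme is an assumption chosen for this example, and `split_range` is a hypothetical helper; the crate may partition the range differently.

```rust
use std::ops::Range;

/// Splits `range` into at most `num_threads` contiguous, non-overlapping
/// sub-ranges whose lengths differ by at most one. An empty input range
/// yields a single empty sub-range.
fn split_range(range: Range<usize>, num_threads: usize) -> Vec<Range<usize>> {
    let len = range.end.saturating_sub(range.start);
    // Never create more parts than there are records to hand out.
    let parts = num_threads.max(1).min(len.max(1));
    let base = len / parts;
    let extra = len % parts;
    let mut out = Vec::with_capacity(parts);
    let mut start = range.start;
    for i in 0..parts {
        // The first `extra` parts take one additional record each.
        let size = base + usize::from(i < extra);
        out.push(start..start + size);
        start += size;
    }
    out
}
```

With the range 10..40 and 4 threads, this yields 10..18, 18..26, 26..33, and 33..40.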

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V