Struct Scanner 

pub struct Scanner { /* private fields */ }
A BAM scanner.

Schema parameters (fields, tag definitions) are declared at construction time. Scan methods accept only column projection, batch size, and limit.

Examples

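use oxbow::alignment::scanner::bam::Scanner;
use oxbow::Select;
use std::fs::File;
use std::io::BufReader;

// Open the BAM file and read its SAM header.
let inner = File::open("sample.bam").map(BufReader::new).unwrap();
let mut fmt_reader = noodles::bam::io::Reader::new(inner);
let header = fmt_reader.read_header().unwrap();

// Discover tag definitions (scan_rows = Some(1000)). This advances the reader,
// so the scan below starts after the records consumed here.
let tag_defs = Scanner::tag_defs(&mut fmt_reader, Some(1000)).unwrap();
let scanner = Scanner::new(header, Select::All, Some(tag_defs)).unwrap();

// columns = None, batch_size = None, limit = Some(1000).
// scan returns a Result, so unwrap it to get the record batch reader.
let batches = scanner.scan(fmt_reader, None, None, Some(1000)).unwrap();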

Implementations

impl Scanner

pub fn new(
    header: Header,
    fields: Select<String>,
    tag_defs: Option<Vec<(String, String)>>,
) -> Result<Self>

Creates a BAM scanner from a SAM header and schema parameters.

  • fields: selects which standard SAM fields to include (for example, Select::All).
  • tag_defs: None omits the tags column; Some(vec![]) yields a tags column that is an empty struct.
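
The distinction between None and Some(vec![]) can be modeled with a small std-only sketch. TagsColumn and tags_column are hypothetical names used only for illustration and are not part of oxbow. The sketch also assumes that the first element of each pair is the tag name, which this page does not state.

```rust
/// Hypothetical description of the tags column in the output schema.
#[derive(Debug, PartialEq)]
enum TagsColumn {
    /// No tags column at all (`tag_defs` is `None`).
    Absent,
    /// A struct column with one child per tag definition (possibly none).
    Struct(Vec<String>),
}

/// Maps a `tag_defs`-style argument onto the shape of the tags column.
/// Assumption: the first element of each pair is the tag name.
fn tags_column(tag_defs: Option<&[(String, String)]>) -> TagsColumn {
    match tag_defs {
        None => TagsColumn::Absent,
        Some(defs) => {
            TagsColumn::Struct(defs.iter().map(|(name, _ty)| name.clone()).collect())
        }
    }
}
```

Under this model, an empty definition list still produces a tags column, just one with no children, which is different from having no column at all.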

pub fn with_model(header: Header, model: AlignmentModel) -> Self

Creates a BAM scanner from an AlignmentModel.

pub fn model(&self) -> &AlignmentModel

Returns a reference to the AlignmentModel.

pub fn header(&self) -> &Header

Returns a reference to the SAM header.

pub fn chrom_names(&self) -> Vec<String>

Returns the reference sequence names.

pub fn chrom_sizes(&self) -> Vec<(String, u32)>

Returns the reference sequence names and lengths.

pub fn field_names(&self) -> Vec<String>

Returns the field names declared in the model.

pub fn schema(&self) -> &Schema

Returns the Arrow schema.

impl Scanner

pub fn tag_defs<R: Read>(
    fmt_reader: &mut Reader<R>,
    scan_rows: Option<usize>,
) -> Result<Vec<(String, String)>>

Discovers tag definitions by scanning over records.

The scan will begin at the current position of the reader and will move the cursor to the end of the last record scanned.
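
As a rough model of discovery and its effect on the cursor, the sketch below collects the distinct (tag, type) pairs seen in the first scan_rows items of an iterator. The Record alias, the function name discover_tag_defs, and the treatment of None as "scan everything" are assumptions made for illustration. The real method reads BAM records and may behave differently.

```rust
use std::collections::BTreeSet;

/// A hypothetical, simplified record: only its (tag, type) pairs.
type Record = Vec<(String, String)>;

/// Collects the distinct (tag, type) pairs from up to `scan_rows` records.
/// Assumption: `None` means "scan until the iterator is exhausted".
fn discover_tag_defs<I>(records: &mut I, scan_rows: Option<usize>) -> Vec<(String, String)>
where
    I: Iterator<Item = Record>,
{
    let mut seen = BTreeSet::new();
    let max_rows = scan_rows.unwrap_or(usize::MAX);
    // `by_ref` borrows the iterator, so it remains usable afterwards,
    // but it has been advanced past every record consumed here.
    for record in records.by_ref().take(max_rows) {
        for def in record {
            seen.insert(def);
        }
    }
    // BTreeSet iteration yields the pairs in sorted order.
    seen.into_iter().collect()
}
```

In this model, a second pass over the same iterator sees only the records that discovery did not consume, which mirrors the cursor behavior described above.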

pub fn scan<R: Read>(
    &self,
    fmt_reader: Reader<R>,
    columns: Option<Vec<String>>,
    batch_size: Option<usize>,
    limit: Option<usize>,
) -> Result<impl RecordBatchReader>

Returns an iterator yielding record batches.

The scan will begin at the current position of the reader and will move the cursor to the end of the last record scanned.
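
One way to picture how batch_size and limit interact is the std-only sketch below, which computes the sizes of the batches that would be produced. It assumes that limit caps the total number of records rather than the number of batches, which this page does not spell out. The function name batch_sizes is hypothetical.

```rust
/// Returns the sizes of the batches produced for `total` available records.
/// Assumption: `limit` caps the number of records, not the number of batches.
fn batch_sizes(total: usize, batch_size: usize, limit: Option<usize>) -> Vec<usize> {
    assert!(batch_size > 0, "batch_size must be positive");
    // Number of records that will actually be emitted.
    let mut remaining = limit.map_or(total, |n| n.min(total));
    let mut sizes = Vec::new();
    while remaining > 0 {
        // Every batch is full except possibly the last one.
        let n = remaining.min(batch_size);
        sizes.push(n);
        remaining -= n;
    }
    sizes
}
```

Under these assumptions, only the final batch can be shorter than batch_size.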

pub fn scan_query<R: BufRead + Seek>(
    &self,
    fmt_reader: Reader<R>,
    region: Region,
    index: impl BinningIndex,
    columns: Option<Vec<String>>,
    batch_size: Option<usize>,
    limit: Option<usize>,
) -> Result<impl RecordBatchReader>

Returns an iterator yielding record batches satisfying a genomic range query.

This operation requires a BGZF source and an Index.

The scan will traverse one or more virtual position ranges and filter for records that overlap the given region. The cursor will stop at the end of the last record scanned.
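
The overlap filter can be sketched independently of the index. The example below assumes 1-based closed intervals and uses a hypothetical Alignment struct and filter_region function. It illustrates only the filtering step, not how the index selects virtual position ranges.

```rust
/// A hypothetical, minimal alignment: reference name plus a 1-based closed interval.
struct Alignment {
    chrom: String,
    start: u64,
    end: u64,
}

/// Returns true if the closed intervals [a_start, a_end] and [b_start, b_end]
/// share at least one position.
fn overlaps(a_start: u64, a_end: u64, b_start: u64, b_end: u64) -> bool {
    a_start <= b_end && b_start <= a_end
}

/// Keeps the alignments on `chrom` that overlap the query interval [start, end].
fn filter_region<'a>(
    records: &'a [Alignment],
    chrom: &str,
    start: u64,
    end: u64,
) -> Vec<&'a Alignment> {
    records
        .iter()
        .filter(|r| r.chrom.as_str() == chrom && overlaps(r.start, r.end, start, end))
        .collect()
}
```

The filter is needed because the ranges an index points to can also contain records that do not overlap the region.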

pub fn scan_unmapped<R: BufRead + Seek>(
    &self,
    fmt_reader: Reader<R>,
    index: impl BinningIndex,
    columns: Option<Vec<String>>,
    batch_size: Option<usize>,
    limit: Option<usize>,
) -> Result<impl RecordBatchReader>

Returns an iterator yielding record batches of unmapped reads.

This operation requires a BGZF source and an Index.

The scan will start at the offset where unmapped reads begin and continue until the source stream is exhausted.

pub fn scan_byte_ranges<R: BufRead + Seek>(
    &self,
    fmt_reader: Reader<R>,
    byte_ranges: Vec<(u64, u64)>,
    columns: Option<Vec<String>>,
    batch_size: Option<usize>,
    limit: Option<usize>,
) -> Result<impl RecordBatchReader>

Returns an iterator yielding record batches from specified byte ranges.

This operation requires a seekable (typically uncompressed) source.

The scan will traverse the specified byte ranges without filtering by genomic coordinates. This is useful when you have pre-computed file offsets from a custom index. The byte ranges must align with record boundaries.
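
To see why byte ranges must align with record boundaries, recall that in an uncompressed BAM stream each alignment record is prefixed by a 4-byte little-endian block_size. The std-only sketch below walks such length-prefixed records within one range. The function name read_records_in_range and the half-open interpretation of the range are assumptions for illustration, not oxbow's implementation.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Reads length-prefixed records whose start offset lies in `start..end`.
/// Each record is a 4-byte little-endian length followed by that many bytes.
/// Assumption: `start` points at the first byte of a record's length prefix.
fn read_records_in_range<R: Read + Seek>(
    src: &mut R,
    start: u64,
    end: u64,
) -> io::Result<Vec<Vec<u8>>> {
    let mut records = Vec::new();
    let mut pos = src.seek(SeekFrom::Start(start))?;
    while pos < end {
        // Read the length prefix, then the record body.
        let mut len_buf = [0u8; 4];
        src.read_exact(&mut len_buf)?;
        let len = u32::from_le_bytes(len_buf) as usize;
        let mut body = vec![0u8; len];
        src.read_exact(&mut body)?;
        pos += 4 + len as u64;
        records.push(body);
    }
    Ok(records)
}
```

If start fell in the middle of a record, the first four bytes read would be misinterpreted as a length and every subsequent record would be garbage, which is why alignment matters.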

pub fn scan_virtual_ranges<R: BufRead + Seek>(
    &self,
    fmt_reader: Reader<R>,
    vpos_ranges: Vec<(VirtualPosition, VirtualPosition)>,
    columns: Option<Vec<String>>,
    batch_size: Option<usize>,
    limit: Option<usize>,
) -> Result<impl RecordBatchReader>

Returns an iterator yielding record batches from specified virtual position ranges.

This operation requires a BGZF-compressed source.

The scan will traverse the specified virtual position ranges without filtering by genomic coordinates. This is useful when you have pre-computed virtual offsets from a custom index.
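
For readers building a custom index, it may help to recall how a BGZF virtual position is laid out: the upper 48 bits hold the byte offset of a compressed block in the file, and the lower 16 bits hold the offset within that block's uncompressed data. The helpers below are a std-only sketch with hypothetical names (pack_vpos, unpack_vpos). They operate on plain u64 values rather than the VirtualPosition type used in the signature above.

```rust
/// Packs a compressed block offset and an in-block offset into a
/// 64-bit BGZF virtual position: `coffset << 16 | uoffset`.
fn pack_vpos(coffset: u64, uoffset: u16) -> u64 {
    // The compressed offset must fit in 48 bits.
    debug_assert!(coffset < (1u64 << 48));
    (coffset << 16) | u64::from(uoffset)
}

/// Splits a virtual position back into (compressed offset, in-block offset).
fn unpack_vpos(vpos: u64) -> (u64, u16) {
    (vpos >> 16, (vpos & 0xffff) as u16)
}
```

Because the compressed offset occupies the high bits, comparing two virtual positions as integers orders them by their location in the file.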
