Struct ReadSet 

pub struct ReadSet { /* private fields */ }

A set of reads extracted from GAFBase.

This is a counterpart to Subgraph. ReadSet::new creates a set of the reads that are either fully contained in a subgraph or overlap with it, as selected by its output argument. Iterate over the reads with ReadSet::iter, or convert them to GAF lines with ReadSet::to_gaf. The reads appear in the same order as in the database.

Examples

use gbz_base::{Subgraph, SubgraphQuery, HaplotypeOutput};
use gbz_base::{GAFBase, GAFBaseParams, ReadSet, AlignmentOutput, GraphReference};
use gbz_base::utils;
use gbz::GBZ;
use simple_sds::serialize;

// Get an in-memory graph.
let gbz_file = utils::get_test_data("micb-kir3dl1.gbz");
let graph: GBZ = serialize::load_from(&gbz_file).unwrap();

// Extract a 100 bp subgraph around node 150.
let nodes = vec![150];
let query = SubgraphQuery::nodes(nodes).with_output(HaplotypeOutput::Distinct);
let mut subgraph = Subgraph::new();
let _ = subgraph.from_gbz(&graph, None, None, &query).unwrap();

// Create a database of reads aligned to the graph.
let gaf_file = utils::get_test_data("micb-kir3dl1_HG003.gaf");
let gbwt_file = None; // Build a new GBWT index.
let db_file = serialize::temp_file_name("gaf-base");
let graph_ref = GraphReference::None; // Do not store sequences in the database.
let params = GAFBaseParams::default();
let db = GAFBase::create_from_files(&gaf_file, gbwt_file, &db_file, graph_ref, &params);
assert!(db.is_ok());

// Extract all reads fully within the subgraph.
let db = GAFBase::open(&db_file);
assert!(db.is_ok());
let db = db.unwrap();
let read_set = ReadSet::new(GraphReference::Gbz(&graph), &subgraph, &db, AlignmentOutput::Contained);
assert!(read_set.is_ok());
let read_set = read_set.unwrap();
assert_eq!(read_set.len(), 148);

// The extracted reads are aligned and fully within the subgraph.
for aln in read_set.iter() {
    for handle in aln.target_path().unwrap() {
        assert!(subgraph.has_handle(*handle));
    }
}

drop(db);
let _ = std::fs::remove_file(&db_file);

Implementations

impl ReadSet

pub const CLUSTER_GAP_THRESHOLD: usize = 1000

Gap length threshold for clustering node ids.
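
To illustrate what a gap threshold means when clustering node ids, the sketch below groups sorted ids so that a new cluster starts whenever the gap to the previous id exceeds the threshold. This is a hypothetical model, not the crate's implementation: the function name cluster_node_ids, the inclusive (first, last) range representation, and the strict "greater than" comparison are all assumptions.

```rust
/// Hypothetical helper (not part of gbz-base): groups node ids into clusters.
/// Precondition: `sorted_ids` is sorted in ascending order.
/// A new cluster starts when the gap to the previous id is strictly greater
/// than `threshold`. Each cluster is returned as an inclusive (first, last) range.
fn cluster_node_ids(sorted_ids: &[usize], threshold: usize) -> Vec<(usize, usize)> {
    let mut clusters: Vec<(usize, usize)> = Vec::new();
    for &id in sorted_ids {
        if let Some(last) = clusters.last_mut() {
            if id - last.1 <= threshold {
                // Close enough to the current cluster: extend it.
                last.1 = id;
                continue;
            }
        }
        // First id, or the gap is too large: start a new cluster.
        clusters.push((id, id));
    }
    clusters
}
```

With a threshold of 1000, the ids 10, 20, 1500, 1600, and 5000 would form three clusters under this model, because the gaps 20 to 1500 and 1600 to 5000 both exceed the threshold.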

pub fn new( graph: GraphReference<'_, '_>, subgraph: &Subgraph, database: &GAFBase, output: AlignmentOutput, ) -> Result<Self, String>

Extracts a set of reads overlapping with the subgraph.

The extracted reads will be in the same order as in the database. That corresponds to the order in the original GAF file.

Arguments

  • graph: A GBZ-compatible graph for querying a reference-based GAF-base, or no graph for a reference-free one.
  • subgraph: The subgraph used as the query region.
  • database: A database storing reads aligned to the graph.
  • output: Which reads to include in the read set.

Errors

Passes through any database errors. Returns an error if an alignment cannot be decompressed.
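
The difference between reads that are fully contained in the subgraph and reads that merely overlap with it can be sketched with plain node ids. This is a simplified model under stated assumptions, not the crate's code: the Selection enum and select_reads function are hypothetical names, the crate's own AlignmentOutput type is not modeled, and treating reads with an empty path as never contained is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical selection modes for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Selection {
    /// Keep a read only if it has a non-empty path and every node is in the subgraph.
    Contained,
    /// Keep a read if at least one node on its path is in the subgraph.
    Overlapping,
}

/// Returns the indexes of the selected reads, in the same order as the input.
fn select_reads(paths: &[Vec<usize>], subgraph_nodes: &[usize], mode: Selection) -> Vec<usize> {
    let subgraph: HashSet<usize> = subgraph_nodes.iter().cloned().collect();
    paths
        .iter()
        .enumerate()
        .filter(|(_, path)| match mode {
            Selection::Contained => {
                !path.is_empty() && path.iter().all(|node| subgraph.contains(node))
            }
            Selection::Overlapping => path.iter().any(|node| subgraph.contains(node)),
        })
        .map(|(index, _)| index)
        .collect()
}
```

Because the filter walks the input in order, the selected reads keep their original relative order, which mirrors the ordering guarantee described above.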

pub fn from_rows( database: &GAFBase, row_range: Range<usize>, graph: Option<&GBZ>, ) -> Result<Self, String>

Extracts all reads from the given range of row ids.

The extracted reads will be in the same order as in the database. That corresponds to the order in the original GAF file.

Arguments

  • database: A database storing reads aligned to the graph.
  • row_range: The range of row ids to extract.
  • graph: A GBZ graph if the database is reference-based, or None for a reference-free one.

Errors

Passes through any database errors. Returns an error if an alignment cannot be decompressed.

pub fn len(&self) -> usize

Returns the number of alignment fragments in the set.

pub fn is_empty(&self) -> bool

Returns true if the set is empty.

pub fn unclipped(&self) -> usize

Returns the original number of alignments (before clipping) in the set.
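
One way to read the distinction between alignment fragments (len) and original alignments (unclipped) is that clipping a single alignment to the subgraph can leave more than one piece. The sketch below models that reading with plain node ids. It is an interpretation, not the crate's implementation: the function clip_to_subgraph is hypothetical, and it assumes that a fragment is a maximal run of consecutive path nodes inside the subgraph.

```rust
use std::collections::HashSet;

/// Hypothetical model of clipping (not part of gbz-base): splits an alignment
/// path into maximal runs of consecutive nodes that are in the subgraph.
fn clip_to_subgraph(path: &[usize], subgraph_nodes: &[usize]) -> Vec<Vec<usize>> {
    let subgraph: HashSet<usize> = subgraph_nodes.iter().cloned().collect();
    let mut fragments: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    for &node in path {
        if subgraph.contains(&node) {
            current.push(node);
        } else if !current.is_empty() {
            // Leaving the subgraph closes the current fragment.
            fragments.push(std::mem::take(&mut current));
        }
    }
    // The path may end while still inside the subgraph.
    if !current.is_empty() {
        fragments.push(current);
    }
    fragments
}
```

In this model, a path that leaves the subgraph and re-enters it counts as one original alignment but contributes two fragments.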

pub fn blocks(&self) -> usize

Returns the number of alignment blocks decompressed when creating the read set.

pub fn node_records(&self) -> usize

Returns the number of node records in the read set.

Each record corresponds to an oriented node, and the opposite orientation may not be present. This includes all node records encountered while tracing the alignments, even when the alignment was not included in the read set.

pub fn clusters(&self) -> usize

Returns the number of node id clusters in the subgraph.

pub fn iter(&self) -> impl Iterator<Item = &Alignment>

Returns an iterator over the reads in the set.

pub fn to_gaf<W: Write>(&self, writer: &mut W) -> Result<(), String>

Serializes the read set in the GAF format.

The output does not include any header lines, as the GAF file may consist of multiple read sets. Returns an error if the target sequence for a read is invalid or cannot be determined. Passes through any I/O errors.
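
Since the output has no header lines, the caller decides when to write a header and can then append several read sets to the same writer. The sketch below shows that pattern using only the standard library. The functions write_records and write_combined are hypothetical stand-ins that share the shape of to_gaf (a generic W: Write and a Result<(), String>), and the header and record strings are placeholders rather than valid GAF.

```rust
use std::io::Write;

/// Hypothetical stand-in for a `to_gaf`-style method: writes one line per
/// record, adds no header, and converts I/O errors to `String`.
fn write_records<W: Write>(records: &[&str], writer: &mut W) -> Result<(), String> {
    for record in records {
        writeln!(writer, "{}", record).map_err(|err| err.to_string())?;
    }
    Ok(())
}

/// Writes caller-supplied header lines once, followed by every record set in order.
fn write_combined<W: Write>(
    header: &[&str],
    sets: &[Vec<&str>],
    writer: &mut W,
) -> Result<(), String> {
    write_records(header, writer)?;
    for set in sets {
        write_records(set, writer)?;
    }
    Ok(())
}
```

Any type implementing Write can serve as the destination, for example a Vec<u8> for in-memory output or a buffered file writer for output on disk.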

Trait Implementations

impl Clone for ReadSet

fn clone(&self) -> ReadSet

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ReadSet

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for ReadSet

fn default() -> ReadSet

Returns the “default value” for a type.

impl PartialEq for ReadSet

fn eq(&self, other: &ReadSet) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl StructuralPartialEq for ReadSet

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.