Trait rust_htslib::bam::ext::BamRecordExtensions

pub trait BamRecordExtensions {
    fn aligned_blocks(&self) -> IterAlignedBlocks;        // Iterator, Item = [i64; 2]
    fn aligned_block_pairs(&self) -> IterAlignedBlockPairs;
    fn introns(&self) -> IterIntrons;                     // Iterator, Item = [i64; 2]
    fn aligned_pairs(&self) -> IterAlignedPairs;          // Iterator, Item = [i64; 2]
    fn aligned_pairs_full(&self) -> IterAlignedPairsFull;
    fn cigar_stats_nucleotides(&self) -> HashMap<Cigar, i32>;
    fn cigar_stats_blocks(&self) -> HashMap<Cigar, i32>;
    fn reference_positions(&self) -> Box<dyn Iterator<Item = i64>>;
    fn reference_positions_full(&self) -> Box<dyn Iterator<Item = Option<i64>>>;
    fn reference_start(&self) -> i64;
    fn reference_end(&self) -> i64;
    fn seq_len_from_cigar(&self, include_hard_clip: bool) -> usize;
}
Extra functionality for BAM records

Inspired by pysam

Required methods

fn aligned_blocks(&self) -> IterAlignedBlocks

Iterator over the start and end positions of aligned gapless blocks.

The start and end positions are in genomic coordinates. There is not necessarily a gap between consecutive blocks on the genome: blocks separated only by an insertion are directly adjacent.

pysam: blocks

See also: aligned_block_pairs if you need the read coordinates as well.
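
The block logic described above can be sketched without the crate. This is a minimal illustration, not rust_htslib's implementation: the `Op` enum and the free function `aligned_blocks` are hypothetical stand-ins for the crate's `Cigar` type and trait method, and the sketch assumes half-open [start, end) intervals.

```rust
/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
    RefSkip(u32),
    SoftClip(u32),
}

/// Collect one [start, end) reference interval per aligned (match) operation.
fn aligned_blocks(ref_start: i64, cigar: &[Op]) -> Vec<[i64; 2]> {
    let mut pos = ref_start;
    let mut blocks = Vec::new();
    for op in cigar {
        match *op {
            // Aligned bases produce a block and advance the reference.
            Op::Match(len) => {
                let end = pos + i64::from(len);
                blocks.push([pos, end]);
                pos = end;
            }
            // Deletions and skips advance the reference without a block.
            Op::Del(len) | Op::RefSkip(len) => pos += i64::from(len),
            // Insertions and soft clips do not touch the reference,
            // so the blocks on either side are adjacent.
            Op::Ins(_) | Op::SoftClip(_) => {}
        }
    }
    blocks
}
```

For a read starting at 100 with CIGAR 2S3M1I4M2D5M, the first two blocks share the boundary 103 because only an insertion separates them, while the deletion leaves a two-base gap before the third block.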

fn aligned_block_pairs(&self) -> IterAlignedBlockPairs

Iterator over ([read_start, read_stop], [genome_start, genome_stop]) blocks of continuously aligned stretches of the read.

In contrast to aligned_blocks, this returns read and genome coordinates. In contrast to aligned_pairs, this returns just the start and stop coordinates of each block.

There is not necessarily a gap between blocks in either coordinate space (this happens at indels).
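
A sketch of how read and genome coordinates can be tracked side by side. As before, `Op` and the free function `aligned_block_pairs` are hypothetical stand-ins rather than the crate's API, and half-open intervals are assumed.

```rust
/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
    SoftClip(u32),
}

/// Collect ([read_start, read_stop], [genome_start, genome_stop]) per aligned block.
fn aligned_block_pairs(ref_start: i64, cigar: &[Op]) -> Vec<([i64; 2], [i64; 2])> {
    let mut read_pos = 0i64;
    let mut genome_pos = ref_start;
    let mut out = Vec::new();
    for op in cigar {
        match *op {
            // Aligned bases advance both coordinate spaces.
            Op::Match(len) => {
                let len = i64::from(len);
                out.push((
                    [read_pos, read_pos + len],
                    [genome_pos, genome_pos + len],
                ));
                read_pos += len;
                genome_pos += len;
            }
            // Read-only operations: gap in read space, none on the genome.
            Op::Ins(len) | Op::SoftClip(len) => read_pos += i64::from(len),
            // Genome-only operation: gap on the genome, none in the read.
            Op::Del(len) => genome_pos += i64::from(len),
        }
    }
    out
}
```

With CIGAR 2S3M1I4M2D5M at position 100, the insertion shows up as a gap in read space only (5 to 6), and the deletion as a gap in genome space only (107 to 109).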

fn introns(&self) -> IterIntrons

This scans the CIGAR for reference skips and reports their start and end positions. It does not inspect the reported regions for actual splice sites.

pysam: get_introns
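
The scan for reference skips can be sketched as follows. `Op` and the free function `introns` are hypothetical stand-ins for illustration, and half-open [start, end) intervals are assumed.

```rust
/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
    RefSkip(u32),
}

/// Report the [start, end) reference interval of every reference skip.
fn introns(ref_start: i64, cigar: &[Op]) -> Vec<[i64; 2]> {
    let mut pos = ref_start;
    let mut out = Vec::new();
    for op in cigar {
        match *op {
            // Matches and deletions consume reference without being reported.
            Op::Match(len) | Op::Del(len) => pos += i64::from(len),
            // A reference skip is reported purely from the CIGAR;
            // no splice-site sequence is examined.
            Op::RefSkip(len) => {
                let start = pos;
                pos += i64::from(len);
                out.push([start, pos]);
            }
            // Insertions do not consume reference.
            Op::Ins(_) => {}
        }
    }
    out
}
```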

fn aligned_pairs(&self) -> IterAlignedPairs

Iterator over aligned read and reference positions at the base-pair level.

No entries are produced for insertions, deletions, or skipped regions.

pysam: get_aligned_pairs(matches_only = True)

See also aligned_block_pairs if you just need the start and end coordinates of each block. It conveys the same information while allocating less memory.

fn aligned_pairs_full(&self) -> IterAlignedPairsFull

Iterator over read and reference positions at the base-pair level.

Unlike aligned_pairs, this yields None for either the read position or the reference position at insertions, deletions, and skipped regions.

pysam: get_aligned_pairs(matches_only = False)
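
The relationship between the full and the matches-only pair listings can be sketched as below. `Op` and both free functions are hypothetical stand-ins, not the crate's implementation. The sketch assumes that soft-clipped bases are reported with no reference position, as pysam does.

```rust
/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
    RefSkip(u32),
    SoftClip(u32),
}

/// One [read_pos, ref_pos] entry per base; the missing side is None.
fn aligned_pairs_full(ref_start: i64, cigar: &[Op]) -> Vec<[Option<i64>; 2]> {
    let mut read_pos = 0i64;
    let mut ref_pos = ref_start;
    let mut pairs = Vec::new();
    for op in cigar {
        match *op {
            Op::Match(len) => {
                for _ in 0..len {
                    pairs.push([Some(read_pos), Some(ref_pos)]);
                    read_pos += 1;
                    ref_pos += 1;
                }
            }
            // Bases present only in the read (assumption: soft clips included).
            Op::Ins(len) | Op::SoftClip(len) => {
                for _ in 0..len {
                    pairs.push([Some(read_pos), None]);
                    read_pos += 1;
                }
            }
            // Bases present only on the reference.
            Op::Del(len) | Op::RefSkip(len) => {
                for _ in 0..len {
                    pairs.push([None, Some(ref_pos)]);
                    ref_pos += 1;
                }
            }
        }
    }
    pairs
}

/// Matches-only view: keep entries where both positions are present.
fn aligned_pairs(ref_start: i64, cigar: &[Op]) -> Vec<[i64; 2]> {
    aligned_pairs_full(ref_start, cigar)
        .into_iter()
        .filter_map(|[r, g]| Some([r?, g?]))
        .collect()
}
```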

fn cigar_stats_nucleotides(&self) -> HashMap<Cigar, i32>

The number of nucleotides covered by each Cigar::* variant.

The result is a HashMap mapping Cigar::*(0) to the number of covered nucleotides.

pysam: first result from get_cigar_stats

fn cigar_stats_blocks(&self) -> HashMap<Cigar, i32>

The number of occurrences of each Cigar::* variant.

The result is a HashMap mapping Cigar::*(0) to the number of times this Cigar::* variant appeared.

pysam: second result from get_cigar_stats
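
Both statistics can be gathered in one pass by keying the maps on a zero-length copy of each operation. The sketch below illustrates that keying scheme only. `Op`, its helper methods, and `cigar_stats` are hypothetical names, not part of rust_htslib.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
}

impl Op {
    /// Length carried by the operation.
    fn len(self) -> u32 {
        match self {
            Op::Match(n) | Op::Ins(n) | Op::Del(n) => n,
        }
    }

    /// Same variant with length 0, used as the map key.
    fn zeroed(self) -> Op {
        match self {
            Op::Match(_) => Op::Match(0),
            Op::Ins(_) => Op::Ins(0),
            Op::Del(_) => Op::Del(0),
        }
    }
}

/// Returns (nucleotides per variant, occurrences per variant).
fn cigar_stats(cigar: &[Op]) -> (HashMap<Op, i32>, HashMap<Op, i32>) {
    let mut nucleotides = HashMap::new();
    let mut blocks = HashMap::new();
    for op in cigar {
        *nucleotides.entry(op.zeroed()).or_insert(0) += op.len() as i32;
        *blocks.entry(op.zeroed()).or_insert(0) += 1;
    }
    (nucleotides, blocks)
}
```

For CIGAR 3M1I4M2D, looking up the key Op::Match(0) gives 7 covered nucleotides in the first map and 2 occurrences in the second.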

fn reference_positions(&self) -> Box<dyn Iterator<Item = i64>>

Iterator over the reference positions that this read aligns to.

Only aligned positions are returned; soft-clipped or unaligned positions within the read are excluded.

pysam: get_reference_positions(full_length=False)

fn reference_positions_full(&self) -> Box<dyn Iterator<Item = Option<i64>>>

Iterator over the reference positions that this read aligns to.

Soft-clipped or skipped positions are included as None.

pysam: get_reference_positions(full_length=True)
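
The difference between the two variants can be sketched as follows. `Op` and both free functions are hypothetical stand-ins. The sketch assumes one entry per read base in the full variant, with None for bases that have no reference position (soft clips and insertions), which mirrors pysam's full_length=True behaviour.

```rust
/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
    SoftClip(u32),
}

/// One entry per read base; None where the base has no reference position.
fn reference_positions_full(ref_start: i64, cigar: &[Op]) -> Vec<Option<i64>> {
    let mut ref_pos = ref_start;
    let mut out = Vec::new();
    for op in cigar {
        match *op {
            Op::Match(len) => {
                for _ in 0..len {
                    out.push(Some(ref_pos));
                    ref_pos += 1;
                }
            }
            // Read bases without a reference position.
            Op::Ins(len) | Op::SoftClip(len) => {
                for _ in 0..len {
                    out.push(None);
                }
            }
            // Deletions consume reference but contain no read bases.
            Op::Del(len) => ref_pos += i64::from(len),
        }
    }
    out
}

/// Aligned-only view: drop the None entries.
fn reference_positions(ref_start: i64, cigar: &[Op]) -> Vec<i64> {
    reference_positions_full(ref_start, cigar)
        .into_iter()
        .flatten()
        .collect()
}
```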

fn reference_start(&self) -> i64

Leftmost aligned reference position of the read on the reference genome.

fn reference_end(&self) -> i64

Rightmost aligned reference position of the read on the reference genome.

fn seq_len_from_cigar(&self, include_hard_clip: bool) -> usize

Infer the query length from the CIGAR string, optionally including hard-clipped bases.

Contrast with Record::seq_len, which returns the length of the sequence stored in the BAM file and is therefore 0 if the BAM file omits sequences.

pysam: infer_query_length / infer_read_length
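
Inferring the query length amounts to summing the lengths of the operations that consume read bases. A minimal sketch, with `Op` and the free function `seq_len_from_cigar` as hypothetical stand-ins for the crate's types:

```rust
/// Hypothetical, simplified stand-in for a CIGAR operation
/// (not rust_htslib's `Cigar` type).
#[derive(Clone, Copy, Debug)]
enum Op {
    Match(u32),
    Ins(u32),
    Del(u32),
    SoftClip(u32),
    HardClip(u32),
}

/// Sum the lengths of all operations that consume read bases.
fn seq_len_from_cigar(cigar: &[Op], include_hard_clip: bool) -> usize {
    let mut total = 0usize;
    for op in cigar {
        match *op {
            // These operations correspond to bases stored in the read.
            Op::Match(len) | Op::Ins(len) | Op::SoftClip(len) => total += len as usize,
            // Hard-clipped bases are absent from the stored sequence,
            // so they are counted only on request.
            Op::HardClip(len) => {
                if include_hard_clip {
                    total += len as usize;
                }
            }
            // Deletions consume reference only.
            Op::Del(_) => {}
        }
    }
    total
}
```

For CIGAR 5H2S10M1I3D4M5H this gives 17 without hard clips and 27 with them, independent of whether the record stores a sequence at all.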

Implementors