Trait BamRecordExtensions 

pub trait BamRecordExtensions {
    // Required methods
    fn matched_sequence(&self) -> Option<MatchSequence>;
    fn aligned_blocks(&self) -> IterAlignedBlocks;
    fn aligned_blocks_match(&self) -> Option<IterAlignedBlocks>;
    fn aligned_block_pairs(&self) -> IterAlignedBlockPairs;
    fn aligned_block_pairs_match(&self) -> Option<IterAlignedBlockPairs>;
    fn introns(&self) -> IterIntrons;
    fn aligned_pairs_match(&self) -> Option<IterAlignedPairs>;
    fn aligned_pairs(&self) -> IterAlignedPairs;
    fn aligned_pairs_full(&self) -> IterAlignedPairsFull;
    fn cigar_stats_nucleotides(&self) -> HashMap<Cigar, i32>;
    fn cigar_stats_blocks(&self) -> HashMap<Cigar, i32>;
    fn reference_positions(&self) -> Box<dyn Iterator<Item = i64>>;
    fn reference_positions_match(&self) -> Option<Box<dyn Iterator<Item = i64>>>;
    fn reference_positions_full(&self) -> Box<dyn Iterator<Item = Option<i64>>>;
    fn reference_start(&self) -> i64;
    fn reference_end(&self) -> i64;
    fn seq_len_from_cigar(&self, include_hard_clip: bool) -> usize;
    fn getcsaligned(&self) -> Option<CgAlignedBlockPairs>;
}

Extra functionality for BAM records

Inspired by pysam

Required Methods

fn matched_sequence(&self) -> Option<MatchSequence>

Iterator over the start and end positions, together with the sequence, of matched blocks. Returns None if the CIGAR does not contain any = operation or if the sequence is not present.

fn aligned_blocks(&self) -> IterAlignedBlocks

Iterator over the start and end positions of aligned gapless blocks.

The start and end positions are in genomic coordinates. Consecutive blocks are not necessarily separated by a gap on the genome: an insertion ends one block and the next block starts at the same reference position.

pysam: blocks

See also: aligned_block_pairs if you need the read coordinates as well.
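To make the block semantics concrete, here is a minimal standalone sketch that derives gapless reference blocks from a CIGAR. It is not the trait's implementation: the Op enum is a hypothetical stand-in for the crate's Cigar type, aligned_blocks_sketch is an illustrative name, and half-open [start, end) intervals are an assumption of this sketch.

```rust
/// Hypothetical stand-in for the crate's `Cigar` type, reduced to five operations.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Match(u32),    // M: consumes read and reference
    Ins(u32),      // I: consumes read only
    Del(u32),      // D: consumes reference only
    RefSkip(u32),  // N: consumes reference only
    SoftClip(u32), // S: consumes read only
}

/// Reference intervals of each gapless aligned block, as half-open [start, end)
/// (the half-open convention is an assumption of this sketch).
pub fn aligned_blocks_sketch(ref_start: i64, cigar: &[Op]) -> Vec<[i64; 2]> {
    let mut pos = ref_start;
    let mut blocks = Vec::new();
    for op in cigar {
        match *op {
            // An aligned run opens a new block and advances the reference.
            Op::Match(len) => {
                let len = i64::from(len);
                blocks.push([pos, pos + len]);
                pos += len;
            }
            // Deletions and skips leave a gap between blocks on the reference.
            Op::Del(len) | Op::RefSkip(len) => pos += i64::from(len),
            // Insertions and soft clips do not move the reference position,
            // so the blocks on either side of an insertion touch.
            Op::Ins(_) | Op::SoftClip(_) => {}
        }
    }
    blocks
}
```

For the CIGAR 2S5M1I3M2D4M starting at reference position 100, the two blocks around the insertion share the boundary 105, while the deletion leaves a gap between 108 and 110.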

fn aligned_blocks_match(&self) -> Option<IterAlignedBlocks>

Iterator over the start and end positions of aligned matched blocks. Returns None if the CIGAR does not contain any = operation or if the sequence is not present.

The start and end positions are in genomic coordinates. Consecutive blocks are not necessarily separated by a gap on the genome: an insertion ends one block and the next block starts at the same reference position.
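The None case can be illustrated with a small standalone sketch. It assumes that only = operations count as matched and ignores the "sequence is not present" condition; the Op enum and aligned_blocks_match_sketch are hypothetical names, not part of the trait.

```rust
/// Hypothetical stand-in for the crate's `Cigar` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Match(u32), // M: aligned, match or mismatch not specified
    Equal(u32), // =: sequence match
    Diff(u32),  // X: sequence mismatch
    Ins(u32),   // I: consumes read only
    Del(u32),   // D: consumes reference only
}

/// Half-open reference intervals of the `=` runs, or None when the CIGAR
/// contains no `=` operation at all.
pub fn aligned_blocks_match_sketch(ref_start: i64, cigar: &[Op]) -> Option<Vec<[i64; 2]>> {
    // Without any `=` there is no match information to report.
    if !cigar.iter().any(|op| matches!(op, Op::Equal(_))) {
        return None;
    }
    let mut pos = ref_start;
    let mut blocks = Vec::new();
    for op in cigar {
        match *op {
            Op::Equal(len) => {
                let len = i64::from(len);
                blocks.push([pos, pos + len]);
                pos += len;
            }
            // These consume the reference but are not reported as matched.
            Op::Match(len) | Op::Diff(len) | Op::Del(len) => pos += i64::from(len),
            Op::Ins(_) => {}
        }
    }
    Some(blocks)
}
```

A CIGAR written only with M, such as 6M, yields None in this sketch because it does not say which bases actually match.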

fn aligned_block_pairs(&self) -> IterAlignedBlockPairs

Iterator over ([read_start, read_stop], [genome_start, genome_stop]) blocks of continuously aligned read bases.

In contrast to aligned_blocks, this returns read and genome coordinates. In contrast to aligned_pairs, this returns just the start-stop coordinates of each block.

There is not necessarily a gap between blocks in either coordinate space (this happens at indels).
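The following standalone sketch shows why a gap can be missing in one coordinate space but present in the other. The Op enum and aligned_block_pairs_sketch are hypothetical names, and the half-open intervals and the read position starting at 0 (soft clips included) are assumptions of the sketch rather than documented behaviour.

```rust
/// Hypothetical stand-in for the crate's `Cigar` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Match(u32),    // M: consumes read and reference
    Ins(u32),      // I: consumes read only
    Del(u32),      // D: consumes reference only
    RefSkip(u32),  // N: consumes reference only
    SoftClip(u32), // S: consumes read only
}

/// ([read_start, read_stop], [genome_start, genome_stop]) for each aligned block.
pub fn aligned_block_pairs_sketch(ref_start: i64, cigar: &[Op]) -> Vec<([i64; 2], [i64; 2])> {
    let mut read_pos: i64 = 0;
    let mut ref_pos = ref_start;
    let mut pairs = Vec::new();
    for op in cigar {
        match *op {
            Op::Match(len) => {
                let len = i64::from(len);
                pairs.push(([read_pos, read_pos + len], [ref_pos, ref_pos + len]));
                read_pos += len;
                ref_pos += len;
            }
            // Read-only operations: gap in read space, none in genome space.
            Op::Ins(len) | Op::SoftClip(len) => read_pos += i64::from(len),
            // Reference-only operations: gap in genome space, none in read space.
            Op::Del(len) | Op::RefSkip(len) => ref_pos += i64::from(len),
        }
    }
    pairs
}
```

With 2S5M1I3M2D4M at reference position 100, the insertion separates the first two blocks only in read coordinates, and the deletion separates the last two blocks only in genome coordinates.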

fn aligned_block_pairs_match(&self) -> Option<IterAlignedBlockPairs>

Iterator over ([read_start, read_stop], [genome_start, genome_stop]) blocks of continuously matched read bases. Returns None if the CIGAR does not contain any = operation or if the sequence is not present.

In contrast to aligned_blocks, this returns read and genome coordinates. In contrast to aligned_pairs, this returns just the start-stop coordinates of each block.

There is not necessarily a gap between blocks in either coordinate space (this happens at indels).

fn introns(&self) -> IterIntrons

Scans the CIGAR for reference skips and reports their positions. It does not inspect the reported regions for actual splice sites.

pysam: get_introns
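As an illustration of scanning for reference skips, the sketch below walks a CIGAR and records each N operation as a half-open reference interval. The Op enum and introns_sketch are hypothetical, and the interval convention is an assumption of the sketch.

```rust
/// Hypothetical stand-in for the crate's `Cigar` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Match(u32),   // M: consumes read and reference
    Ins(u32),     // I: consumes read only
    Del(u32),     // D: consumes reference only
    RefSkip(u32), // N: consumes reference only, reported as an intron
}

/// Half-open reference intervals covered by `N` operations.
pub fn introns_sketch(ref_start: i64, cigar: &[Op]) -> Vec<[i64; 2]> {
    let mut pos = ref_start;
    let mut introns = Vec::new();
    for op in cigar {
        match *op {
            Op::RefSkip(len) => {
                let len = i64::from(len);
                introns.push([pos, pos + len]);
                pos += len;
            }
            // Deletions also consume the reference but are not reported.
            Op::Match(len) | Op::Del(len) => pos += i64::from(len),
            Op::Ins(_) => {}
        }
    }
    introns
}
```

Only the CIGAR is consulted, so a skip is reported whether or not the flanking reference bases look like a splice site.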

fn aligned_pairs_match(&self) -> Option<IterAlignedPairs>

Iterator over aligned read and reference positions at base-pair level.

Returns None if the CIGAR does not contain any = operation or if the sequence is not present.

Mismatches, insertions, deletions and skipped regions produce no entry.

pysam: get_aligned_pairs(matches_only = True), but restricted to matched bases

See also aligned_block_pairs if you just need the start and end coordinates of each block. That way you can allocate less memory for the same informational content.

fn aligned_pairs(&self) -> IterAlignedPairs

Iterator over aligned read and reference positions at base-pair level.

Insertions, deletions and skipped regions produce no entry.

pysam: get_aligned_pairs(matches_only = True)

See also aligned_block_pairs if you just need the start and end coordinates of each block. That way you can allocate less memory for the same informational content.

fn aligned_pairs_full(&self) -> IterAlignedPairsFull

Iterator over read and reference positions at base-pair level, together with the CIGAR.

Unlike aligned_pairs, this returns None for either the read position or the reference position at insertions, deletions or skipped regions.

pysam: aligned_pairs(matches_only = False) + cigar. For compatibility, call .map(|p| [p.0,p.1]) on the result.
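The relationship between the full and the filtered pair listings can be sketched as follows. This is a standalone illustration, not the trait's implementation: the Op enum and both function names are hypothetical, the CIGAR element carried by the real items is left out, and treating soft clips like insertions (read position present, reference position None) is an assumption borrowed from pysam.

```rust
/// Hypothetical stand-in for the crate's `Cigar` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Match(u32),    // M: consumes read and reference
    Ins(u32),      // I: consumes read only
    Del(u32),      // D: consumes reference only
    RefSkip(u32),  // N: consumes reference only
    SoftClip(u32), // S: consumes read only
}

/// One [read_pos, ref_pos] entry per base, with None on the side an operation skips.
pub fn aligned_pairs_full_sketch(ref_start: i64, cigar: &[Op]) -> Vec<[Option<i64>; 2]> {
    let mut read_pos: i64 = 0;
    let mut ref_pos = ref_start;
    let mut pairs = Vec::new();
    for op in cigar {
        match *op {
            Op::Match(len) => {
                for _ in 0..len {
                    pairs.push([Some(read_pos), Some(ref_pos)]);
                    read_pos += 1;
                    ref_pos += 1;
                }
            }
            Op::Ins(len) | Op::SoftClip(len) => {
                for _ in 0..len {
                    pairs.push([Some(read_pos), None]);
                    read_pos += 1;
                }
            }
            Op::Del(len) | Op::RefSkip(len) => {
                for _ in 0..len {
                    pairs.push([None, Some(ref_pos)]);
                    ref_pos += 1;
                }
            }
        }
    }
    pairs
}

/// Keeps only the entries where both coordinates are present.
pub fn aligned_pairs_sketch(full: &[[Option<i64>; 2]]) -> Vec<[i64; 2]> {
    full.iter()
        .filter_map(|p| match (p[0], p[1]) {
            (Some(read), Some(reference)) => Some([read, reference]),
            _ => None,
        })
        .collect()
}
```

Filtering the full listing down to entries with both coordinates gives the kind of output described for aligned_pairs, which is why the block-based methods are cheaper when only the boundaries matter.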

fn cigar_stats_nucleotides(&self) -> HashMap<Cigar, i32>

The number of nucleotides covered by each Cigar::* variant.

The result is a HashMap of Cigar::*(0) => number of covered nucleotides.

pysam: first result from get_cigar_stats

fn cigar_stats_blocks(&self) -> HashMap<Cigar, i32>

The number of occurrences of each Cigar::* variant.

The result is a HashMap of Cigar::*(0) => number of times this Cigar::* variant appeared.

pysam: second result from get_cigar_stats
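The difference between the two statistics is easiest to see side by side. The sketch below is a standalone illustration that keys the maps by the CIGAR operation character instead of Cigar::*(0); cigar_stats_sketch is a hypothetical name, and only operations that occur get an entry here, which may differ from how the trait populates its maps.

```rust
use std::collections::HashMap;

/// Returns (nucleotides per operation, occurrences per operation) for a CIGAR
/// given as (operation character, length) pairs.
pub fn cigar_stats_sketch(cigar: &[(char, u32)]) -> (HashMap<char, i32>, HashMap<char, i32>) {
    let mut nucleotides = HashMap::new();
    let mut blocks = HashMap::new();
    for &(op, len) in cigar {
        // Sum the lengths: how many bases this operation covers in total.
        *nucleotides.entry(op).or_insert(0) += len as i32;
        // Count the runs: how many times this operation appears.
        *blocks.entry(op).or_insert(0) += 1;
    }
    (nucleotides, blocks)
}
```

For 5M1I3M2D4M the M operation covers 12 nucleotides in 3 blocks, while I and D each appear once.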

fn reference_positions(&self) -> Box<dyn Iterator<Item = i64>>

Iterator over the reference positions that this read aligns to.

Only returns positions that are aligned, excluding any soft-clipped or unaligned positions within the read.

pysam: get_reference_positions(full_length=False)

fn reference_positions_match(&self) -> Option<Box<dyn Iterator<Item = i64>>>

Iterator over the reference positions that this read matches.

Only returns positions that are matched, excluding any soft-clipped, mismatched or unaligned positions within the read. Returns None if the CIGAR has no = operation.

pysam: get_reference_positions(full_length=False)

fn reference_positions_full(&self) -> Box<dyn Iterator<Item = Option<i64>>>

Iterator over the reference positions that this read aligns to.

Soft-clipped or skipped positions are included as None.

pysam: get_reference_positions(full_length=True)
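A small standalone sketch may help to contrast the full and the aligned-only listings. It assumes one entry per read base, with None for soft-clipped and inserted bases as in pysam; the Op enum and both function names are hypothetical and not taken from the trait.

```rust
/// Hypothetical stand-in for the crate's `Cigar` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Match(u32),    // M: consumes read and reference
    Ins(u32),      // I: consumes read only
    Del(u32),      // D: consumes reference only
    RefSkip(u32),  // N: consumes reference only
    SoftClip(u32), // S: consumes read only
}

/// One entry per read base: Some(reference position) if aligned, None otherwise.
pub fn reference_positions_full_sketch(ref_start: i64, cigar: &[Op]) -> Vec<Option<i64>> {
    let mut ref_pos = ref_start;
    let mut out = Vec::new();
    for op in cigar {
        match *op {
            Op::Match(len) => {
                for _ in 0..len {
                    out.push(Some(ref_pos));
                    ref_pos += 1;
                }
            }
            // Read bases without a reference partner.
            Op::Ins(len) | Op::SoftClip(len) => {
                for _ in 0..len {
                    out.push(None);
                }
            }
            // No read base here, so nothing is emitted.
            Op::Del(len) | Op::RefSkip(len) => ref_pos += i64::from(len),
        }
    }
    out
}

/// Aligned-only variant: drops the None entries.
pub fn reference_positions_sketch(ref_start: i64, cigar: &[Op]) -> Vec<i64> {
    reference_positions_full_sketch(ref_start, cigar)
        .into_iter()
        .flatten()
        .collect()
}
```

Under these assumptions the full listing always has as many entries as the read has bases, whereas the aligned-only listing can be shorter.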

fn reference_start(&self) -> i64

Leftmost aligned reference position of the read on the reference genome.

fn reference_end(&self) -> i64

Rightmost aligned reference position of the read on the reference genome.

fn seq_len_from_cigar(&self, include_hard_clip: bool) -> usize

Infers the query length from the CIGAR string, optionally including hard-clipped bases.

Contrast with record::seq_len, which returns the length of the sequence stored in the BAM file and is therefore 0 if the BAM file omits sequences.

pysam: infer_query_length / infer_read_length
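Inferring the length from the CIGAR amounts to summing the operations that consume read bases. The sketch below is a standalone illustration over (operation character, length) pairs; seq_len_from_cigar_sketch is a hypothetical name and the code is not the trait's implementation.

```rust
/// Sums the lengths of the read-consuming operations (M, I, S, =, X) and,
/// if requested, the hard-clipped bases (H) as well.
pub fn seq_len_from_cigar_sketch(cigar: &[(char, u32)], include_hard_clip: bool) -> usize {
    cigar
        .iter()
        .map(|&(op, len)| match op {
            'M' | 'I' | 'S' | '=' | 'X' => len as usize,
            // Hard-clipped bases are absent from the stored sequence,
            // so they only count when the caller asks for them.
            'H' if include_hard_clip => len as usize,
            // D, N and P consume no read bases.
            _ => 0,
        })
        .sum()
}
```

For 3H2S10M1I5M2D4S this gives 22 without hard clips and 25 with them, independent of whether the record stores a sequence.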

fn getcsaligned(&self) -> Option<CgAlignedBlockPairs>

Gets the CS tag and the perfect-match alignment from the BAM record. Tries to use the MD tag if it exists; returns None if the extension does not exist.

Implementors