Seq

Trait Seq 

pub trait Seq<'s>:
    Copy
    + Eq
    + Ord {
    type SeqVec: SeqVec;

    const BITS_PER_CHAR: usize;
    const BASES_PER_BYTE: usize;

    // Required methods
    fn len(&self) -> usize;
    fn is_empty(&self) -> bool;
    fn get(&self, _index: usize) -> u8;
    fn get_ascii(&self, _index: usize) -> u8;
    fn as_u64(&self) -> u64;
    fn revcomp_as_u64(&self) -> u64;
    fn as_u128(&self) -> u128;
    fn revcomp_as_u128(&self) -> u128;
    fn to_vec(&self) -> Self::SeqVec;
    fn to_revcomp(&self) -> Self::SeqVec;
    fn slice(&self, range: Range<usize>) -> Self;
    fn iter_bp(self) -> impl ExactSizeIterator<Item = u8>;
    fn par_iter_bp(self, context: usize) -> PaddedIt<impl ChunkIt<u32x8>>;
    fn par_iter_bp_delayed(
        self,
        context: usize,
        delay: Delay,
    ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8)>>;
    fn par_iter_bp_delayed_2(
        self,
        context: usize,
        delay1: Delay,
        delay2: Delay,
    ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8, u32x8)>>;
    fn cmp_lcp(&self, other: &Self) -> (Ordering, usize);

    // Provided methods
    fn bits_per_char(&self) -> usize { ... }
    fn to_word(&self) -> usize { ... }
    fn to_word_revcomp(&self) -> usize { ... }
    fn read_kmer(&self, k: usize, pos: usize) -> u64 { ... }
    fn read_revcomp_kmer(&self, k: usize, pos: usize) -> u64 { ... }
    fn read_kmer_u128(&self, k: usize, pos: usize) -> u128 { ... }
    fn read_revcomp_kmer_u128(&self, k: usize, pos: usize) -> u128 { ... }
}
A borrowed (non-owning) slice of characters.

The represented character values are expected to be in [0, 2^b), but they can be encoded in various ways. E.g.:

  • A &[u8] of ASCII characters, returning 8-bit values.
  • An AsciiSeq of DNA characters ACGT, interpreted as 2-bit values.
  • A PackedSeq of packed DNA characters (4 per byte), returning 2-bit values.

Each character is assumed to fit in 8 bits. Some functions take or return this ‘unpacked’ (ASCII) character.
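To make the difference between the unpacked (ASCII) and packed encodings concrete, here is a minimal std-only sketch. It is not the crate's implementation: the helper names `pack_char` and `pack_ascii` are hypothetical, and both the `(c >> 1) & 3` mapping (A=0, C=1, T=2, G=3) and the byte layout (base i in the low-to-high bit pairs of byte i/4) are assumptions made for illustration.

```rust
/// Map an ASCII DNA base to a 2-bit value.
/// Assumption: the `(c >> 1) & 3` trick, which gives A=0, C=1, T=2, G=3
/// and treats lowercase letters the same way.
fn pack_char(c: u8) -> u8 {
    (c >> 1) & 3
}

/// Pack ASCII DNA into bytes, 4 bases per byte.
/// Assumption: base `i` is stored in byte `i / 4` at bit offset `2 * (i % 4)`.
fn pack_ascii(ascii: &[u8]) -> Vec<u8> {
    let mut packed = vec![0u8; (ascii.len() + 3) / 4];
    for (i, &c) in ascii.iter().enumerate() {
        packed[i / 4] |= pack_char(c) << (2 * (i % 4));
    }
    packed
}
```

Under these assumptions, the 4 ASCII bytes of "ACGT" shrink to a single packed byte, which is the storage saving that a packed representation is after.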

Required Associated Constants

const BITS_PER_CHAR: usize

Number of bits b used to represent each character returned by iter_bp and its variants.

const BASES_PER_BYTE: usize

Number of encoded characters per byte of memory of the Seq.

Required Associated Types

type SeqVec: SeqVec

The corresponding owned sequence type.

Required Methods

fn len(&self) -> usize

The length of the sequence in characters.

fn is_empty(&self) -> bool

Returns true if the sequence is empty.

fn get(&self, _index: usize) -> u8

Get the character at the given index.

fn get_ascii(&self, _index: usize) -> u8

Get the ASCII character at the given index, without mapping to b-bit values.

fn as_u64(&self) -> u64

Convert a short sequence (kmer) to a packed representation as u64.

fn revcomp_as_u64(&self) -> u64

Convert a short sequence (kmer) to a packed representation of its reverse complement as u64.
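As an illustration of what a packed k-mer and its reverse complement might look like, the following std-only sketch packs an ASCII k-mer into a u64. The function names are hypothetical, and the encoding (A=0, C=1, T=2, G=3, first base in the lowest bits, complement as XOR with 2) is an assumption; the crate's actual bit layout may differ.

```rust
/// Assumed 2-bit encoding: A=0, C=1, T=2, G=3.
fn pack_char(c: u8) -> u8 {
    (c >> 1) & 3
}

/// Pack an ASCII k-mer (k <= 32) into a u64, first base in the lowest bits.
fn kmer_as_u64(kmer: &[u8]) -> u64 {
    assert!(kmer.len() <= 32, "a u64 holds at most 32 two-bit bases");
    kmer.iter()
        .enumerate()
        .fold(0u64, |acc, (i, &c)| acc | ((pack_char(c) as u64) << (2 * i)))
}

/// Pack the reverse complement of an ASCII k-mer (k <= 32) into a u64.
/// With the assumed encoding, complementing a base is `code ^ 2`
/// (A <-> T and C <-> G), and reversing is done by iterating backwards.
fn kmer_revcomp_as_u64(kmer: &[u8]) -> u64 {
    assert!(kmer.len() <= 32, "a u64 holds at most 32 two-bit bases");
    kmer.iter()
        .rev()
        .enumerate()
        .fold(0u64, |acc, (i, &c)| acc | (((pack_char(c) ^ 2) as u64) << (2 * i)))
}
```

With 2 bits per base, a u64 has room for k up to 32 and a u128 for k up to 64, which is why both widths are offered.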

fn as_u128(&self) -> u128

Convert a short sequence (kmer) to a packed representation as u128.

fn revcomp_as_u128(&self) -> u128

Convert a short sequence (kmer) to a packed representation of its reverse complement as u128.

fn to_vec(&self) -> Self::SeqVec

Convert to an owned version.

fn to_revcomp(&self) -> Self::SeqVec

Compute the reverse complement of this sequence.
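For an unpacked ASCII sequence, the reverse complement can be sketched as follows. This is an illustration only, with hypothetical helper names; leaving non-ACGT bytes unchanged is an assumption of this sketch, not documented behavior of the crate.

```rust
/// Complement a single uppercase ASCII base.
/// Assumption: bytes other than A, C, G, T are passed through unchanged.
fn complement(c: u8) -> u8 {
    match c {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        other => other,
    }
}

/// Build an owned reverse complement: walk the input backwards and
/// complement every base.
fn revcomp_ascii(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&c| complement(c)).collect()
}
```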

fn slice(&self, range: Range<usize>) -> Self

Get a sub-slice of the sequence. range indicates character indices.

fn iter_bp(self) -> impl ExactSizeIterator<Item = u8>

Iterate over the b-bit characters of the sequence.

fn par_iter_bp(self, context: usize) -> PaddedIt<impl ChunkIt<u32x8>>

Iterate over 8 chunks of b-bit characters of the sequence in parallel.

This splits the input into 8 chunks and streams over them in parallel. Alongside the iterator, the result reports the number of ‘padding’ characters that were added to fill all SIMD lanes; the last padding returned elements (from the last lane or lanes) should therefore be ignored. The context can be, for example, the k-mer size being iterated. When context>1, consecutive chunks overlap by context-1 bases.

Expected to be implemented using SIMD instructions.
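The chunking and padding arithmetic described above can be sketched without SIMD. The following is one plausible way to split the work, written under stated assumptions rather than taken from the crate: the function `lane_ranges` is hypothetical, every lane is assigned the same number of window start positions, and a lane's character range extends context-1 positions past its last start so that neighbouring lanes overlap.

```rust
use std::ops::Range;

/// Number of parallel lanes (one per 32-bit element of a `u32x8`).
const LANES: usize = 8;

/// For a sequence of `len` characters and a given `context`, return the
/// character range read by each lane, plus the number of padding elements.
/// Assumption: `len - (context - 1)` windows are spread evenly over the
/// lanes, rounding the per-lane count up.
fn lane_ranges(len: usize, context: usize) -> (Vec<Range<usize>>, usize) {
    assert!(context >= 1, "context must be at least 1");
    // Number of length-`context` windows that fit in the sequence.
    let num_windows = len.saturating_sub(context - 1);
    // Window start positions handled by each lane, rounded up.
    let per_lane = (num_windows + LANES - 1) / LANES;
    // Elements produced beyond the real windows; callers should ignore them.
    let padding = LANES * per_lane - num_windows;
    let ranges = (0..LANES)
        .map(|lane| {
            let start = lane * per_lane;
            let end = if per_lane == 0 {
                start
            } else {
                start + per_lane + context - 1
            };
            // Clamp to the sequence; the trailing lanes may be short or empty.
            start.min(len)..end.min(len)
        })
        .collect();
    (ranges, padding)
}
```

For example, with len=20 and context=3 there are 18 windows, so each lane handles 3 starts and 6 of the 24 produced elements are padding. Lane 0 reads characters 0..5 and lane 1 reads 3..8, overlapping by context-1 = 2 characters.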

fn par_iter_bp_delayed( self, context: usize, delay: Delay, ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8)>>

Iterate over 8 chunks of the sequence in parallel, returning two characters offset by delay positions.

Returned pairs are (add, remove), and the first delay ‘remove’ characters are always 0.

For example, when the sequence starts as ABCDEF..., and delay=2, the first returned tuples in the first lane are: (b'A', 0), (b'B', 0), (b'C', b'A'), (b'D', b'B').

When context>1, consecutive chunks overlap by context-1 bases: the first context-1 ‘added’ characters of the second chunk overlap with the last context-1 ‘added’ characters of the first chunk.
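A scalar, single-lane sketch may help show why the (add, remove) pairs are useful: they let a sliding-window statistic be updated in constant time per step. The helpers below are hypothetical and operate on plain ASCII bytes, not on the crate's SIMD types.

```rust
/// Produce (add, remove) pairs for one lane: `remove` is the character
/// `delay` positions before `add`, or 0 for the first `delay` positions.
fn delayed_pairs(seq: &[u8], delay: usize) -> Vec<(u8, u8)> {
    (0..seq.len())
        .map(|i| {
            let remove = if i >= delay { seq[i - delay] } else { 0 };
            (seq[i], remove)
        })
        .collect()
}

/// Count occurrences of `target` in the window of the last `w` characters
/// ending at each position, using only the (add, remove) stream.
/// Assumption: `target` is non-zero, so the initial 0 `remove` values
/// never match it.
fn rolling_count(seq: &[u8], w: usize, target: u8) -> Vec<usize> {
    let mut count = 0usize;
    delayed_pairs(seq, w)
        .into_iter()
        .map(|(add, remove)| {
            if add == target {
                count += 1;
            }
            if remove == target {
                count -= 1;
            }
            count
        })
        .collect()
}
```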

fn par_iter_bp_delayed_2( self, context: usize, delay1: Delay, delay2: Delay, ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8, u32x8)>>

Iterate over 8 chunks of the sequence in parallel, returning three characters: the character added, the one delay1 positions before it, and the one delay2 positions before it.

Requires delay1 <= delay2.

Returned triples are (add, d1, d2). The first delay1 d1 characters and the first delay2 d2 characters are always 0.

For example, when the sequence starts as ABCDEF..., and delay1=2 and delay2=3, the first returned tuples in the first lane are: (b'A', 0, 0), (b'B', 0, 0), (b'C', b'A', 0), (b'D', b'B', b'A').

When context>1, consecutive chunks overlap by context-1 bases: the first context-1 ‘added’ characters of the second chunk overlap with the last context-1 ‘added’ characters of the first chunk.
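The triple-returning variant can be sketched for a single lane in the same scalar style. The function name is hypothetical and the code works on plain ASCII bytes; it only illustrates the (add, d1, d2) semantics described above.

```rust
/// Produce (add, d1, d2) triples for one lane, where d1 and d2 are the
/// characters `delay1` and `delay2` positions before `add`, or 0 when that
/// position lies before the start of the sequence.
fn delayed_triples(seq: &[u8], delay1: usize, delay2: usize) -> Vec<(u8, u8, u8)> {
    assert!(delay1 <= delay2, "requires delay1 <= delay2");
    // Look back `d` positions from index `i`, yielding 0 when out of range.
    let back = |i: usize, d: usize| if i >= d { seq[i - d] } else { 0 };
    (0..seq.len())
        .map(|i| (seq[i], back(i, delay1), back(i, delay2)))
        .collect()
}
```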

fn cmp_lcp(&self, other: &Self) -> (Ordering, usize)

Compare the two sequences, returning their ordering together with the length of their longest common prefix (LCP).
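For plain byte slices, a combined comparison and LCP computation can be sketched as below. The free function `cmp_lcp_bytes` is hypothetical and compares raw byte values; a packed implementation would presumably compare many characters at once, which this sketch does not attempt.

```rust
use std::cmp::Ordering;

/// Return the lexicographic ordering of `a` and `b` together with the
/// length of their longest common prefix.
fn cmp_lcp_bytes(a: &[u8], b: &[u8]) -> (Ordering, usize) {
    // Count leading positions where both sequences agree.
    let lcp = a.iter().zip(b).take_while(|(x, y)| x == y).count();
    // The first differing character decides the order; if one sequence
    // ends first, the shorter one sorts first.
    let ord = match (a.get(lcp), b.get(lcp)) {
        (Some(x), Some(y)) => x.cmp(y),
        (None, Some(_)) => Ordering::Less,
        (Some(_), None) => Ordering::Greater,
        (None, None) => Ordering::Equal,
    };
    (ord, lcp)
}
```

Returning both values from one pass avoids scanning the common prefix twice, which is handy when sorting suffixes or building LCP arrays.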

Provided Methods

fn bits_per_char(&self) -> usize

Convenience function that returns b=Self::BITS_PER_CHAR.

fn to_word(&self) -> usize

👎Deprecated: Prefer as_u64.

Convert a short sequence (kmer) to a packed representation as usize.

fn to_word_revcomp(&self) -> usize

👎Deprecated: Prefer revcomp_as_u64.

Convert a short sequence (kmer) to a packed representation of its reverse complement as usize.

fn read_kmer(&self, k: usize, pos: usize) -> u64

Extract a k-mer from this sequence.

fn read_revcomp_kmer(&self, k: usize, pos: usize) -> u64

Extract a reverse complement k-mer from this sequence.
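As a rough picture of k-mer extraction from packed data, the sketch below reads k bases starting at pos from a packed byte buffer. Everything here is an assumption for illustration: the function names are hypothetical, pos is taken to be the start index of the k-mer, the layout is 4 bases per byte with base i at bit offset 2 * (i % 4) of byte i / 4, and the encoding is A=0, C=1, T=2, G=3.

```rust
/// Read the 2-bit code of base `i` from a packed buffer (assumed layout:
/// 4 bases per byte, base `i` at bit offset `2 * (i % 4)` of byte `i / 4`).
fn get_packed(packed: &[u8], i: usize) -> u8 {
    (packed[i / 4] >> (2 * (i % 4))) & 3
}

/// Extract the k-mer starting at `pos` as a u64, first base in the low bits.
fn read_kmer_sketch(packed: &[u8], k: usize, pos: usize) -> u64 {
    assert!(k <= 32, "a u64 holds at most 32 two-bit bases");
    (0..k).fold(0u64, |acc, j| {
        acc | ((get_packed(packed, pos + j) as u64) << (2 * j))
    })
}

/// Extract the reverse complement of the k-mer starting at `pos`.
/// Bases are read back to front and complemented with `code ^ 2`, which
/// swaps A <-> T and C <-> G under the assumed encoding.
fn read_revcomp_kmer_sketch(packed: &[u8], k: usize, pos: usize) -> u64 {
    assert!(k <= 32, "a u64 holds at most 32 two-bit bases");
    (0..k).fold(0u64, |acc, j| {
        let base = get_packed(packed, pos + k - 1 - j) ^ 2;
        acc | ((base as u64) << (2 * j))
    })
}
```

Under these assumptions the buffer [0xB4, 0x01] encodes ACGTCAAA, so reading k=3 at pos=1 yields the packed form of CGT, and the reverse-complement read yields the packed form of ACG.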

fn read_kmer_u128(&self, k: usize, pos: usize) -> u128

Extract a k-mer from this sequence, packed as a u128.

fn read_revcomp_kmer_u128(&self, k: usize, pos: usize) -> u128

Extract a reverse complement k-mer from this sequence, packed as a u128.

Dyn Compatibility

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementations on Foreign Types

impl Seq<'_> for &[u8]

Maps ASCII to [0, 4) on the fly. Prefer first packing into a PackedSeqVec for storage.

fn to_vec(&self) -> Vec<u8>

Convert to an owned version.

fn iter_bp(self) -> impl ExactSizeIterator<Item = u8>

Iterate over the ASCII characters.

fn par_iter_bp(self, context: usize) -> PaddedIt<impl ChunkIt<u32x8>>

Iterate over the ASCII characters in parallel.

const BASES_PER_BYTE: usize = 1usize

const BITS_PER_CHAR: usize = 8usize

type SeqVec = Vec<u8>

fn len(&self) -> usize

fn is_empty(&self) -> bool

fn get(&self, index: usize) -> u8

fn get_ascii(&self, index: usize) -> u8

fn as_u64(&self) -> u64

fn revcomp_as_u64(&self) -> u64

fn as_u128(&self) -> u128

fn revcomp_as_u128(&self) -> u128

fn to_revcomp(&self) -> Vec<u8>

fn slice(&self, range: Range<usize>) -> Self

fn par_iter_bp_delayed( self, context: usize, delay: Delay, ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8)>>

fn par_iter_bp_delayed_2( self, context: usize, delay1: Delay, delay2: Delay, ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8, u32x8)>>

fn cmp_lcp(&self, other: &Self) -> (Ordering, usize)

Implementors

impl<'s> Seq<'s> for AsciiSeq<'s>

Maps ASCII to [0, 4) on the fly. Prefer first packing into a PackedSeqVec for storage.