pub trait Seq<'s>: Copy + Eq + Ord {
    type SeqVec: SeqVec;
    const BITS_PER_CHAR: usize;
    const BASES_PER_BYTE: usize;

    // Required methods
    fn len(&self) -> usize;
    fn is_empty(&self) -> bool;
    fn get(&self, _index: usize) -> u8;
    fn get_ascii(&self, _index: usize) -> u8;
    fn as_u64(&self) -> u64;
    fn revcomp_as_u64(&self) -> u64;
    fn as_u128(&self) -> u128;
    fn revcomp_as_u128(&self) -> u128;
    fn to_vec(&self) -> Self::SeqVec;
    fn to_revcomp(&self) -> Self::SeqVec;
    fn slice(&self, range: Range<usize>) -> Self;
    fn iter_bp(self) -> impl ExactSizeIterator<Item = u8>;
    fn par_iter_bp(self, context: usize) -> PaddedIt<impl ChunkIt<u32x8>>;
    fn par_iter_bp_delayed(
        self,
        context: usize,
        delay: Delay,
    ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8)>>;
    fn par_iter_bp_delayed_2(
        self,
        context: usize,
        delay1: Delay,
        delay2: Delay,
    ) -> PaddedIt<impl ChunkIt<(u32x8, u32x8, u32x8)>>;
    fn cmp_lcp(&self, other: &Self) -> (Ordering, usize);

    // Provided methods
    fn bits_per_char(&self) -> usize { ... }
    fn to_word(&self) -> usize { ... }
    fn to_word_revcomp(&self) -> usize { ... }
    fn read_kmer(&self, k: usize, pos: usize) -> u64 { ... }
    fn read_revcomp_kmer(&self, k: usize, pos: usize) -> u64 { ... }
    fn read_kmer_u128(&self, k: usize, pos: usize) -> u128 { ... }
    fn read_revcomp_kmer_u128(&self, k: usize, pos: usize) -> u128 { ... }
}
A non-owned slice of characters.
The represented character values are expected to be in [0, 2^b), but they can be encoded in various ways. E.g.:
- A &[u8] of ASCII characters, returning 8-bit values.
- An AsciiSeq of DNA characters ACGT, interpreted as 2-bit values.
- A PackedSeq of packed DNA characters (4 per byte), returning 2-bit values.
Each character is assumed to fit in 8 bits. Some functions take or return this ‘unpacked’ (ASCII) character.
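As a rough illustration of the encodings listed above, the following standalone sketch maps ASCII DNA characters to 2-bit values and packs four of them per byte. It does not use this crate: the mapping (c >> 1) & 3 (giving A=0, C=1, T=2, G=3), the bit order within a byte, and the helper names pack_char, unpack_char, pack and get are all assumptions made for the example, since this page does not specify the actual memory layout.

```rust
/// Map an ASCII DNA character to a 2-bit value.
/// Assumption: A=0, C=1, T=2, G=3, taken from bits 1 and 2 of the ASCII code.
fn pack_char(c: u8) -> u8 {
    (c >> 1) & 3
}

/// Inverse of `pack_char` under the same assumed mapping.
fn unpack_char(v: u8) -> u8 {
    b"ACTG"[(v & 3) as usize]
}

/// Pack ASCII DNA into bytes, 4 bases per byte.
/// Hypothetical layout: the first base of each group sits in the lowest two bits.
fn pack(ascii: &[u8]) -> Vec<u8> {
    ascii
        .chunks(4)
        .map(|group| {
            group
                .iter()
                .enumerate()
                .fold(0u8, |byte, (i, &c)| byte | (pack_char(c) << (2 * i)))
        })
        .collect()
}

/// Read the 2-bit value at character index `i` from the packed bytes.
fn get(packed: &[u8], i: usize) -> u8 {
    (packed[i / 4] >> (2 * (i % 4))) & 3
}
```

Under these assumptions, a 6-character sequence occupies 2 bytes, and get returns values in [0, 4) that unpack_char turns back into ASCII.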
Required Associated Constants§
const BITS_PER_CHAR: usize
Number of bits b to represent each character returned by iter_bp and variants.
const BASES_PER_BYTE: usize
Number of encoded characters per byte of memory of the Seq.
Required Associated Types§
Required Methods§
fn get_ascii(&self, _index: usize) -> u8
Get the ASCII character at the given index, without mapping to b-bit values.
fn revcomp_as_u64(&self) -> u64
Convert a short sequence (kmer) to a packed representation of its reverse complement as u64.
fn revcomp_as_u128(&self) -> u128
Convert a short sequence (kmer) to a packed representation of its reverse complement as u128.
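To make "packed representation of its reverse complement" concrete, here is a minimal standalone sketch that works on ASCII input. It assumes the 2-bit mapping A=0, C=1, T=2, G=3 and places the first character in the lowest bits; both choices, and the function names, are illustrative assumptions and not necessarily what the crate does.

```rust
/// Assumed 2-bit mapping: A=0, C=1, T=2, G=3.
fn pack_char(c: u8) -> u64 {
    ((c >> 1) & 3) as u64
}

/// Pack a k-mer (k <= 32) into a u64, first character in the lowest bits.
fn kmer_as_u64(kmer: &[u8]) -> u64 {
    assert!(kmer.len() <= 32);
    kmer.iter()
        .enumerate()
        .fold(0, |word, (i, &c)| word | (pack_char(c) << (2 * i)))
}

/// Pack the reverse complement of a k-mer into a u64 with the same layout.
/// Under the assumed mapping, complementing is XOR with 2 (A<->T, C<->G).
fn kmer_revcomp_as_u64(kmer: &[u8]) -> u64 {
    assert!(kmer.len() <= 32);
    kmer.iter()
        .rev()
        .enumerate()
        .fold(0, |word, (i, &c)| word | ((pack_char(c) ^ 2) << (2 * i)))
}
```

With this layout, the reverse complement of AAC packs to the same word as GTT, and a palindromic k-mer such as ACGT packs to the same word in both directions.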
fn to_revcomp(&self) -> Self::SeqVec
Compute the reverse complement of this sequence.
fn slice(&self, range: Range<usize>) -> Self
Get a sub-slice of the sequence. range indicates character indices.
fn iter_bp(self) -> impl ExactSizeIterator<Item = u8>
Iterate over the b-bit characters of the sequence.
fn par_iter_bp(self, context: usize) -> PaddedIt<impl ChunkIt<u32x8>>
Iterate over 8 chunks of b-bit characters of the sequence in parallel.
This splits the input into 8 chunks and streams over them in parallel. The returned PaddedIt also reports the number of ‘padding’ characters that were added to get a full number of SIMD lanes. Thus, the last padding returned elements (from the last lane(s)) should be ignored.
The context can be e.g. the k-mer size being iterated. When context>1, consecutive chunks overlap by context-1 bases.
Expected to be implemented using SIMD instructions.
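The relation between chunks, context and padding can be sketched with a simplified scalar model. The function lane_layout below is hypothetical and is not the crate's implementation, which may round chunk sizes differently; it only shows why consecutive chunks overlap by context-1 characters and where the padding comes from.

```rust
/// Split `len` characters into 8 lanes for windows (e.g. k-mers) of size `context`.
/// Returns (start offset of each lane, characters read per lane, padding).
/// Simplified model: the real implementation may round chunk sizes differently.
fn lane_layout(len: usize, context: usize) -> ([usize; 8], usize, usize) {
    assert!(context >= 1);
    // Number of windows of `context` consecutive characters.
    let windows = len.saturating_sub(context - 1);
    // Each lane handles the same number of windows, rounded up.
    let per_lane = windows.div_ceil(8);
    // Consecutive lanes start `per_lane` characters apart ...
    let starts: [usize; 8] = std::array::from_fn(|lane| lane * per_lane);
    // ... and each lane reads `context - 1` extra characters,
    // which are also the first characters of the next lane.
    let chunk_len = per_lane + context - 1;
    // Windows produced by all lanes minus the windows that really exist.
    let padding = 8 * per_lane - windows;
    (starts, chunk_len, padding)
}
```

In this model, 100 characters with context=5 give 96 windows, 12 per lane, chunks of 16 characters that overlap by 4, and no padding. With context=1 the same input gives 13 characters per lane and 4 padding elements at the end of the last lane.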
fn par_iter_bp_delayed(self, context: usize, delay: Delay) -> PaddedIt<impl ChunkIt<(u32x8, u32x8)>>
Iterate over 8 chunks of the sequence in parallel, returning two characters offset by delay positions.
Returned pairs are (add, remove), and the first delay ‘remove’ characters are always 0.
For example, when the sequence starts as ABCDEF..., and delay=2, the first returned tuples in the first lane are: (b'A', 0), (b'B', 0), (b'C', b'A'), (b'D', b'B').
When context>1, consecutive chunks overlap by context-1 bases: the first context-1 ‘added’ characters of the second chunk overlap with the last context-1 ‘added’ characters of the first chunk.
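The (add, remove) stream of a single lane can be modelled with plain scalar code. The sketch below is not the SIMD implementation and does not use the crate's Delay type; delayed_pairs and rolling_count are hypothetical helpers that reproduce the example above and show one typical use, a sliding-window count.

```rust
/// Scalar model of one lane: yield (add, remove), where `remove`
/// trails `add` by `delay` positions.
fn delayed_pairs(seq: &[u8], delay: usize) -> Vec<(u8, u8)> {
    (0..seq.len())
        .map(|i| {
            let add = seq[i];
            // The first `delay` removed characters are 0: nothing to remove yet.
            let remove = if i >= delay { seq[i - delay] } else { 0 };
            (add, remove)
        })
        .collect()
}

/// Count occurrences of `target` in the last `window` characters at each position,
/// updating the count from the added and removed character only.
fn rolling_count(seq: &[u8], window: usize, target: u8) -> Vec<usize> {
    // 0 is the placeholder for "nothing removed", so it cannot be counted.
    assert!(target != 0);
    let mut count = 0usize;
    delayed_pairs(seq, window)
        .into_iter()
        .map(|(add, remove)| {
            count += (add == target) as usize;
            count -= (remove == target) as usize;
            count
        })
        .collect()
}
```

Because each step only looks at one added and one removed character, the cost per position is independent of the window size.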
fn par_iter_bp_delayed_2(self, context: usize, delay1: Delay, delay2: Delay) -> PaddedIt<impl ChunkIt<(u32x8, u32x8, u32x8)>>
Iterate over 8 chunks of the sequence in parallel, returning three characters: the char added, the one delay1 positions before, and the one delay2 positions before.
Requires delay1 <= delay2.
Returned triples are (add, d1, d2). The first delay1 d1 characters and the first delay2 d2 characters are always 0.
For example, when the sequence starts as ABCDEF..., and delay1=2 and delay2=3, the first returned tuples in the first lane are: (b'A', 0, 0), (b'B', 0, 0), (b'C', b'A', 0), (b'D', b'B', b'A').
When context>1, consecutive chunks overlap by context-1 bases: the first context-1 ‘added’ characters of the second chunk overlap with the last context-1 ‘added’ characters of the first chunk.
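The same scalar model extends to two delays. The helper delayed_triples below is hypothetical, takes plain usize delays instead of the crate's Delay type, and only mirrors the single-lane behaviour described above.

```rust
/// Scalar model of one lane: yield (add, d1, d2), where `d1` and `d2`
/// trail `add` by `delay1` and `delay2` positions respectively.
fn delayed_triples(seq: &[u8], delay1: usize, delay2: usize) -> Vec<(u8, u8, u8)> {
    assert!(delay1 <= delay2);
    // Character `delay` positions before index `i`, or 0 if there is none yet.
    let at = |i: usize, delay: usize| if i >= delay { seq[i - delay] } else { 0 };
    (0..seq.len())
        .map(|i| (seq[i], at(i, delay1), at(i, delay2)))
        .collect()
}
```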
Provided Methods§
fn bits_per_char(&self) -> usize
Convenience function that returns b=Self::BITS_PER_CHAR.
fn to_word(&self) -> usize
👎Deprecated: Prefer to_u64.
Convert a short sequence (kmer) to a packed representation as usize.
fn to_word_revcomp(&self) -> usize
👎Deprecated: Prefer revcomp_to_u64.
Convert a short sequence (kmer) to a packed representation of its reverse complement as usize.
fn read_revcomp_kmer(&self, k: usize, pos: usize) -> u64
Extract a reverse complement k-mer from this sequence.
fn read_kmer_u128(&self, k: usize, pos: usize) -> u128
Extract a k-mer from this sequence.
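As an illustration of what extracting a k-mer at a given position involves, the following standalone sketch reads k consecutive 2-bit characters from packed bytes into a u128. The layout (4 bases per byte, first base in the lowest bits, first k-mer character in the lowest bits of the result) and the free function read_kmer_u128 are assumptions for the example; the provided trait method may be implemented differently.

```rust
/// Read the k-mer of length `k` (k <= 64) starting at character index `pos`
/// from 2-bit packed data.
/// Assumed layout: 4 bases per byte, first base in the lowest two bits.
fn read_kmer_u128(packed: &[u8], k: usize, pos: usize) -> u128 {
    assert!(k <= 64);
    (0..k).fold(0, |word, i| {
        let idx = pos + i;
        // Select the byte, shift the wanted base down, and mask it to 2 bits.
        let value = (packed[idx / 4] >> (2 * (idx % 4))) & 3;
        word | ((value as u128) << (2 * i))
    })
}
```

For example, with A=0, C=1, T=2, G=3, the bytes [0b1011_0100, 0b0100_1010] encode ACGTTTAC, and reading k=3 at pos=2 yields the packed value of GTT.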
fn read_revcomp_kmer_u128(&self, k: usize, pos: usize) -> u128
Extract a reverse complement k-mer from this sequence.
Dyn Compatibility§
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.
Implementations on Foreign Types§
impl Seq<'_> for &[u8]
Maps ASCII to [0, 4) on the fly. Prefer first packing into a PackedSeqVec for storage.
fn iter_bp(self) -> impl ExactSizeIterator<Item = u8>
Iterate over the ASCII characters.
fn par_iter_bp(self, context: usize) -> PaddedIt<impl ChunkIt<u32x8>>
Iterate over the ASCII characters in parallel.