Coding/Decoding trait for bit-packable enums representing biological alphabets
The dna, iupac, text, and amino alphabets are built in.
This trait implements the translation between the UTF-8 representation of an alphabet and its efficient bit-packing.
The BITS associated constant stores the number of bits used by the representation.
use bio_seq::prelude::{Dna, Codec};
use bio_seq::codec::text;
assert_eq!(Dna::BITS, 2);
assert_eq!(text::Dna::BITS, 8);
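To make the idea of translating between text and a bit-packed form concrete, here is a minimal standard-library sketch that does not use bio_seq at all. The function names (encode_base, decode_base, pack) are hypothetical, and the choice to store the first base in the least significant bits is an assumption made for illustration, not a statement about how the crate lays out its data.

```rust
/// Hypothetical helper: map an ASCII nucleotide to a 2-bit code
/// (A: 00, C: 01, G: 10, T: 11). Returns None for any other byte,
/// which is why encoding from text is fallible.
fn encode_base(b: u8) -> Option<u8> {
    match b {
        b'A' => Some(0b00),
        b'C' => Some(0b01),
        b'G' => Some(0b10),
        b'T' => Some(0b11),
        _ => None,
    }
}

/// Hypothetical helper: map a 2-bit code back to its character.
/// Only the low two bits are inspected, so this direction cannot fail.
fn decode_base(code: u8) -> char {
    match code & 0b11 {
        0b00 => 'A',
        0b01 => 'C',
        0b10 => 'G',
        _ => 'T',
    }
}

/// Pack up to 32 bases into one u64 at 2 bits per base.
/// Assumption: the first base occupies the least significant bits.
fn pack(seq: &str) -> Option<u64> {
    if seq.len() > 32 {
        return None; // 32 bases * 2 bits fill the 64-bit word
    }
    let mut word = 0u64;
    for (i, b) in seq.bytes().enumerate() {
        // `?` propagates the failure for any byte outside the alphabet
        word |= (encode_base(b)? as u64) << (2 * i);
    }
    Some(word)
}
```

Under these assumptions, pack("ACGT") yields Some(0b11_10_01_00), while pack("ACGN") yields None because N is not in the 2-bit alphabet. The same four characters stored as 8-bit text would need 32 bits instead of 8.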
§Deriving custom Codecs
Custom encodings can be easily defined on enums using the derivable Codec trait.
use bio_seq::prelude;
use bio_seq::prelude::Codec;
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Codec)]
pub enum Dna {
A = 0b00,
C = 0b01,
G = 0b10,
T = 0b11,
}
Modules§
- amino: 6-bit representation of amino acids
- dna: 2-bit DNA representation: A: 00, C: 01, G: 10, T: 11
- iupac: 4-bit IUPAC nucleotide ambiguity codes
- text: 8-bit UTF-8/ASCII representation of nucleotides
Traits§
- Codec: The binary encodings of an alphabet’s characters are represented with u8s. Encoding from UTF-8 or a raw u8 will always be fallible but often can be assumed safe.
- Complement: Nucleotide alphabets that can be complemented implement Complement
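The two points above, fallible encoding from a raw u8 and complementation, can be sketched with plain integers. This is only an illustration under the 2-bit layout listed for DNA (A: 00, C: 01, G: 10, T: 11). The function names are hypothetical and nothing here is claimed to be how the crate implements Codec or Complement.

```rust
/// Hypothetical fallible conversion from a raw byte to a 2-bit code.
/// Only the values 0..=3 fit in two bits, so anything else is rejected.
fn code_from_raw(raw: u8) -> Option<u8> {
    if raw < 4 {
        Some(raw)
    } else {
        None
    }
}

/// Complement under the A: 00, C: 01, G: 10, T: 11 layout.
/// Flipping both bits swaps A with T and C with G.
fn complement_code(code: u8) -> u8 {
    (code ^ 0b11) & 0b11
}
```

With this layout the complement is a single XOR, and applying it twice returns the original code. A raw byte that is already known to come from a valid encoding can skip the range check, which is the sense in which a fallible conversion can often be assumed safe.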