Crate nucs


nucs is a library for working with nucleotide and amino acid sequences.

The goal is to supply useful tools for working with DNA and peptides while integrating, where possible, with the rest of the Rust ecosystem.

// `Nuc` represents concrete nucleotides, and `Dna` holds `Nuc`s
use nucs::{Dna, Nuc};

// `Dna` can be parsed, modified and displayed.
let mut dna: Dna = "CATG".parse()?;
dna.extend([Nuc::A, Nuc::G]);
assert_eq!(dna.to_string(), "CATGAG");

// For convenience, there's a helper to build const literals:
const CAT: &[Nuc] = &Nuc::lit(b"CAT");
assert!(dna.starts_with(CAT));

// `Seq` is a wrapper to add convenience features to `Vec`-like collections
use nucs::Seq;
// and `Dna` is actually just an alias for `Seq<Vec<Nuc>>`
let dna: Seq<Vec<Nuc>> = dna;

// ...but `Seq` can wrap any sufficiently `Vec`-like collection:
let mut dna = Seq(std::collections::VecDeque::from_iter(dna));
dna[3] = Nuc::T;
dna.push_front(Nuc::A);
// Displayed `Seq`s can be line-wrapped by using alternate formatting:
assert_eq!(format!("{dna:#4}"), "ACAT\nTAG");

// `DnaSlice` supplies helpers for working with slices:
use nucs::DnaSlice;
use Nuc::{A, C, G, T};
let slice = dna.make_contiguous();
assert_eq!(
    slice.reading_frames(),
    [
        &[[A, C, A], [T, T, A]],
        &[[C, A, T], [T, A, G]],
        &[[A, T, T]],
    ] as [&[_]; 3]
);
slice.revcomp(); // in-place reverse-complement
assert_eq!(dna.to_string(), "CTAATGT");

// `DnaIter` supplies helpers for working with DNA iterators non-destructively:
use nucs::DnaIter;

let iter = dna
    .iter()
    .trimmed_to_codon()
    .revcomped();
// (cloneable) DNA iterators can be displayed too:
let wrapped = format!("{:#3}", iter.display());
assert_eq!(wrapped, "CAT\nTAG");

// Ambiguous nucleotides represent non-empty sets of nucleotides.
use nucs::AmbiNuc;

// `Nuc`s can be composed into `AmbiNuc`s...
assert_eq!(C | A | T, AmbiNuc::H);
// ...which can be decomposed back into `Nuc`s
let dna = AmbiNuc::lit(b"STRAYGYMNAST");
assert!(dna[0].iter().eq([C, G]));
assert!(dna[1].iter().eq([T]));
assert!(dna[8].iter().eq(Nuc::ALL));

// Both concrete and ambiguous amino acids are supported as well:
use nucs::{Amino, AmbiAmino};

let peptide = Seq(Amino::lit(b"KITTY*PAWS"));
assert_eq!(format!("{peptide:#5}"), "KITTY\n*PAWS");

assert_eq!(Amino::I | Amino::L, AmbiAmino::J);
assert!((Amino::C | Amino::A | Amino::T).iter().eq(Amino::lit(b"ACT")));

// And it's easy to translate DNA into peptides:
use nucs::NCBI1;

let dna = Nuc::lit(b"TTTGAGCTCATAAACGAGA");
let peptide: Seq<Vec<_>> = dna.translate(NCBI1).collect();
assert_eq!(peptide.to_string(), "FELINE");

// Even ambiguous DNA can be translated:
let dna = AmbiNuc::lit(b"MTTGCGTCTCCCGAGCGC");
let peptide: Seq<Vec<_>> = dna.translate(NCBI1).collect();
assert_eq!(peptide.to_string(), "JASPER");
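
To make the slice helpers above less opaque, here is a minimal standard-library sketch of what reverse-complementing and splitting into reading frames amount to. It works on plain ASCII bytes rather than `Nuc`, and the function names `complement`, `revcomp` and `reading_frames` are hypothetical stand-ins for illustration. It is not the crate's implementation.

```rust
/// Complement a single nucleotide given as an ASCII byte.
/// Returns `None` for anything other than `A`, `C`, `G` or `T`.
fn complement(base: u8) -> Option<u8> {
    match base {
        b'A' => Some(b'T'),
        b'C' => Some(b'G'),
        b'G' => Some(b'C'),
        b'T' => Some(b'A'),
        _ => None,
    }
}

/// Reverse-complement a sequence: walk it back to front, complementing each base.
/// Fails with `None` if any byte is not a concrete nucleotide.
fn revcomp(dna: &[u8]) -> Option<Vec<u8>> {
    dna.iter().rev().map(|&base| complement(base)).collect()
}

/// Split a sequence into its three forward reading frames.
/// Frame `i` skips the first `i` bases; trailing bases that do not
/// fill a whole codon are dropped.
fn reading_frames(dna: &[u8]) -> [Vec<[u8; 3]>; 3] {
    [0, 1, 2].map(|offset| {
        dna.get(offset..)
            .unwrap_or_default() // sequences shorter than the offset have no codons
            .chunks_exact(3)
            .map(|codon| [codon[0], codon[1], codon[2]])
            .collect()
    })
}
```

With these definitions, `revcomp(b"ACATTAG")` yields the bytes of `CTAATGT`, matching the in-place `revcomp()` result shown in the example above.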

§Features

proptest: Enables proptest integration and utils, particularly Arbitrary generation of Nuc, AmbiNuc, Amino and AmbiAmino.
rand: Enables rand integration, particularly StandardUniform generation of Nuc, AmbiNuc, Amino and AmbiAmino.
serde: Enables serde integration for Seq<T>.
unsafe: (experimental) Enables casting between &[Nuc] and &[AmbiNuc].

Re-exports§

pub use iter::DnaIter;
pub use slice::DnaSlice;
pub use translation::NCBI1;

Modules§

error: Error types
iter: Iterator-related types
slice: Slice-related types
translation: Types related to translation of codons into amino acids.
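
As background for the translation module, the sketch below shows one common way to translate codons with the standard genetic code (NCBI translation table 1), using only the standard library. The names `STANDARD_CODE`, `base_index`, `translate_codon` and `translate` are hypothetical, and the lookup-string approach is an assumption for illustration, not a description of how this crate implements translation.

```rust
/// Amino acids of the standard genetic code (NCBI translation table 1),
/// with bases ordered T, C, A, G at each of the three codon positions.
const STANDARD_CODE: &[u8; 64] =
    b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

/// Position of a base in the T, C, A, G ordering used by the table.
fn base_index(base: u8) -> Option<usize> {
    match base {
        b'T' => Some(0),
        b'C' => Some(1),
        b'A' => Some(2),
        b'G' => Some(3),
        _ => None,
    }
}

/// Translate one codon into a one-letter amino acid code (`*` is a stop).
fn translate_codon(codon: [u8; 3]) -> Option<u8> {
    let index =
        16 * base_index(codon[0])? + 4 * base_index(codon[1])? + base_index(codon[2])?;
    Some(STANDARD_CODE[index])
}

/// Translate a sequence codon by codon, ignoring trailing bases
/// that do not form a complete codon.
fn translate(dna: &[u8]) -> Option<String> {
    dna.chunks_exact(3)
        .map(|c| translate_codon([c[0], c[1], c[2]]).map(char::from))
        .collect()
}
```

This sketch handles concrete bases only. Translating ambiguous input, as the crate does, additionally requires resolving each ambiguous codon to the set of amino acids it may encode.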

Structs§

AmbiAmino: Ambiguous amino acid
Seq: Provides DNA/peptide ergonomics for collections.
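
The wrapper idea behind `Seq` can be illustrated with a small newtype that forwards to the wrapped collection through `Deref` and line-wraps its `Display` output when alternate formatting is used with a width. The type `Wrapped` below is a hypothetical stand-in over `char` elements, written against the standard library only. It imitates the behavior shown in the crate example and is not the crate's actual definition.

```rust
use std::fmt;
use std::ops::{Deref, DerefMut};

/// Hypothetical sequence wrapper: a newtype over any collection.
struct Wrapped<T>(T);

// Forward method calls (`len`, `push_front`, indexing, ...) to the inner collection.
impl<T> Deref for Wrapped<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Wrapped<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

// Display any wrapped collection that can be iterated by reference over `char`s.
impl<T> fmt::Display for Wrapped<T>
where
    for<'a> &'a T: IntoIterator<Item = &'a char>,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Wrap lines only when both `#` and a width are given, e.g. `{:#4}`.
        let width = if f.alternate() { f.width() } else { None };
        for (i, symbol) in (&self.0).into_iter().enumerate() {
            if let Some(w) = width {
                if w > 0 && i > 0 && i % w == 0 {
                    f.write_str("\n")?;
                }
            }
            write!(f, "{symbol}")?;
        }
        Ok(())
    }
}
```

Because the `Display` bound is stated on `&T` rather than on a concrete type, the same wrapper works for `Vec<char>`, `VecDeque<char>` and other collections that iterate by reference.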

Enums§

AmbiNuc: Ambiguous nucleotide
Amino: Amino acid
Nuc: Concrete nucleotide
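
An ambiguous nucleotide is a non-empty set of concrete nucleotides, and one natural way to model that is a 4-bit mask combined with `|`. The sketch below uses the standard IUPAC nucleotide codes. The type `BaseSet` and its methods are hypothetical, and the bitmask representation is an assumption for illustration rather than a statement about how `AmbiNuc` is stored.

```rust
use std::ops::BitOr;

/// Hypothetical ambiguous nucleotide: a set of bases stored as a 4-bit mask.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BaseSet(u8);

impl BaseSet {
    const A: Self = Self(0b0001);
    const C: Self = Self(0b0010);
    const G: Self = Self(0b0100);
    const T: Self = Self(0b1000);

    /// Parse a one-letter IUPAC nucleotide code into the set it denotes.
    fn from_iupac(code: u8) -> Option<Self> {
        let (a, c, g, t) = (Self::A.0, Self::C.0, Self::G.0, Self::T.0);
        let mask = match code {
            b'A' => a,
            b'C' => c,
            b'G' => g,
            b'T' => t,
            b'R' => a | g,
            b'Y' => c | t,
            b'S' => c | g,
            b'W' => a | t,
            b'K' => g | t,
            b'M' => a | c,
            b'B' => c | g | t,
            b'D' => a | g | t,
            b'H' => a | c | t,
            b'V' => a | c | g,
            b'N' => a | c | g | t,
            _ => return None,
        };
        Some(Self(mask))
    }

    /// Decompose the set into its concrete bases, in alphabetical order.
    fn bases(self) -> Vec<char> {
        [(Self::A, 'A'), (Self::C, 'C'), (Self::G, 'G'), (Self::T, 'T')]
            .iter()
            .filter(|&&(bit, _)| self.0 & bit.0 != 0)
            .map(|&(_, name)| name)
            .collect()
    }
}

/// Union of two sets, so concrete bases compose into ambiguous ones.
impl BitOr for BaseSet {
    type Output = Self;
    fn bitor(self, rhs: Self) -> Self {
        Self(self.0 | rhs.0)
    }
}
```

In this model, `BaseSet::C | BaseSet::A | BaseSet::T` equals the set parsed from `H`, mirroring `C | A | T == AmbiNuc::H` in the crate example.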

Traits§

Nucleotide: A nucleotide; either Nuc or AmbiNuc.
Symbol: A sequence element; either Nuc, AmbiNuc, Amino or AmbiAmino.

Type Aliases§

AmbiDna: Common ambiguous nucleotide sequence type
AmbiPeptide: Common ambiguous amino acid sequence type
Dna: Common nucleotide sequence type
Peptide: Common amino acid sequence type
