Module rustyms::align


Only available with the feature align. Code to align two peptides based on mass mistakes and genetic information.

A mass-based alignment handles the case in which multiple amino acids are wrong, but the total mass of this set of amino acids equals the mass of a different set of amino acids on the other peptide. Such mass coincidences are a common source of mistakes in mass spectrometry. For example, N has the same mass as GG, so a mass-spectrometry-faithful alignment of ANA with AGGA should reflect this fact:

Identity: 0.500 (2/4), Similarity: 0.750 (3/4), Gaps: 0.000 (0/4), Score: 0.706 (12/17),
Equal mass, Tolerance: 10 ppm, Alignment: global
Start: A 0 B 0, Path: 1=1:2i1=

AN·A A
AGGA B
 ╶╴

Generated using this algorithm bound to a CLI tool: https://github.com/snijderlab/align-cli

use rustyms::{*, align::*};
let a = LinearPeptide::pro_forma("ANA").unwrap();
let b = LinearPeptide::pro_forma("AGGA").unwrap();
let alignment = align::<4>(&a, &b, &matrix::BLOSUM62,
                   Tolerance::new_ppm(10.0), AlignType::GLOBAL);
assert_eq!(alignment.short(), "1=1:2i1=");
assert_eq!(alignment.ppm(), 0.0);
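
The mass coincidence behind the N/GG example can be checked by hand. The following is a minimal standalone sketch that does not use rustyms: the constants are monoisotopic residue masses rounded to five decimals, and the helper names `ppm_difference` and `equal_mass`, as well as the choice to measure ppm relative to the first mass, are assumptions made for illustration and may differ from how the crate defines its tolerance.

```rust
// Monoisotopic residue masses in dalton, rounded to five decimals.
const GLYCINE: f64 = 57.02146;
const ASPARAGINE: f64 = 114.04293;
const LYSINE: f64 = 128.09496;
const GLUTAMINE: f64 = 128.05858;

/// Relative difference between two masses in parts per million,
/// measured against the first mass (an assumed convention for this sketch).
fn ppm_difference(a: f64, b: f64) -> f64 {
    (a - b).abs() / a * 1e6
}

/// Whether two sets of residue masses sum to the same total mass
/// within `tolerance_ppm` (hypothetical helper, not part of rustyms).
fn equal_mass(left: &[f64], right: &[f64], tolerance_ppm: f64) -> bool {
    let left_total: f64 = left.iter().sum();
    let right_total: f64 = right.iter().sum();
    ppm_difference(left_total, right_total) <= tolerance_ppm
}

fn main() {
    // N and GG differ only by rounding of the constants, far below 10 ppm.
    println!("N vs GG: {}", equal_mass(&[ASPARAGINE], &[GLYCINE, GLYCINE], 10.0)); // prints "N vs GG: true"
    // K and Q share a nominal mass but differ by roughly 284 ppm.
    println!("K vs Q: {}", equal_mass(&[LYSINE], &[GLUTAMINE], 10.0)); // prints "K vs Q: false"
}
```

N and GG have the same elemental composition (C4H6N2O2), so their exact masses are identical, which matches the `ppm()` value of 0.0 in the example above. K and Q only coincide at nominal mass, so a 10 ppm tolerance keeps them apart.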

Modules§

Structs§

  • The type of alignment to perform
  • An alignment of two reads, which owns the sequences.
  • A piece in an alignment, determining what step was taken in the alignment and how this impacted the score
  • An alignment of two reads, which holds a reference to the sequences.
  • The score of an alignment
  • Statistics for an alignment with some helper functions to easily retrieve the number of interest.

Enums§

  • The type of a single match step
  • The alignment specification for a single side

Traits§

  • A generalised alignment with all behaviour.

Functions§

  • Create an alignment of two peptides based on mass and homology. The substitution matrix is in the exact same order as the definition of AminoAcid. The Tolerance sets the tolerance for two sets of amino acids to be regarded as the same mass. The AlignType controls the alignment behaviour, global/local or anything in between.
  • Only available if the features align and imgt are turned on. Align one sequence to multiple consecutive genes. Each gene can be set to be global to the left, or free to allow unmatched residues between it and the previous gene. If the sequence is too short to cover all genes, only the genes that could be matched are returned.
  • Only available if the features align, rayon, and imgt are turned on. Align one sequence to multiple consecutive genes. Each gene can be set to be global to the left, or free to allow unmatched residues between it and the previous gene. If the sequence is too short to cover all genes, only the genes that could be matched are returned.