Module bio::data_structures
source · Expand description
Various useful data structures.
Modules
Efficient container for locations annotated across a set of named
reference sequences.
BIT-tree (Binary Indexed Trees, aka Fenwick Tree) maintains a prefix-sum or
prefix-max that can be efficiently queried and updated. From: Peter M. Fenwick (1994). “A new data structure for cumulative frequency tables”. Software: Practice and Experience. 24 (3): 327–336.
Implementation outlined here: https://www.topcoder.com/community/data-science/data-science-tutorials/binary-indexed-trees/
A fixed-width bit encoding implementation. This allows to store a sequence of values over
a reduced alphabet by packing them bit-encoded into a sequence of bytes.
The Burrows-Wheeler-Transform and related data structures.
The implementation is based on the lecture notes
“Algorithmen auf Sequenzen”, Kopczynski, Marschall, Martin and Rahmann, 2008 - 2015.
FM-Index and FMD-Index for finding suffix array intervals matching a given pattern in linear time.
Interval tree, a data structure for efficiently storing and searching intervals.
A classical, flexible, q-gram index implementation.
Rank/Select data structure based on Gonzalez, Grabowski, Mäkinen, Navarro (2005).
This implementation uses only a single level of blocks, and performs well for large n.
A data structure for a sequence of small integers with a few big integers.
Small ints are stored in type S (e.g. a byte), big ints are stored separately (in type B) in a BTree.
The implementation provides vector-like operations on the data structure (e.g. retrieve a position,
add an integer, etc.).
Suffix arrays and related algorithms.
The implementation is based on the lecture notes
“Algorithmen auf Sequenzen”, Kopczynski, Marschall, Martin and Rahmann, 2008 - 2015.