Module bio::data_structures[][src]

Various useful data structures.

Modules

annot_map

Efficient container for locations annotated across a set of named reference sequences.

bit_tree

BIT-tree (Binary Indexed Trees, aka Fenwick Tree) maintains a prefix-sum or prefix-max that can be efficiently queried and updated. From: Peter M. Fenwick (1994). “A new data structure for cumulative frequency tables”. Software: Practice and Experience. 24 (3): 327–336. Implementation outlined here: https://www.topcoder.com/community/data-science/data-science-tutorials/binary-indexed-trees/

bitenc

A fixed-width bit encoding implementation. This allows to store a sequence of values over a reduced alphabet by packing them bit-encoded into a sequence of bytes.

bwt

The Burrows-Wheeler-Transform and related data structures. The implementation is based on the lecture notes “Algorithmen auf Sequenzen”, Kopczynski, Marschall, Martin and Rahmann, 2008 - 2015.

fmindex

The Full-text index in Minute space index (FM-index) and the FMD-Index for finding suffix array intervals matching a given pattern in linear time.

interpolation_table

Fast lookup table for arbitrary floating point functions.

interval_tree
qgram_index

A classical, flexible, q-gram index implementation.

rank_select

Rank/Select data structure based on Gonzalez, Grabowski, Mäkinen, Navarro (2005). This implementation uses only a single level of blocks, and performs well for large n.

smallints

A data structure for a sequence of small integers with a few big integers. Small ints are stored in type S (e.g. a byte), big ints are stored separately (in type B) in a BTree. The implementation provides vector-like operations on the data structure (e.g. retrieve a position, add an integer, etc.).

suffix_array

Suffix arrays and related algorithms. The implementation is based on the lecture notes “Algorithmen auf Sequenzen”, Kopczynski, Marschall, Martin and Rahmann, 2008 - 2015. The original algorithm desciption can be found in: Ge Nong, Sen Zhang, Wai Hong Chan: Two Efficient Algorithms for Linear Time Suffix Array Construction. IEEE Trans. Computers 60(10): 1471–1484 (2011)

wavelet_matrix

Wavelet Matrix data structure for DNA alphabet. The implementation is based on the paper Claude Francisco and Gonzalo Navarro. The wavelet matrix. SPIRE (2012)