Crate gbwt

§GBWT: Graph BWT

This is a Rust reimplementation of parts of the GBWT and the GBWTGraph. It is based on the Simple-SDS library.

§References

§GBWT

Jouni Sirén, Erik Garrison, Adam M. Novak, Benedict Paten, and Richard Durbin: Haplotype-aware graph indexes.
Bioinformatics 36(2):400-407, 2020. DOI: 10.1093/bioinformatics/btz575

§GBWTGraph

Jouni Sirén, Jean Monlong, Xian Chang, Adam M. Novak, Jordan M. Eizenga, Charles Markello, Jonas A. Sibbesen, Glenn Hickey, Pi-Chuan Chang, Andrew Carroll, Namrata Gupta, Stacey Gabriel, Thomas W. Blackwell, Aakrosh Ratan, Kent D. Taylor, Stephen S. Rich, Jerome I. Rotter, David Haussler, Erik Garrison, and Benedict Paten:
Pangenomics enables genotyping of known structural variants in 5202 diverse genomes.
Science 374(6574):abg8871, 2021. DOI: 10.1126/science.abg8871

§GBZ

Jouni Sirén and Benedict Paten: GBZ file format for pangenome graphs.
Bioinformatics 38(22):5012-5018, 2022. DOI: 10.1093/bioinformatics/btac656

§Notes

  • See Simple-SDS for assumptions on the environment.
  • This implementation supports the Simple-SDS file formats for GBWT and GBZ.
  • GBWT / GBZ files written by this library can be identified by the source tag, which has the value jltsiren/gbwt-rs.
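
The source tag check from the last note can be sketched with the standard library alone. This is a minimal illustration, not the crate's API: the tag map is modeled as a plain HashMap, the function written_by_gbwt_rs is hypothetical, and the key string "source" is an assumption (the crate exposes the actual key and value as SOURCE_KEY and SOURCE_VALUE).

```rust
use std::collections::HashMap;

// Assumption: the source tag is stored under the key "source".
// The crate exposes the actual key as the constant SOURCE_KEY.
const SOURCE_KEY: &str = "source";

// Value given in the notes above; the crate exposes it as SOURCE_VALUE.
const SOURCE_VALUE: &str = "jltsiren/gbwt-rs";

/// Hypothetical helper: returns true if the tag map carries the source
/// tag value written by this library. A missing tag counts as "no".
fn written_by_gbwt_rs(tags: &HashMap<String, String>) -> bool {
    tags.get(SOURCE_KEY).map(String::as_str) == Some(SOURCE_VALUE)
}
```

Code that reads files from several GBWT implementations could use a check of this kind to decide whether implementation-specific behavior applies. With the real crate, the tags would come from the loaded structure rather than from a hand-built map.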

Re-exports§

pub use crate::bwt::Pos;
pub use crate::gbwt::GBWT;
pub use crate::gbwt::SearchState;
pub use crate::gbwt::BidirectionalState;
pub use crate::gbwt::Metadata;
pub use crate::gbwt::PathName;
pub use crate::gbwt::FullPathName;
pub use crate::gbz::GBZ;
pub use crate::graph::Segment;
pub use crate::support::GraphPosition;
pub use crate::support::Orientation;

Modules§

algorithms
Algorithms using GBWT and GBZ.
bwt
The BWT stored as an array of compressed node records.
gbwt
GBWT: A run-length encoded FM-index storing paths as sequences of node identifiers.
gbz
GBZ: Space-efficient representation for a subset of GFA.
graph
GBWTGraph: Node sequences and node-to-segment translation.
headers
File format headers.
support
Support structures for GBWT and GBZ.

Constants§

ENDMARKER
Node identifier 0 is used for technical purposes and does not exist in the graph.
REFERENCE_SAMPLES_KEY
Key for the tag listing the names of reference samples.
REF_SAMPLE
Sample name for generic named paths.
SOURCE_KEY
Key of the source tag.
SOURCE_VALUE
Value of the source tag.
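
Because ENDMARKER (node identifier 0) never refers to a real node, it can serve as a terminator between stored paths. The sketch below shows that idea in isolation. It assumes the constant is a usize with value 0, and split_paths is a hypothetical helper that is not part of the crate.

```rust
// Assumption: mirrors the documented meaning of the crate's ENDMARKER
// constant (node identifier 0, which does not exist in the graph).
const ENDMARKER: usize = 0;

/// Hypothetical helper: splits a concatenation of endmarker-terminated
/// paths into individual paths of node identifiers.
/// Empty paths are dropped in this simplified version.
fn split_paths(concatenated: &[usize]) -> Vec<Vec<usize>> {
    concatenated
        .split(|&node| node == ENDMARKER)
        .filter(|path| !path.is_empty())
        .map(|path| path.to_vec())
        .collect()
}
```

For example, the input [2, 4, 6, 0, 3, 5, 0] contains two paths, [2, 4, 6] and [3, 5]. Reserving identifier 0 is what makes this split unambiguous.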