Module builder


Builder module for constructing SSHash dictionaries

This module implements the multi-step build pipeline:

  1. Parse and encode input strings (FASTA/FASTQ)
  2. Compute minimizer tuples for each k-mer
  3. Merge and sort minimizer tuples
  4. Build the minimizers control map (MPHF, a minimal perfect hash function)
  5. Hash minimizers with MPHF IDs
  6. Build the sparse and skew indexes
  7. Finalize dictionary structure
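
Steps 2 and 3 can be illustrated with a minimal, self-contained sketch. It is not this crate's implementation: `SketchTuple`, `encode_base`, `pack`, `minimizer_tuples`, and `sorted_tuples` are hypothetical names introduced here, and the minimizer of a k-mer is assumed to be its smallest m-mer under the plain 2-bit packed value (A=0, C=1, G=2, T=3), which stands in for whatever ordering the real builder uses. The actual `MinimizerTuple` type may carry different fields.

```rust
/// Hypothetical 2-bit encoding (assumption: A=0, C=1, G=2, T=3).
/// Returns None for any byte that is not a nucleotide.
fn encode_base(b: u8) -> Option<u64> {
    match b {
        b'A' | b'a' => Some(0),
        b'C' | b'c' => Some(1),
        b'G' | b'g' => Some(2),
        b'T' | b't' => Some(3),
        _ => None,
    }
}

/// Pack an m-mer (m <= 32) into a u64, two bits per base.
fn pack(mmer: &[u8]) -> Option<u64> {
    mmer.iter()
        .try_fold(0u64, |acc, &b| encode_base(b).map(|v| (acc << 2) | v))
}

/// Illustrative tuple: the minimizer value and the start position of the
/// k-mer it was computed from. Deriving Ord sorts by minimizer, then position.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SketchTuple {
    minimizer: u64,
    kmer_pos: usize,
}

/// Step 2 (sketch): one tuple per k-mer, skipping k-mers that contain
/// a non-ACGT byte. This is the naive O(n * k) formulation.
fn minimizer_tuples(seq: &[u8], k: usize, m: usize) -> Vec<SketchTuple> {
    assert!(m >= 1 && m <= k && m <= 32, "require 1 <= m <= k and m <= 32");
    let mut out = Vec::new();
    if seq.len() < k {
        return out;
    }
    for start in 0..=(seq.len() - k) {
        let kmer = &seq[start..start + k];
        // Smallest packed m-mer inside this k-mer, or None if any m-mer is invalid.
        let min = kmer
            .windows(m)
            .map(pack)
            .collect::<Option<Vec<u64>>>()
            .and_then(|values| values.into_iter().min());
        if let Some(minimizer) = min {
            out.push(SketchTuple { minimizer, kmer_pos: start });
        }
    }
    out
}

/// Step 3 (sketch): sort in memory so tuples sharing a minimizer are adjacent.
fn sorted_tuples(seq: &[u8], k: usize, m: usize) -> Vec<SketchTuple> {
    let mut tuples = minimizer_tuples(seq, k, m);
    tuples.sort_unstable();
    tuples
}
```

For the string `ACGTAC` with k = 4 and m = 2, this sketch yields minimizer values 1, 6, 1 at k-mer positions 0, 1, 2, which sort to (1, 0), (1, 2), (6, 1). Sorting places all k-mers with the same minimizer next to each other, which is what lets later steps group them into buckets; for inputs that do not fit in memory, the in-memory sort would be replaced by an external sort.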

Re-exports

pub use cf_seg::CfSegData;
pub use cf_seg::parse_cf_seg;
pub use config::BuildConfiguration;
pub use minimizer_tuples::MinimizerTuple;
pub use dictionary_builder::DictionaryBuilder;
pub use dictionary_builder::BucketMetadata;
pub use crate::kmer::Kmer;

Modules

buckets
Bucket classification and statistics
cf_seg
Parser for cuttlefish .cf_seg segment files
config
Build configuration for SSHash dictionary construction
dictionary_builder
Dictionary builder orchestration
encode
String encoding module for building the Spectrum-Preserving String Set
external_sort
External sorting for minimizer tuples
minimizer_tuples
Minimizer tuple computation for the build pipeline
parse
FASTA/FASTQ parsing with automatic decompression