Builder module for constructing SSHash dictionaries
This module implements the multi-step build pipeline:
- Parse and encode input strings (FASTA/FASTQ)
- Compute minimizer tuples for each k-mer
- Merge and sort minimizer tuples
- Build minimizers control map (MPHF)
- Hash minimizers with MPHF IDs
- Build sparse and skew index
- Finalize dictionary structure
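The early pipeline steps (encoding, minimizer tuple computation, sorting) can be illustrated with a small, self-contained sketch. It is not this module's implementation: `SketchTuple`, `encode_base`, `pack`, `minimizer_of`, and `minimizer_tuples` are hypothetical names introduced here, the minimizer is assumed to be the smallest 2-bit-packed m-mer by plain integer value (no hashing, no canonical k-mers), and runs are merged by minimizer value only.

```rust
/// Hypothetical tuple type for illustration; not the crate's `MinimizerTuple`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct SketchTuple {
    minimizer: u64,   // 2-bit packed m-mer (assumed ordering: plain integer value)
    pos: usize,       // start position of the first k-mer in the run
    num_kmers: usize, // consecutive k-mers sharing this minimizer value
}

/// Encode one nucleotide into 2 bits; `None` for anything outside ACGT.
fn encode_base(b: u8) -> Option<u64> {
    match b.to_ascii_uppercase() {
        b'A' => Some(0),
        b'C' => Some(1),
        b'G' => Some(2),
        b'T' => Some(3),
        _ => None,
    }
}

/// Pack a short sequence (at most 32 bases) into a single u64.
fn pack(seq: &[u8]) -> Option<u64> {
    seq.iter()
        .try_fold(0u64, |acc, &b| Some((acc << 2) | encode_base(b)?))
}

/// Smallest packed m-mer inside one k-mer.
fn minimizer_of(kmer: &[u8], m: usize) -> Option<u64> {
    kmer.windows(m)
        .map(pack)
        .collect::<Option<Vec<u64>>>()?
        .into_iter()
        .min()
}

/// Compute one tuple per run of consecutive k-mers with the same minimizer
/// value, then sort the tuples so equal minimizers become adjacent.
fn minimizer_tuples(seq: &[u8], k: usize, m: usize) -> Option<Vec<SketchTuple>> {
    assert!(m >= 1 && m <= k && m <= 32, "need 1 <= m <= k and m <= 32");
    let mut tuples: Vec<SketchTuple> = Vec::new();
    for (pos, kmer) in seq.windows(k).enumerate() {
        let minimizer = minimizer_of(kmer, m)?;
        match tuples.last_mut() {
            // Extend the current run when the minimizer value is unchanged.
            Some(last) if last.minimizer == minimizer => last.num_kmers += 1,
            // Otherwise start a new run at this k-mer.
            _ => tuples.push(SketchTuple { minimizer, pos, num_kmers: 1 }),
        }
    }
    tuples.sort();
    Some(tuples)
}
```

Calling `minimizer_tuples(b"ACGTACGT", 4, 2)` groups the five 4-mers into three runs. After sorting, tuples with equal minimizers sit next to each other, which is the property the later bucket-building steps rely on.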
Re-exports
pub use cf_seg::CfSegData;
pub use cf_seg::parse_cf_seg;
pub use config::BuildConfiguration;
pub use minimizer_tuples::MinimizerTuple;
pub use dictionary_builder::DictionaryBuilder;
pub use dictionary_builder::BucketMetadata;
pub use crate::kmer::Kmer;
Modules
- buckets
- Bucket classification and statistics
- cf_seg
- Parser for cuttlefish .cf_seg segment files
- config
- Build configuration for SSHash dictionary construction
- dictionary_builder
- Dictionary builder orchestration
- encode
- String encoding module for building the Spectrum-Preserving String Set
- external_sort
- External sorting for minimizer tuples
- minimizer_tuples
- Minimizer tuple computation for the build pipeline
- parse
- FASTA/FASTQ parsing with automatic decompression
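The external_sort entry above refers to external sorting, a general technique that the following minimal sketch illustrates: sort bounded-size chunks independently, then k-way merge the sorted runs. It does not reflect that module's actual API. `sorted_runs` and `merge_runs` are hypothetical names, plain `u64` values stand in for minimizer tuples, and the runs are kept in memory where a real external sort would spill them to temporary files.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Sort each fixed-size chunk on its own. In a real external sort, each
/// sorted run would be written to disk instead of kept in a Vec.
fn sorted_runs(items: &[u64], chunk_len: usize) -> Vec<Vec<u64>> {
    assert!(chunk_len > 0, "chunk_len must be positive");
    items
        .chunks(chunk_len)
        .map(|chunk| {
            let mut run = chunk.to_vec();
            run.sort_unstable();
            run
        })
        .collect()
}

/// k-way merge of sorted runs using a min-heap keyed on the next value
/// of every run. Only one element per run is held in the heap at a time.
fn merge_runs(runs: &[Vec<u64>]) -> Vec<u64> {
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            // `Reverse` turns the max-heap into a min-heap.
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }
    let mut out = Vec::with_capacity(runs.iter().map(Vec::len).sum());
    while let Some(Reverse((value, run_idx, offset))) = heap.pop() {
        out.push(value);
        // Refill the heap from the run that just yielded a value.
        if let Some(&next) = runs[run_idx].get(offset + 1) {
            heap.push(Reverse((next, run_idx, offset + 1)));
        }
    }
    out
}
```

With `chunk_len` chosen to fit the memory budget, peak memory during the merge depends on the number of runs rather than on the total number of items, which is the reason to sort externally when the tuple list is larger than RAM.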