Crate sshash_lib

§sshash-lib

Core library for SSHash-rs: a compressed k-mer dictionary based on sparse and skew hashing.

§Quick Start

use sshash_lib::{Dictionary, Kmer, KmerBits};

type Kmer31 = Kmer<31>;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load a previously built index
    let dict = Dictionary::load("index")?;

    // Single k-mer lookup (returns position or INVALID_UINT64)
    let kmer = Kmer31::from_string("ACGTACGTACGTACGTACGTACGTACGTACG")?;
    let pos = dict.lookup::<31>(&kmer);

    // Streaming queries over a sequence
    let mut engine = dict.create_streaming_query::<31>();

    Ok(())
}

§Modules

| Module | Purpose |
|---|---|
| dictionary | Load, save, lookup, and query k-mers |
| builder | Index construction pipeline |
| streaming_query | Efficient sequential k-mer processing |
| kmer | Kmer<K> with const-generic sizing and bit-parallel ops |
| minimizer | Minimizer extraction and iteration |
| minimizers_control_map | MPHF-based minimizer→bucket mapping |
| spectrum_preserving_string_set | SPSS: unitig storage and position lookup |
| sparse_and_skew_index | Bucket dispatch (singleton / light / heavy) |
| offsets | Elias-Fano encoded string boundary offsets |
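
To make the minimizer row concrete, here is a minimal sketch of what "minimizer extraction" means for a single k-mer: among all substrings of length m, pick the smallest one. It is an illustration only. The function name `lexicographic_minimizer` is hypothetical and is not part of this crate, and the sketch assumes lexicographic order, whereas the crate's `hasher` module suggests that m-mers are ordered by a hash.

```rust
/// Hypothetical helper, not part of sshash_lib.
/// Returns the offset and text of the lexicographically smallest m-mer
/// inside `kmer`. On ties, the leftmost occurrence is returned.
fn lexicographic_minimizer(kmer: &str, m: usize) -> Option<(usize, &str)> {
    // Reject degenerate window sizes and non-ASCII input (byte slicing below
    // assumes one byte per nucleotide).
    if m == 0 || m > kmer.len() || !kmer.is_ascii() {
        return None;
    }
    (0..=kmer.len() - m)
        .map(|i| (i, &kmer[i..i + m]))
        .min_by_key(|&(_, mmer)| mmer)
}
```

Under these assumptions, `lexicographic_minimizer("ACGTAAC", 3)` selects the m-mer `AAC` at offset 4. A hash-based order would generally pick a different m-mer, but the windowing logic stays the same.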

§License

MIT

Re-exports§

pub use kmer::Kmer;
pub use kmer::Kmer21;
pub use kmer::Kmer31;
pub use kmer::Kmer63;
pub use kmer::KmerBits;
pub use minimizer::MinimizerInfo;
pub use minimizer::MinimizerIterator;
pub use minimizers_control_map::MinimizersControlMap;
pub use minimizers_control_map::MinimizersControlMapBuilder;
pub use minimizers_control_map::BucketType;
pub use streaming_query::LookupResult;
pub use streaming_query::StreamingQuery;
pub use dictionary::Dictionary;
pub use builder::BuildConfiguration;
pub use builder::CfSegData;
pub use builder::DictionaryBuilder;
pub use builder::parse_cf_seg;
pub use partitioned_mphf::PartitionedMphf;

Modules§

builder
Builder module for constructing SSHash dictionaries
constants
Constants and configuration for SSHash
dictionary
Dictionary - the main SSHash data structure
encoding
DNA nucleotide encoding
hasher
Deterministic hasher for minimizers using rapidhash.
kmer
K-mer representation with const generics and optimal storage
minimizer
Minimizer extraction and iteration
minimizers_control_map
Minimizers Control Map (MCM)
mphf_config
MPHF (Minimal Perfect Hash Function) type configuration
offsets
Compact encoding of offsets into a bit-packed string set
partitioned_mphf
Partitioned Minimal Perfect Hash Function
serialization
Serialization and deserialization support for Dictionary
sparse_and_skew_index
Sparse and Skew Index for k-mer lookup
spectrum_preserving_string_set
Spectrum-Preserving String Set (SPSS)
streaming_query
Streaming query for efficient k-mer lookups
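
As background for the `encoding` and `kmer` modules, the following sketch shows one common way to pack nucleotides into two bits each. The names `encode_base` and `pack_kmer` are hypothetical, and the mapping A=0, C=1, G=2, T=3 is an assumption chosen for illustration; the crate's actual encoding and storage layout may differ.

```rust
/// Hypothetical helper, not part of sshash_lib.
/// Maps one nucleotide to a 2-bit code (assumed mapping: A=0, C=1, G=2, T=3).
fn encode_base(b: u8) -> Option<u64> {
    match b {
        b'A' | b'a' => Some(0),
        b'C' | b'c' => Some(1),
        b'G' | b'g' => Some(2),
        b'T' | b't' => Some(3),
        _ => None, // anything else (e.g. 'N') is rejected
    }
}

/// Hypothetical helper, not part of sshash_lib.
/// Packs up to 32 nucleotides into a u64, first base in the lowest two bits.
fn pack_kmer(seq: &str) -> Option<u64> {
    if seq.len() > 32 {
        return None; // 32 bases * 2 bits = 64 bits is the most a u64 can hold
    }
    let mut packed = 0u64;
    for (i, b) in seq.bytes().enumerate() {
        packed |= encode_base(b)? << (2 * i);
    }
    Some(packed)
}
```

With this layout a k-mer of length K needs 2K bits, which is why a 31-mer fits in a single 64-bit word while longer k-mers need a wider integer.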

Macros§

dispatch_on_k
Dispatch to the correct const generic K based on a runtime k value.
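
The general technique behind this kind of macro can be sketched as a `match` that maps a runtime value onto a fixed set of const-generic instantiations. The macro below is a hypothetical stand-in named `dispatch_k_sketch`; it is not the crate's `dispatch_on_k`, whose syntax and supported k values are not shown on this page. `kmer_bits` is likewise a placeholder for any function generic over `K`.

```rust
/// Placeholder for a function that is generic over the k-mer length.
fn kmer_bits<const K: usize>() -> usize {
    2 * K // two bits per nucleotide
}

/// Hypothetical dispatch macro: expands to a `match` with one arm per
/// listed k, each arm calling the const-generic function with that k.
macro_rules! dispatch_k_sketch {
    ($k:expr, $func:ident, [$($n:literal),+ $(,)?]) => {
        match $k {
            $( $n => Some($func::<{ $n }>()), )+
            _ => None, // k was not compiled in
        }
    };
}

/// Resolves a runtime `k` to one of the compiled instantiations.
fn bits_for_runtime_k(k: usize) -> Option<usize> {
    dispatch_k_sketch!(k, kmer_bits, [21, 31, 63])
}
```

Each listed k produces a separately monomorphized copy of the callee, so the list trades binary size for the ability to choose k at runtime. The values 21, 31, and 63 here simply mirror the `Kmer21`, `Kmer31`, and `Kmer63` re-exports.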

Functions§

version
Version information