strobemers 0.1.1

A toolkit for generating strobemers
# Strobemers

A Rust crate for generating strobemers. Strobemers are a type of fuzzy seed, originally designed for bioinformatics use cases, that still match well when sequences differ by substitutions and, especially, insertions or deletions. For more information, see the paper: [Kristoffer Sahlin, Effective sequence similarity detection with strobemers, Genome Res. November 2021 31: 2080-2094](https://genome.cshlp.org/content/31/11/2080).

This crate aims to provide a toolkit for reproducing existing strobemer implementations while allowing individual components (e.g. the hash function, window generator, or strobe selector) to be easily swapped out. The `StrobeHasher` implements the hash function used when generating strobemers, the `WindowGenerator` creates the windows in which strobes are selected, and the `StrobeSelector` selects the strobe within each window.
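To make the division of labour between the three components concrete, here is a minimal, self-contained sketch of an order-2 strobemer pipeline. The traits `Hasher`, `Windows`, and `Selector`, the toy hash, and the window rule `[i + w_min, i + w_max]` are simplified stand-ins assumed for illustration; they are not this crate's actual trait definitions or signatures.

```rust
// Hypothetical, simplified stand-ins for the three swappable components.
// These are NOT the crate's own traits; signatures are assumed for illustration.

/// Turns a sequence into one hash per k-mer start position.
trait Hasher {
    fn hash(&self, seq: &[u8], k: usize) -> Vec<u64>;
}

/// Returns the inclusive range of candidate positions for the second strobe.
trait Windows {
    fn window(&self, first: usize, n_kmers: usize) -> Option<(usize, usize)>;
}

/// Picks one position inside the window.
trait Selector {
    fn select(&self, hashes: &[u64], first: usize, window: (usize, usize)) -> usize;
}

struct ToyHasher;
impl Hasher for ToyHasher {
    fn hash(&self, seq: &[u8], k: usize) -> Vec<u64> {
        // Toy polynomial hash of each k-mer (k must be non-zero).
        seq.windows(k)
            .map(|kmer| {
                kmer.iter()
                    .fold(0u64, |h, &b| h.wrapping_mul(31).wrapping_add(b as u64))
            })
            .collect()
    }
}

struct FixedOffsets {
    w_min: usize,
    w_max: usize,
}
impl Windows for FixedOffsets {
    fn window(&self, first: usize, n_kmers: usize) -> Option<(usize, usize)> {
        let lo = first + self.w_min;
        if lo >= n_kmers {
            return None; // no room left for a second strobe
        }
        // Truncate the window at the last k-mer of the sequence.
        let hi = (first + self.w_max).min(n_kmers - 1);
        if hi < lo { None } else { Some((lo, hi)) }
    }
}

struct MinHash;
impl Selector for MinHash {
    fn select(&self, hashes: &[u64], _first: usize, (lo, hi): (usize, usize)) -> usize {
        // Smallest hash in the window; the window is non-empty by construction.
        (lo..=hi).min_by_key(|&j| hashes[j]).unwrap()
    }
}

/// Returns (first strobe start, second strobe start) pairs.
fn strobemers(
    seq: &[u8],
    k: usize,
    hasher: &dyn Hasher,
    windows: &dyn Windows,
    selector: &dyn Selector,
) -> Vec<(usize, usize)> {
    let hashes = hasher.hash(seq, k);
    (0..hashes.len())
        .filter_map(|i| {
            let win = windows.window(i, hashes.len())?;
            Some((i, selector.select(&hashes, i, win)))
        })
        .collect()
}
```

Because each stage sits behind its own trait, calling `strobemers` with a different hasher or selector changes which positions are picked without touching the other two stages, which is the kind of swap the builder in this crate is meant to allow.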

Currently, the only pre-made implementations are intended to generate strobemers identical to those produced by the [original C++ implementation](https://github.com/ksahlin/strobemers/tree/main/strobemers_cpp). The randstrobe implementation is `RandstrobeSahlin2021` and the minstrobe implementation is `MinstrobeSahlin2021`.
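The difference between the two strobemer types lies in how the second strobe is chosen inside its window. The sketch below contrasts the two rules on a slice of precomputed k-mer hashes. The function names are hypothetical, and the randstrobe link function `(h1 + h2) mod p` with a caller-supplied `p` is an assumption made for illustration; it is not guaranteed to match what `RandstrobeSahlin2021` or the C++ code computes.

```rust
/// Minstrobe rule (sketch): choose the position with the smallest hash
/// in the inclusive window [lo, hi], independent of the first strobe.
fn select_minstrobe(hashes: &[u64], lo: usize, hi: usize) -> usize {
    (lo..=hi)
        .min_by_key(|&j| hashes[j])
        .expect("window must be non-empty")
}

/// Randstrobe rule (sketch): choose the position whose hash, combined with
/// the first strobe's hash, is smallest. The combination (h1 + h2) mod p
/// is an assumed link function used only for illustration.
fn select_randstrobe(hashes: &[u64], first: usize, lo: usize, hi: usize, p: u64) -> usize {
    let h1 = hashes[first];
    (lo..=hi)
        .min_by_key(|&j| h1.wrapping_add(hashes[j]) % p)
        .expect("window must be non-empty")
}
```

With hashes `[10, 0, 0, 5, 2, 8]`, first strobe at position 0, window `3..=5`, and `p = 13`, the minstrobe rule picks position 4 (hash 2), while the randstrobe rule picks position 3, since `(10 + 5) % 13 = 2` is the smallest combined value. Changing the first strobe's hash can change the randstrobe choice, whereas the minstrobe choice depends only on the window.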


# Example using RandstrobeSahlin2021

```
use strobemers::StrobemerBuilder;
use strobemers::implementations::RandstrobeSahlin2021;
let reference = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
let n: usize = 2;
let k: usize = 15;
let w_min: usize = 16;
let w_max: usize = 30;

let randstrobe_iter = StrobemerBuilder::from_implementation(RandstrobeSahlin2021)
    .reference(reference)
    .n(n)
    .k(k)
    .w_min(w_min)
    .w_max(w_max) // assumed setter mirroring `w_min`; `w_max` is otherwise unused
    .build()
    .unwrap();
for strobe in randstrobe_iter {
    println!("randstrobe start positions: {:?}", strobe);
}
```


# Example starting with RandstrobeSahlin2021 and replacing the hash function

```
use strobemers::StrobemerBuilder;
use strobemers::implementations::{RandstrobeSahlin2021, StrobeHasher};
use wyhash::wyhash;

let reference = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
let n: usize = 2;
let k: usize = 15;
let w_min: usize = 16;
let w_max: usize = 30;

struct WyHasher;
impl StrobeHasher for WyHasher {
    fn hash(&self, input: &[u8], k: usize) -> Vec<u64> {
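        // One hash per k-mer start position. `windows(k)` yields
        // input.len() - k + 1 slices (none if the input is shorter than k),
        // so the final k-mer is included and the length cannot underflow.
        input
            .windows(k)
            .map(|kmer| wyhash(kmer, 42)) // 42 is an arbitrary seed
            .collect()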
    }
}

let randstrobe_iter = StrobemerBuilder::from_implementation(RandstrobeSahlin2021)
    .reference(reference)
    .n(n)
    .k(k)
    .w_min(w_min)
    .w_max(w_max) // assumed setter mirroring `w_min`; `w_max` is otherwise unused
    .hasher(Box::new(WyHasher))
    .build()
    .unwrap();
for strobe in randstrobe_iter {
    println!("randstrobe start positions: {:?}", strobe);
}
```