# Strobemers
A Rust crate for generating strobemers. Strobemers are a type of fuzzy seed, originally designed for bioinformatics use cases, that tolerate substitutions and, in particular, insertions and deletions. For more information, see the paper: [Kristoffer Sahlin, Effective sequence similarity detection with strobemers, Genome Res. November 2021 31: 2080-2094](https://genome.cshlp.org/content/31/11/2080).
This crate aims to provide a toolkit for reproducing existing strobemer implementations while allowing individual components (e.g. the hash function, window generator, or strobe selector) to be swapped out easily. `StrobeHasher` implements the hash function used when generating strobemers, `WindowGenerator` creates the windows in which strobes are selected, and `StrobeSelector` selects the strobe within each window.
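To make the division of labour between the three components concrete, here is a minimal, standard-library-only sketch of an order-2 randstrobe pipeline. It does not use this crate's traits: the function names (`hash_kmers`, `window`, `select`, `randstrobes_order2`), the use of `DefaultHasher`, the modulus `1_000_003`, and the convention that window offsets are measured from the start of the first strobe are all assumptions made for illustration, and the output will not match `RandstrobeSahlin2021`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::ops::Range;

/// Stage 1 (hasher role): one hash per k-mer start position.
fn hash_kmers(seq: &[u8], k: usize) -> Vec<u64> {
    if k == 0 || k > seq.len() {
        return Vec::new(); // no complete k-mer fits
    }
    seq.windows(k)
        .map(|kmer| {
            let mut h = DefaultHasher::new();
            kmer.hash(&mut h);
            h.finish()
        })
        .collect()
}

/// Stage 2 (window generator role): candidate start positions for the
/// second strobe, clipped to positions that actually have a hash.
fn window(first: usize, w_min: usize, w_max: usize, n_hashes: usize) -> Range<usize> {
    let start = (first + w_min).min(n_hashes);
    let end = (first + w_max + 1).min(n_hashes);
    start..end
}

/// Stage 3 (selector role): pick the candidate that minimises a combined
/// hash. The combining function is a placeholder, not the crate's.
fn select(hashes: &[u64], first: usize, candidates: Range<usize>) -> Option<usize> {
    candidates.min_by_key(|&j| hashes[first].wrapping_add(hashes[j]) % 1_000_003)
}

/// Order-2 strobemers as (first strobe start, second strobe start).
/// Positions whose window is empty produce no strobemer.
fn randstrobes_order2(seq: &[u8], k: usize, w_min: usize, w_max: usize) -> Vec<(usize, usize)> {
    let hashes = hash_kmers(seq, k);
    (0..hashes.len())
        .filter_map(|i| {
            let candidates = window(i, w_min, w_max, hashes.len());
            select(&hashes, i, candidates).map(|j| (i, j))
        })
        .collect()
}

fn main() {
    let seq = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
    for (i, j) in randstrobes_order2(seq, 5, 6, 10) {
        println!("strobes start at {} and {}", i, j);
    }
}
```

Because each stage is a separate function with a narrow interface, replacing one of them (for example, a different hash in `hash_kmers`) leaves the other two untouched, which is the property the crate's swappable components are meant to provide.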
Currently, the only pre-made implementations are intended to generate strobemers identical to those of the original C++ implementation [here](https://github.com/ksahlin/strobemers/tree/main/strobemers_cpp). The randstrobe implementation is `RandstrobeSahlin2021` and the minstrobe implementation is `MinstrobeSahlin2021`.
# Example using RandstrobeSahlin2021
```rust
use strobemers::StrobemerBuilder;
use strobemers::implementations::RandstrobeSahlin2021;
let reference = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
let n: usize = 2;
let k: usize = 15;
let w_min: usize = 16;
let w_max: usize = 30;
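// Configure the builder from the pre-made implementation, then build the iterator.
let randstrobe_iter = StrobemerBuilder::from_implementation(RandstrobeSahlin2021)
    .reference(reference)
    .n(n)
    .k(k)
    .w_min(w_min)
    .w_max(w_max) // upper window bound; setter assumed to mirror `w_min`
    .build()
    .unwrap();

for strobe in randstrobe_iter {
    println!("randstrobe start positions: {:?}", strobe);
}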
```
# Example starting with RandstrobeSahlin2021 and replacing the hash function
```rust
use strobemers::StrobemerBuilder;
use strobemers::implementations::{RandstrobeSahlin2021, StrobeHasher};
use wyhash::wyhash;
let reference = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
let n: usize = 2;
let k: usize = 15;
let w_min: usize = 16;
let w_max: usize = 30;
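// Custom hasher: one wyhash value per k-mer, with a fixed seed of 42.
struct WyHasher;

impl StrobeHasher for WyHasher {
    fn hash(&self, input: &[u8], k: usize) -> Vec<u64> {
        // A sequence of length L contains L - k + 1 k-mers; none if k is 0 or exceeds L.
        if k == 0 || k > input.len() {
            return Vec::new();
        }
        input.windows(k).map(|kmer| wyhash(kmer, 42)).collect()
    }
}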
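// Start from the pre-made implementation and override only the hasher.
let randstrobe_iter = StrobemerBuilder::from_implementation(RandstrobeSahlin2021)
    .reference(reference)
    .n(n)
    .k(k)
    .w_min(w_min)
    .w_max(w_max) // upper window bound; setter assumed to mirror `w_min`
    .hasher(Box::new(WyHasher))
    .build()
    .unwrap();

for strobe in randstrobe_iter {
    println!("randstrobe start positions: {:?}", strobe);
}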
```
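The hasher is handed to the builder as a boxed value, which suggests it is used through a trait object. The sketch below shows that pattern without any external dependency. `KmerHasher` is a local, hypothetical stand-in that only mirrors the `hash` signature shown above (it is not the crate's `StrobeHasher`), and `Fnv1aHasher` is an illustrative 64-bit FNV-1a implementation chosen because it needs nothing beyond the standard library.

```rust
/// Local stand-in mirroring the `hash` signature shown above (hypothetical,
/// not the crate's own trait).
trait KmerHasher {
    fn hash(&self, input: &[u8], k: usize) -> Vec<u64>;
}

/// 64-bit FNV-1a computed over each k-mer.
struct Fnv1aHasher;

impl KmerHasher for Fnv1aHasher {
    fn hash(&self, input: &[u8], k: usize) -> Vec<u64> {
        if k == 0 || k > input.len() {
            return Vec::new(); // no complete k-mer fits
        }
        input
            .windows(k)
            .map(|kmer| {
                // Start from the FNV offset basis; XOR each byte, then
                // multiply by the FNV prime.
                kmer.iter().fold(0xcbf2_9ce4_8422_2325_u64, |h, &b| {
                    (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
                })
            })
            .collect()
    }
}

fn main() {
    // A boxed trait object, analogous to `.hasher(Box::new(WyHasher))`.
    let hasher: Box<dyn KmerHasher> = Box::new(Fnv1aHasher);
    let hashes = hasher.hash(b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG", 15);
    println!("{} k-mer hashes", hashes.len());
}
```

A hasher of this shape returns one value per k-mer start position, so a sequence of length L yields L - k + 1 hashes, and identical k-mers always receive identical hashes.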