Genomic data tokenizers and pre-processors to prepare interval data for machine learning pipelines.
The tokenizers module is the most comprehensive module in gtars. It houses all tokenizers that implement
tokenization of genomic data into a known vocabulary. This is especially useful for genomic machine
learning models built on NLP architectures such as transformers.
Example
```rust
use std::path::Path;
use gtars::tokenizers::Tokenizer;
use gtars::common::models::Region; // import paths reconstructed from the crate layout

let path = Path::new("path/to/universe.bed"); // placeholder path to a vocabulary BED file
let tokenizer = Tokenizer::from_bed(path).unwrap();
let regions = vec![Region { chr: "chr1".to_string(), start: 100, end: 200 }];
let tokens = tokenizer.tokenize(&regions);
```
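The core idea behind tokenizing intervals into a known vocabulary can be sketched without gtars at all: each region in a fixed "universe" BED file gets a vocabulary ID, and a query region is tokenized by finding the universe regions it overlaps. The sketch below is illustrative only; the struct, function names, and the unknown-token convention are assumptions, not the gtars API.

```rust
// Illustrative sketch (not the gtars API): map query intervals to
// vocabulary IDs by overlap against a fixed universe of regions.
#[derive(Debug, Clone, PartialEq)]
struct Region {
    chr: String,
    start: u32,
    end: u32,
}

// Half-open interval overlap on the same chromosome.
fn overlaps(a: &Region, b: &Region) -> bool {
    a.chr == b.chr && a.start < b.end && b.start < a.end
}

// Return the vocabulary IDs (universe indices) of every region the
// query overlaps; fall back to an unknown-token ID when none match.
fn tokenize(query: &Region, universe: &[Region], unk_id: usize) -> Vec<usize> {
    let ids: Vec<usize> = universe
        .iter()
        .enumerate()
        .filter(|(_, u)| overlaps(query, u))
        .map(|(i, _)| i)
        .collect();
    if ids.is_empty() { vec![unk_id] } else { ids }
}

fn main() {
    let universe = vec![
        Region { chr: "chr1".to_string(), start: 100, end: 200 },
        Region { chr: "chr1".to_string(), start: 300, end: 400 },
    ];
    let query = Region { chr: "chr1".to_string(), start: 150, end: 250 };
    // The query overlaps only the first universe region.
    println!("{:?}", tokenize(&query, &universe, universe.len())); // [0]
}
```

A real tokenizer would use an interval tree rather than a linear scan so lookups stay fast over large universes, which is exactly the kind of machinery this module provides.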