Crate bio

Rust-bio, a bioinformatics library for Rust.

This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration. For installation instructions and a general overview, visit the project website.

Currently, rust-bio provides

  • most major pattern matching algorithms,
  • a convenient alphabet implementation,
  • pairwise alignment,
  • suffix arrays,
  • BWT and FM-Index,
  • FMD-Index for finding supermaximal exact matches,
  • a q-gram index,
  • an open reading frame (ORF) search algorithm,
  • a rank/select data structure,
  • FASTQ, FASTA and BED readers and writers,
  • helper functions for combinatorics and dealing with log probabilities,
  • an implementation of Hidden Markov Models and related algorithms.
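
Log-probability helpers are useful because products of many small probabilities underflow in floating point, whereas sums of their logarithms do not. As a rough illustration of the idea (a standalone sketch using only the standard library; `ln_add_exp` and `ln_sum_exp` are hypothetical names chosen here, not rust-bio's API), two probabilities stored as natural logarithms can be added like this:

```rust
/// Add two probabilities given in natural-log space: ln(exp(a) + exp(b)).
/// Hypothetical helper for illustration only; not part of rust-bio.
fn ln_add_exp(a: f64, b: f64) -> f64 {
    // ln(0) is negative infinity; adding probability zero changes nothing.
    if a == f64::NEG_INFINITY {
        return b;
    }
    if b == f64::NEG_INFINITY {
        return a;
    }
    let (hi, lo) = if a > b { (a, b) } else { (b, a) };
    // Factor out the larger term so the exponent is <= 0 and cannot overflow.
    hi + (lo - hi).exp().ln_1p()
}

/// Sum a slice of log probabilities, staying in log space throughout.
fn ln_sum_exp(values: &[f64]) -> f64 {
    values
        .iter()
        .fold(f64::NEG_INFINITY, |acc, &v| ln_add_exp(acc, v))
}
```

For instance, `ln_add_exp(0.5f64.ln(), 0.5f64.ln())` is approximately `0.0`, the logarithm of probability 1, and the same call still works for inputs such as `-1000.0` whose plain exponentials would underflow to zero.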


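Example

use bio::alphabets;
use bio::data_structures::suffix_array::suffix_array;
use bio::data_structures::bwt::{bwt, less, Occ};
use bio::data_structures::fmindex::{FMIndex, FMIndexable};

// Illustrative text; it must end with a sentinel (here `$`) that is
// lexicographically smaller than every other symbol in the text.
let text = b"ACGGATGCTGGATCGGATCGCGCTAGCTA$";
let pattern = b"ACCG";

// Create an FM-Index for a given text.
let alphabet = alphabets::dna::iupac_alphabet();
let sa = suffix_array(text);
let bwt = bwt(text, &sa);
let less = less(&bwt, &alphabet);
// The second argument is the sampling rate of the occurrence table.
let occ = Occ::new(&bwt, 3, &alphabet);
let fmindex = FMIndex::new(&bwt, &less, &occ);

// Search the pattern, then map the suffix array interval to text positions.
let interval = fmindex.backward_search(pattern.iter());
let positions = interval.occ(&sa);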
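
To make the steps above concrete, here is a naive sketch of what a suffix array, a BWT and a backward search compute, written against the standard library only. It is not how rust-bio implements them (the library's versions are designed to be much more efficient), and all function names are made up for illustration.

```rust
/// Naive suffix array: sort all suffix start positions lexicographically.
/// The text is assumed to end with a unique, smallest sentinel such as `$`.
fn naive_suffix_array(text: &[u8]) -> Vec<usize> {
    let mut sa: Vec<usize> = (0..text.len()).collect();
    sa.sort_by(|&a, &b| text[a..].cmp(&text[b..]));
    sa
}

/// BWT: the character preceding each suffix, in suffix array order.
fn naive_bwt(text: &[u8], sa: &[usize]) -> Vec<u8> {
    sa.iter()
        .map(|&i| if i == 0 { text[text.len() - 1] } else { text[i - 1] })
        .collect()
}

/// Backward search: half-open suffix array interval [lo, hi) of all
/// suffixes that start with `pattern`. Empty if there is no match.
fn backward_search(bwt: &[u8], pattern: &[u8]) -> (usize, usize) {
    // less[c] = number of characters in the text that are smaller than c.
    let mut less = [0usize; 257];
    for &c in bwt {
        less[c as usize + 1] += 1;
    }
    for c in 1..257 {
        less[c] += less[c - 1];
    }
    // occ(c, i) = occurrences of c in bwt[..i], counted naively here.
    let occ = |c: u8, i: usize| bwt[..i].iter().filter(|&&x| x == c).count();

    let (mut lo, mut hi) = (0, bwt.len());
    // Extend the match one character at a time, from the pattern's end.
    for &c in pattern.iter().rev() {
        lo = less[c as usize] + occ(c, lo);
        hi = less[c as usize] + occ(c, hi);
        if lo >= hi {
            return (lo, lo);
        }
    }
    (lo, hi)
}

/// All start positions of `pattern` in `text`, in ascending order.
fn find_all(text: &[u8], pattern: &[u8]) -> Vec<usize> {
    let sa = naive_suffix_array(text);
    let bwt = naive_bwt(text, &sa);
    let (lo, hi) = backward_search(&bwt, pattern);
    let mut positions = sa[lo..hi].to_vec();
    positions.sort_unstable();
    positions
}
```

Calling `find_all(b"banana$", b"ana")` yields the positions 1 and 3. The interval returned by the search indexes into the suffix array, which is why the suffix array is needed again to turn the interval into text positions.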

Multithreaded Example

use bio::alphabets;
use bio::data_structures::suffix_array::suffix_array;
use bio::data_structures::bwt::{bwt, less, Occ};
use bio::data_structures::fmindex::{FMIndex, FMIndexable};
use std::sync::Arc;
use std::thread;

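// Illustrative text; it must end with a sentinel (here `$`).
let text = b"ACGGATGCTGGATCGGATCGCGCTAGCTA$";
let patterns = vec![b"ACCG", b"TGCT"];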

// Create an FM-Index for a given text.
let alphabet = alphabets::dna::iupac_alphabet();
let sa = suffix_array(text);
let bwt = Arc::new(bwt(text, &sa));
let less = Arc::new(less(bwt.as_ref(), &alphabet));
let occ = Arc::new(Occ::new(bwt.as_ref(), 3, &alphabet));
let fmindex = Arc::new(FMIndex::new(bwt, less, occ));

// Spawn one thread per pattern to perform the backward searches.
let interval_calculators = patterns
    .into_iter()
    .map(|pattern| {
        let fmindex = fmindex.clone();
        thread::spawn(move || fmindex.backward_search(pattern.iter()))
    })
    .collect::<Vec<_>>();

// Loop through the results, extracting the positions array for each pattern
for interval_calculator in interval_calculators {
    let positions = interval_calculator.join().unwrap().occ(&sa);
}
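
The sharing pattern used in the multithreaded example does not depend on the FM-Index itself. The following standard-library-only sketch shows the same structure with a plain byte vector standing in for the index; `count_in_parallel` is a hypothetical helper, and the naive window scan is only a placeholder for a real index lookup.

```rust
use std::sync::Arc;
use std::thread;

/// Count the occurrences of each pattern in `text`, one thread per pattern.
/// Hypothetical helper for illustration; the scan stands in for an index query.
fn count_in_parallel(text: Vec<u8>, patterns: Vec<Vec<u8>>) -> Vec<usize> {
    // Wrap the read-only data in an Arc so every thread can hold a handle.
    let text = Arc::new(text);

    // Collect the join handles first: `map` is lazy, so without `collect`
    // no thread would start until its handle is pulled from the iterator.
    let handles: Vec<_> = patterns
        .into_iter()
        .map(|pattern| {
            let text = Arc::clone(&text);
            thread::spawn(move || {
                // `windows(0)` would panic, so handle empty patterns up front.
                if pattern.is_empty() || pattern.len() > text.len() {
                    return 0;
                }
                text.windows(pattern.len())
                    .filter(|w| *w == pattern.as_slice())
                    .count()
            })
        })
        .collect();

    // Join in spawn order so results line up with the input patterns.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Cloning an `Arc` only increments a reference count, so the shared data is never copied, and because the data is immutable no lock is needed.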

Documentation and further examples for each module can be found in the module descriptions below.


Modules

alignment: Various alignment and distance computing algorithms.
alphabets: Implementation of alphabets and useful utilities.
data_structures: Various useful data structures.
io: Readers and writers for common bioinformatics file formats.
pattern_matching: Various useful pattern matching algorithms. The implementations are based on the lecture notes “Algorithmen auf Sequenzen”, Kopczynski, Marschall, Martin and Rahmann, 2008-2015.
seq_analysis: Sequence analysis algorithms.
stats: Mathematical and statistical tools.
utils: Common utilities.
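
As a small taste of the kind of algorithm covered under pattern matching, the sketch below implements the classic Shift-And bit-parallel exact matcher with the standard library only. It is an independent illustration under the assumption of patterns of at most 64 symbols, and the function name `shift_and` is chosen here; it does not reproduce the library's own interface.

```rust
/// Shift-And exact matching: returns every start position of `pattern`
/// in `text`. Illustrative sketch; limited to patterns of 1..=64 symbols.
fn shift_and(pattern: &[u8], text: &[u8]) -> Vec<usize> {
    assert!(!pattern.is_empty() && pattern.len() <= 64);

    // masks[c] has bit i set iff pattern[i] == c.
    let mut masks = [0u64; 256];
    for (i, &c) in pattern.iter().enumerate() {
        masks[c as usize] |= 1 << i;
    }
    // Bit that is set when the whole pattern has just been matched.
    let accept = 1u64 << (pattern.len() - 1);

    // Bit i of `state` is set iff pattern[..=i] matches the text
    // ending at the current position.
    let mut state = 0u64;
    let mut hits = Vec::new();
    for (i, &c) in text.iter().enumerate() {
        // Extend every partial match by one symbol and start a new one.
        state = ((state << 1) | 1) & masks[c as usize];
        if state & accept != 0 {
            hits.push(i + 1 - pattern.len());
        }
    }
    hits
}
```

For example, `shift_and(b"ana", b"banana")` reports the overlapping matches at positions 1 and 3. Each text symbol costs a constant number of word operations, which is what makes the approach attractive for short patterns.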