Function nucleic_acid::suffix_array [] [src]

pub fn suffix_array<T>(input: &[T]) -> Vec<u32> where
    T: Num + NumCast + PartialOrd + Copy

Generates a suffix array using the "induced sorting" method.

Building the suffix array is expensive, since the output is a vector of positions. For the human genome (~3 GB), it consumed ~27 GB of memory. That said, the induced sorting method is very fast! It took only ~20 mins to generate the suffix array.

let text = b"Hello, world!";
let sa = nucleic_acid::suffix_array(text as &[u8]);

let mut rotations = (0..text.len()).map(|i| &text[i..]).collect::<Vec<_>>();
rotations.sort();   // sort the prefixes

// skip the first value (which is an empty prefix)
assert_eq!(sa.into_iter().skip(1).map(|i| &text[i as usize..]).collect::<Vec<_>>(),
           rotations);