Struct WaveletMatrix

Source
pub struct WaveletMatrix { /* private fields */ }
Expand description

A wavelet matrix implementation implemented as described in Navarro and Claude, 2021. The implementation is designed to allow for extremely large alphabet sizes, without sacrificing performance for small alphabets.

There are two constructor algorithms available:

  • [from_bit_vec] and [from_slice] construct the wavelet matrix by repeatedly sorting the elements. These constructors have linear space overhead and run in O(kn * log n) time complexity.
  • [from_bit_vec_pc] and [from_slice_pc] construct the wavelet matrix by counting the prefixes of the elements. These constructors have a space complexity of O(2^k) and run in O(kn), which makes this constructor preferable for large sequences over small alphabets.

They encode a sequence of n k-bit words into a wavelet matrix which supports constant-time rank and select queries on elements of its k-bit alphabet. All query functions are mirrored for both BitVec and u64 query elements, so if k <= 64, no heap allocation is needed for the query element.

Other than rank and select queries, the matrix also supports quantile queries (range select i), and range-predecessor and -successor queries, all of which are loosely based on Külekci and Thankachan with better time complexity.

All operations implemented on the matrix are O(k) time complexity. The space complexity of the wavelet matrix is O(n * k) with a small linear overhead (see RsVec).

§Examples

use vers_vecs::{BitVec, WaveletMatrix};

// pack elements from a 3-bit alphabet into a bit vector and construct a wavelet matrix from them
let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

// query the wavelet matrix
assert_eq!(wavelet_matrix.get_u64(0), Some(1));
assert_eq!(wavelet_matrix.get_u64(1), Some(4));

assert_eq!(wavelet_matrix.rank_u64(3, 4), Some(2));
assert_eq!(wavelet_matrix.rank_u64(3, 1), Some(1));
assert_eq!(wavelet_matrix.select_u64(0, 7), Some(5));

// statistics
assert_eq!(wavelet_matrix.range_median_u64(0..3), Some(4));
assert_eq!(wavelet_matrix.predecessor_u64(0..6, 3), Some(2));

Implementations§

Source§

impl WaveletMatrix

Source

pub fn from_bit_vec(bit_vec: &BitVec, bits_per_element: u16) -> Self

Create a new wavelet matrix from a sequence of n k-bit words. The constructor runs in O(kn * log n) time complexity.

§Parameters
  • bit_vec: A packed sequence of n k-bit words. The i-th word begins in the bits_per_element * i-th bit of the bit vector. Words are stored from least to most significant bit.
  • bits_per_element: The number k of bits in each word. Cannot exceed 1 << 16.
§Panics

Panics if the number of bits in the bit vector is not a multiple of the number of bits per element.

Source

pub fn from_slice(sequence: &[u64], bits_per_element: u16) -> Self

Create a new wavelet matrix from a sequence of n k-bit words. The constructor runs in O(kn * log n) time complexity.

§Parameters
  • sequence: A slice of n u64 values, each encoding a k-bit word.
  • bits_per_element: The number k of bits in each word. Cannot exceed 64.
§Panics

Panics if the number of bits per element exceeds 64.

Source

pub fn from_bit_vec_pc(bit_vec: &BitVec, bits_per_element: u16) -> Self

Create a new wavelet matrix from a sequence of n k-bit words using the prefix counting algorithm Dinklage et al. The constructor runs in O(kn) time complexity but requires O(2^k) space during construction, so it is only recommended for small alphabets. Use the from_bit_vec or from_slice constructors for larger alphabets.

§Parameters
  • bit_vec: A packed sequence of n k-bit words. The i-th word begins in the bits_per_element * i-th bit of the bit vector. Words are stored from least to most significant bit.
  • bits_per_element: The number k of bits in each word. Cannot exceed 1 << 16.
§Panics

Panics if the number of bits in the bit vector is not a multiple of the number of bits per element, or if the number of bits per element exceeds 64.

Source

pub fn from_slice_pc(sequence: &[u64], bits_per_element: u16) -> Self

Create a new wavelet matrix from a sequence of n k-bit words using the prefix counting algorithm Dinklage et al. The constructor runs in O(kn) time complexity but requires O(2^k) space during construction, so it is only recommended for small alphabets. Use the from_bit_vec or from_slice constructors for larger alphabets.

§Parameters
  • sequence: A slice of n u64 values, each encoding a k-bit word.
  • bits_per_element: The number k of bits in each word. Cannot exceed 64.
§Panics

Panics if the number of bits per element exceeds 64.

Source

pub fn get_value(&self, i: usize) -> Option<BitVec>

Get the i-th element of the encoded sequence in a k-bit word. The k-bit word is returned as a BitVec. The first element of the bit vector is the least significant bit.

Returns None if the index is out of bounds.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.get_value(0), Some(BitVec::pack_sequence_u8(&[1], 3)));
assert_eq!(wavelet_matrix.get_value(1), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.get_value(100), None);
Source

pub fn get_value_unchecked(&self, i: usize) -> BitVec

Get the i-th element of the encoded sequence in a k-bit word. The k-bit word is returned as a BitVec. The first element of the bit vector is the least significant bit. This function does not perform bounds checking. Use get_value for a checked version.

§Panics

May panic if the index is out of bounds. May instead return an empty bit vector.

Source

pub fn get_u64(&self, i: usize) -> Option<u64>

Get the i-th element of the encoded sequence as a u64. The u64 is constructed from the k-bit word stored in the wavelet matrix.

Returns None if the index is out of bounds, or if the number of bits per element exceeds 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.get_u64(0), Some(1));
assert_eq!(wavelet_matrix.get_u64(1), Some(4));
assert_eq!(wavelet_matrix.get_u64(100), None);
Source

pub fn get_u64_unchecked(&self, i: usize) -> u64

Get the i-th element of the encoded sequence as a u64 numeral. The element is encoded in the lowest k bits of the numeral. If the number of bits per element exceeds 64, the value is truncated. This function does not perform bounds checking. Use get_u64 for a checked version.

§Panic

May panic if the value of i is out of bounds. May instead return 0.

Source

pub fn rank_range_unchecked( &self, range: Range<usize>, symbol: &BitVec, ) -> usize

Get the number of occurrences of the given symbol in the encoded sequence in the range. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

This method does not perform bounds checking, nor does it check if the symbol is a valid k-bit word. Use rank_range for a checked version.

§Panics

May panic if the range is out of bounds, or if the number of bits in symbol is lower than k. May instead return 0.

Source

pub fn rank_range(&self, range: Range<usize>, symbol: &BitVec) -> Option<usize>

Get the number of occurrences of the given symbol in the encoded sequence in the range. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

Returns None if the range is out of bounds (greater than the length of the encoded sequence, but since it is exclusive, it may be equal to the length), or if the number of bits in symbol is not equal to k.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.rank_range(0..3, &BitVec::pack_sequence_u8(&[4], 3)), Some(2));
assert_eq!(wavelet_matrix.rank_range(2..4, &BitVec::pack_sequence_u8(&[4], 3)), Some(1));
Source

pub fn rank_range_u64_unchecked( &self, range: Range<usize>, symbol: u64, ) -> usize

Get the number of occurrences of the given symbol in the encoded sequence in the range. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64. The interval is half-open, meaning rank_range_u64(0..0, symbol) returns 0.

This method does not perform bounds checking, nor does it check if the elements of the wavelet matrix can be represented in a u64 numeral. Use rank_range_u64 for a checked version.

§Panics

May panic if the range is out of bounds. May instead return 0. If the number of bits in wavelet matrix elements exceed 64, the behavior is platform-dependent.

Source

pub fn rank_range_u64(&self, range: Range<usize>, symbol: u64) -> Option<usize>

Get the number of occurrences of the given symbol in the encoded sequence in the range. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64. The interval is half-open, meaning rank_range_u64(0..0, symbol) returns 0.

Returns None if the range is out of bounds (greater than the length of the encoded sequence, but since it is exclusive, it may be equal to the length), or if the number of bits in the wavelet matrix elements exceed 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.rank_range_u64(0..3, 4), Some(2));
assert_eq!(wavelet_matrix.rank_range_u64(2..4, 4), Some(1));
Source

pub fn rank_offset_unchecked( &self, offset: usize, i: usize, symbol: &BitVec, ) -> usize

Get the number of occurrences of the given symbol in the encoded sequence between the offset-th and i-th element (exclusive). This is equivalent to rank_range_unchecked(offset..i, symbol). The interval is half-open, meaning rank_offset_unchecked(0, 0, symbol) returns 0.

The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

This method does not perform bounds checking, nor does it check if the symbol is a valid k-bit word. Use rank_offset for a checked version.

§Panics

May panic if offset is out of bounds, or if offset + i is larger than the length of the encoded sequence, or if offset is greater than i, or if the number of bits in symbol is lower than k. May instead return 0.

Source

pub fn rank_offset( &self, offset: usize, i: usize, symbol: &BitVec, ) -> Option<usize>

Get the number of occurrences of the given symbol in the encoded sequence between the offset-th and i-th element (exclusive). This is equivalent to rank_range(offset..i, symbol). The interval is half-open, meaning rank_offset(0, 0, symbol) returns 0, because the interval is empty.

The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

Returns None if offset is out of bounds, or if i is larger than the length of the encoded sequence, or if offset is greater than i, or if the number of bits in symbol is not equal to k. i may equal the length of the encoded sequence, which will return the number of occurrences of the symbol up to the end of the sequence.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.rank_offset(0, 3, &BitVec::pack_sequence_u8(&[4], 3)), Some(2));
assert_eq!(wavelet_matrix.rank_offset(2, 4, &BitVec::pack_sequence_u8(&[4], 3)), Some(1));
Source

pub fn rank_offset_u64_unchecked( &self, offset: usize, i: usize, symbol: u64, ) -> usize

Get the number of occurrences of the given symbol in the encoded sequence between the offset-th and i-th element (exclusive). This is equivalent to rank_range(offset..i, symbol). The interval is half-open, meaning rank_offset(0, 0, symbol) returns 0.

The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

This method does not perform bounds checking, nor does it check if the elements of the wavelet matrix can be represented in a u64 numeral. Use rank_offset_u64 for a checked version.

§Panics

May panic if offset is out of bounds, or if i is larger than the length of the encoded sequence, or if offset is greater than i, or if the number of bits in wavelet matrix elements exceed 64. May instead return 0.

Source

pub fn rank_offset_u64( &self, offset: usize, i: usize, symbol: u64, ) -> Option<usize>

Get the number of occurrences of the given symbol in the encoded sequence between the offset-th and i-th element (exclusive). This is equivalent to rank_range(offset..i, symbol). The interval is half-open, meaning rank_offset(0, 0, symbol) returns 0.

The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

Returns None if offset is out of bounds, or if i is larger than the length of the encoded sequence, or if offset is greater than i, or if the number of bits in the wavelet matrix elements exceed 64. i may equal the length of the encoded sequence, which will return the number of occurrences of the symbol up to the end of the sequence.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.rank_offset_u64(0, 3, 4), Some(2));
assert_eq!(wavelet_matrix.rank_offset_u64(2, 4, 4), Some(1));
Source

pub fn rank_unchecked(&self, i: usize, symbol: &BitVec) -> usize

Get the number of occurrences of the given symbol in the encoded sequence up to the i-th element (exclusive). The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

This method does not perform bounds checking, nor does it check if the symbol is a valid k-bit word. Use rank for a checked version.

§Panics

May panic if i is out of bounds, or if the number of bits in symbol is lower than k. May instead return 0. If the number of bits in symbol exceeds k, the remaining bits are ignored.

Source

pub fn rank(&self, i: usize, symbol: &BitVec) -> Option<usize>

Get the number of occurrences of the given symbol in the encoded sequence up to the i-th element (exclusive). The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

Returns None if i is out of bounds (greater than the length of the encoded sequence, but since it is exclusive, it may be equal to the length), or if the number of bits in symbol is not equal to k.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.rank(3, &BitVec::pack_sequence_u8(&[4], 3)), Some(2));
assert_eq!(wavelet_matrix.rank(3, &BitVec::pack_sequence_u8(&[1], 3)), Some(1));
Source

pub fn rank_u64_unchecked(&self, i: usize, symbol: u64) -> usize

Get the number of occurrences of the given symbol in the encoded sequence up to the i-th element (exclusive). The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

This method does not perform bounds checking, nor does it check if the elements of the wavelet matrix can be represented in a u64 numeral. Use rank_u64 for a checked version.

§Panics

May panic if i is out of bounds, or if the number of bits in wavelet matrix elements exceed 64. May instead return 0.

Source

pub fn rank_u64(&self, i: usize, symbol: u64) -> Option<usize>

Get the number of occurrences of the given symbol in the encoded sequence up to the i-th element (exclusive). The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

Returns None if i is out of bounds (greater than the length of the encoded sequence, but since it is exclusive, it may be equal to the length), or if the number of bits in the wavelet matrix elements exceed 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.rank_u64(3, 4), Some(2));
assert_eq!(wavelet_matrix.rank_u64(3, 1), Some(1));
Source

pub fn select_offset_unchecked( &self, offset: usize, rank: usize, symbol: &BitVec, ) -> usize

Get the index of the rank-th occurrence of the given symbol in the encoded sequence, starting from the offset-th element. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

This method does not perform bounds checking, nor does it check if the symbol is a valid k-bit word. Use select_offset for a checked version.

Returns the index of the rank-th occurrence of the symbol in the encoded sequence, or the length of the encoded sequence if the rank-th occurrence does not exist.

§Panics

May panic if the offset is out of bounds, or if the number of bits in symbol is lower than k. May instead return the length of the encoded sequence.

Source

pub fn select_offset( &self, offset: usize, rank: usize, symbol: &BitVec, ) -> Option<usize>

Get the index of the rank-th occurrence of the given symbol in the encoded sequence, starting from the offset-th element. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

Returns None if offset is out of bounds, or if the number of bits in symbol is not equal to k, or if the rank-th occurrence of the symbol does not exist.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.select_offset(0, 0, &BitVec::pack_sequence_u8(&[4], 3)), Some(1));
assert_eq!(wavelet_matrix.select_offset(0, 1, &BitVec::pack_sequence_u8(&[4], 3)), Some(2));
assert_eq!(wavelet_matrix.select_offset(2, 0, &BitVec::pack_sequence_u8(&[4], 3)), Some(2));
assert_eq!(wavelet_matrix.select_offset(2, 1, &BitVec::pack_sequence_u8(&[4], 3)), None);
Source

pub fn select_offset_u64_unchecked( &self, offset: usize, rank: usize, symbol: u64, ) -> usize

Get the index of the rank-th occurrence of the given symbol in the encoded sequence, starting from the offset-th element. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

This method does not perform bounds checking, nor does it check if the elements of the wavelet matrix can be represented in a u64 numeral. Use select_offset_u64 for a checked version.

Returns the index of the rank-th occurrence of the symbol in the encoded sequence, or the length of the encoded sequence if the rank-th occurrence does not exist.

§Panics

May panic if the offset is out of bounds, or if the number of bits in wavelet matrix elements exceed 64. May instead return the length of the encoded sequence.

Source

pub fn select_offset_u64( &self, offset: usize, rank: usize, symbol: u64, ) -> Option<usize>

Get the index of the rank-th occurrence of the given symbol in the encoded sequence, starting from the offset-th element. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

Returns None if offset is out of bounds, or if the number of bits in the wavelet matrix elements exceed 64, or if the rank-th occurrence of the symbol does not exist.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.select_offset_u64(0, 0, 4), Some(1));
assert_eq!(wavelet_matrix.select_offset_u64(0, 1, 4), Some(2));
assert_eq!(wavelet_matrix.select_offset_u64(2, 0, 4), Some(2));
assert_eq!(wavelet_matrix.select_offset_u64(2, 1, 4), None);
Source

pub fn select_unchecked(&self, rank: usize, symbol: &BitVec) -> usize

Get the index of the rank-th occurrence of the given symbol in the encoded sequence. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

This method does not perform bounds checking, nor does it check if the symbol is a valid k-bit word. Use select for a checked version.

Returns the index of the rank-th occurrence of the symbol in the encoded sequence, or the length of the encoded sequence if the rank-th occurrence does not exist.

§Panics

May panic if the number of bits in symbol is not equal to k. May instead return the length of the encoded sequence.

Source

pub fn select(&self, rank: usize, symbol: &BitVec) -> Option<usize>

Get the index of the rank-th occurrence of the given symbol in the encoded sequence. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix.

Returns None if the number of bits in symbol is not equal to k, or if the rank-th occurrence of the symbol does not exist.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.select(0, &BitVec::pack_sequence_u8(&[4], 3)), Some(1));
assert_eq!(wavelet_matrix.select(1, &BitVec::pack_sequence_u8(&[4], 3)), Some(2));
Source

pub fn select_u64_unchecked(&self, rank: usize, symbol: u64) -> usize

Get the index of the rank-th occurrence of the given symbol in the encoded sequence. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

This method does not perform bounds checking, nor does it check if the elements of the wavelet matrix can be represented in a u64 numeral. Use select_u64 for a checked version.

Returns the index of the rank-th occurrence of the symbol in the encoded sequence, or the length of the encoded sequence if the rank-th occurrence does not exist.

§Panics

May panic if the number of bits in wavelet matrix elements exceed 64. May instead return the length of the encoded sequence.

Source

pub fn select_u64(&self, rank: usize, symbol: u64) -> Option<usize>

Get the index of the rank-th occurrence of the given symbol in the encoded sequence. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64.

Returns None if the number of bits in the wavelet matrix elements exceed 64, or if the rank-th occurrence of the symbol does not exist.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.select_u64(0, 4), Some(1));
assert_eq!(wavelet_matrix.select_u64(1, 4), Some(2));
Source

pub fn quantile_unchecked(&self, range: Range<usize>, k: usize) -> BitVec

Get the k-th smallest element in the encoded sequence in the specified range, where k = 0 returns the smallest element. The range is a half-open interval, meaning that the end index is exclusive. The k-th smallest element is returned as a BitVec, where the least significant bit is the first element.

This method does not perform bounds checking. It returns a nonsensical result if the k is greater than the size of the range. Use quantile for a checked version.

§Panics

May panic if the range is out of bounds. May instead return an empty bit vector.

Source

pub fn quantile(&self, range: Range<usize>, k: usize) -> Option<BitVec>

Get the k-th smallest element in the encoded sequence in the specified range, where k = 0 returns the smallest element. The range is a half-open interval, meaning that the end index is exclusive. The k-th smallest element is returned as a BitVec, where the least significant bit is the first element.

Returns None if the range is out of bounds, or if k is greater than the size of the range.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.quantile(0..3, 0), Some(BitVec::pack_sequence_u8(&[1], 3)));
assert_eq!(wavelet_matrix.quantile(0..3, 1), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.quantile(1..4, 0), Some(BitVec::pack_sequence_u8(&[1], 3)));
Source

pub fn get_sorted_unchecked(&self, i: usize) -> BitVec

Get the i-th smallest element in the entire wavelet matrix. The i-th smallest element is returned as a BitVec, where the least significant bit is the first element.

This method does not perform bounds checking. Use [get_sorted] for a checked version.

§Panics

May panic if the i is out of bounds, or returns an empty bit vector.

Source

pub fn get_sorted(&self, i: usize) -> Option<BitVec>

Get the i-th smallest element in the wavelet matrix. The i-th smallest element is returned as a BitVec, where the least significant bit is the first element. This method call is equivalent to self.quantile(0..self.len(), i).

Returns None if the i is out of bounds.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.get_sorted(0), Some(BitVec::pack_sequence_u8(&[1], 3)));
assert_eq!(wavelet_matrix.get_sorted(1), Some(BitVec::pack_sequence_u8(&[1], 3)));
assert_eq!(wavelet_matrix.get_sorted(2), Some(BitVec::pack_sequence_u8(&[2], 3)));
Source

pub fn quantile_u64_unchecked(&self, range: Range<usize>, k: usize) -> u64

Get the k-th smallest element in the encoded sequence in the specified range, where k = 0 returns the smallest element. The range is a half-open interval, meaning that the end index is exclusive. The k-th smallest element is returned as a u64 numeral. If the number of bits per element exceeds 64, the value is truncated.

This method does not perform bounds checking. It returns a nonsensical result if the k is greater than the size of the range. Use quantile_u64 for a checked version.

§Panics

May panic if the range is out of bounds. May instead return 0.

Source

pub fn quantile_u64(&self, range: Range<usize>, k: usize) -> Option<u64>

Get the k-th smallest element in the encoded sequence in the specified range, where k = 0 returns the smallest element. The range is a half-open interval, meaning that the end index is exclusive. The k-th smallest element is returned as a u64 numeral.

Returns None if the range is out of bounds, or if the number of bits per element exceeds 64, or if k is greater than the size of the range.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.quantile_u64(0..3, 0), Some(1));
assert_eq!(wavelet_matrix.quantile_u64(0..3, 1), Some(4));
assert_eq!(wavelet_matrix.quantile_u64(1..4, 0), Some(1));
Source

pub fn get_sorted_u64_unchecked(&self, i: usize) -> u64

Get the i-th smallest element in the wavelet matrix. The i-th smallest element is returned as a u64 numeral.

If the number of bits per element exceeds 64, the value is truncated.

This method does not perform bounds checking. Use get_sorted_u64 for a checked version.

§Panics

May panic if the i is out of bounds, or returns an empty bit vector.

Source

pub fn get_sorted_u64(&self, i: usize) -> Option<u64>

Get the i-th smallest element in the entire wavelet matrix. The i-th smallest element is returned as a u64 numeral.

Returns None if the i is out of bounds, or if the number of bits per element exceeds 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.get_sorted_u64(0), Some(1));
assert_eq!(wavelet_matrix.get_sorted_u64(1), Some(1));
assert_eq!(wavelet_matrix.get_sorted_u64(2), Some(2));
Source

pub fn range_min_unchecked(&self, range: Range<usize>) -> BitVec

Get the smallest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The smallest element is returned as a BitVec,

This method does not perform bounds checking. Use range_min for a checked version.

§Panics

May panic if the range is out of bounds or if the range is empty. May instead return an empty bit vector.

Source

pub fn range_min(&self, range: Range<usize>) -> Option<BitVec>

Get the smallest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The smallest element is returned as a BitVec,

Returns None if the range is out of bounds or if the range is empty.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.range_min(0..3), Some(BitVec::pack_sequence_u8(&[1], 3)));
assert_eq!(wavelet_matrix.range_min(1..4), Some(BitVec::pack_sequence_u8(&[1], 3)));
assert_eq!(wavelet_matrix.range_min(1..3), Some(BitVec::pack_sequence_u8(&[4], 3)));
Source

pub fn range_min_u64_unchecked(&self, range: Range<usize>) -> u64

Get the smallest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The smallest element is returned as a u64 numeral. If the number of bits per element exceeds 64, the value is truncated.

This method does not perform bounds checking. Use range_min_u64 for a checked version.

§Panics

May panic if the range is out of bounds or if the range is empty. May instead return 0.

Source

pub fn range_min_u64(&self, range: Range<usize>) -> Option<u64>

Get the smallest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The smallest element is returned as a u64 numeral.

Returns None if the range is out of bounds, if the range is empty, or if the number of bits per element exceeds 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.range_min_u64(0..3), Some(1));
assert_eq!(wavelet_matrix.range_min_u64(1..4), Some(1));
assert_eq!(wavelet_matrix.range_min_u64(1..3), Some(4));
Source

pub fn range_max_unchecked(&self, range: Range<usize>) -> BitVec

Get the largest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The largest element is returned as a BitVec, where the least significant bit is the first element.

This method does not perform bounds checking. Use range_max for a checked version.

§Panics

May panic if the range is out of bounds or if the range is empty. May instead return an empty bit vector.

Source

pub fn range_max(&self, range: Range<usize>) -> Option<BitVec>

Get the largest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The largest element is returned as a BitVec, where the least significant bit is the first element.

Returns None if the range is out of bounds or if the range is empty.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.range_max(0..3), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.range_max(3..6), Some(BitVec::pack_sequence_u8(&[7], 3)));
Source

pub fn range_max_u64_unchecked(&self, range: Range<usize>) -> u64

Get the largest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The largest element is returned as a u64 numeral. If the number of bits per element exceeds 64, the value is truncated.

This method does not perform bounds checking. Use range_max_u64 for a checked version.

§Panics

May panic if the range is out of bounds or if the range is empty. May instead return 0.

Source

pub fn range_max_u64(&self, range: Range<usize>) -> Option<u64>

Get the largest element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The largest element is returned as a u64 numeral.

Returns None if the range is out of bounds, if the range is empty, or if the number of bits per element exceeds 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.range_max_u64(0..3), Some(4));
assert_eq!(wavelet_matrix.range_max_u64(3..6), Some(7));
Source

pub fn range_median_unchecked(&self, range: Range<usize>) -> BitVec

Get the median element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The median element is returned as a BitVec, where the least significant bit is the first element.

If the range does not contain an odd number of elements, the position is rounded down.

This method does not perform bounds checking. Use range_median for a checked version.

§Panics

May panic if the range is out of bounds or if the range is empty. May instead return an empty bit vector.

Source

pub fn range_median(&self, range: Range<usize>) -> Option<BitVec>

Get the median element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The median element is returned as a BitVec, where the least significant bit is the first element.

If the range does not contain an odd number of elements, the position is rounded down.

Returns None if the range is out of bounds or if the range is empty.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.range_median(0..3), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.range_median(1..4), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.range_median(0..6), Some(BitVec::pack_sequence_u8(&[2], 3)));
Source

pub fn range_median_u64_unchecked(&self, range: Range<usize>) -> u64

Get the median element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The median element is returned as a u64 numeral. If the number of bits per element exceeds 64, the value is truncated.

If the range does not contain an odd number of elements, the position is rounded down.

This method does not perform bounds checking. Use range_median_u64 for a checked version.

§Panics

May panic if the range is out of bounds or if the range is empty. May instead return 0.

Source

pub fn range_median_u64(&self, range: Range<usize>) -> Option<u64>

Get the median element in the encoded sequence in the specified range. The range is a half-open interval, meaning that the end index is exclusive. The median element is returned as a u64 numeral.

If the range does not contain an odd number of elements, the position is rounded down.

Returns None if the range is out of bounds, if the range is empty, or if the number of bits per element exceeds 64.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.range_median_u64(0..3), Some(4));
assert_eq!(wavelet_matrix.range_median_u64(1..4), Some(4));
assert_eq!(wavelet_matrix.range_median_u64(0..6), Some(2));
Source

pub fn predecessor( &self, range: Range<usize>, symbol: &BitVec, ) -> Option<BitVec>

Get the predecessor of the given symbol in the given range. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix. The symbol does not have to be in the encoded sequence. The predecessor is the largest element in the range that is smaller than or equal to the symbol.

Returns None if the number of bits in the symbol is not equal to k, if the range is empty, if the wavelet matrix is empty, if the range is out of bounds, or if the symbol is smaller than all elements in the range.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.predecessor(0..3, &BitVec::pack_sequence_u8(&[7], 3)), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.predecessor(0..3, &BitVec::pack_sequence_u8(&[4], 3)), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.predecessor(0..6, &BitVec::pack_sequence_u8(&[7], 3)), Some(BitVec::pack_sequence_u8(&[7], 3)));
Source

pub fn predecessor_u64(&self, range: Range<usize>, symbol: u64) -> Option<u64>

Get the predecessor of the given symbol in the given range. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64. The symbol does not have to be in the encoded sequence. The predecessor is the largest element in the range that is smaller than or equal to the symbol.

Returns None if the number of bits in the matrix is greater than 64, if the range is empty, if the wavelet matrix is empty, if the range is out of bounds, or if the symbol is smaller than all elements in the range.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.predecessor_u64(0..3, 7), Some(4));
assert_eq!(wavelet_matrix.predecessor_u64(0..3, 4), Some(4));
assert_eq!(wavelet_matrix.predecessor_u64(0..6, 7), Some(7));
Source

pub fn successor(&self, range: Range<usize>, symbol: &BitVec) -> Option<BitVec>

Get the successor of the given symbol in the given range. The symbol is a k-bit word encoded in a BitVec, where the least significant bit is the first element, and k is the number of bits per element in the wavelet matrix. The symbol does not have to be in the encoded sequence. The successor is the smallest element in the range that is greater than or equal to the symbol.

Returns None if the number of bits in the symbol is not equal to k, if the range is empty, if the wavelet matrix is empty, if the range is out of bounds, or if the symbol is greater than all elements in the range.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.successor(0..3, &BitVec::pack_sequence_u8(&[2], 3)), Some(BitVec::pack_sequence_u8(&[4], 3)));
assert_eq!(wavelet_matrix.successor(0..3, &BitVec::pack_sequence_u8(&[5], 3)), None);
assert_eq!(wavelet_matrix.successor(0..6, &BitVec::pack_sequence_u8(&[2], 3)), Some(BitVec::pack_sequence_u8(&[2], 3)));
Source

pub fn successor_u64(&self, range: Range<usize>, symbol: u64) -> Option<u64>

Get the successor of the given symbol in the given range. The symbol is a k-bit word encoded in a u64 numeral, where k is less than or equal to 64. The symbol does not have to be in the encoded sequence. The successor is the smallest element in the range that is greater than or equal to the symbol.

Returns None if the number of bits in the matrix is greater than 64, if the range is empty, if the wavelet matrix is empty, if the range is out of bounds, or if the symbol is greater than all elements in the range.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

assert_eq!(wavelet_matrix.successor_u64(0..3, 2), Some(4));
assert_eq!(wavelet_matrix.successor_u64(0..3, 5), None);
assert_eq!(wavelet_matrix.successor_u64(0..6, 2), Some(2));
Source

pub fn iter_u64(&self) -> Option<WaveletNumRefIter<'_>>

Get an iterator over the elements of the encoded sequence. The iterator yields u64 elements. If the number of bits per element exceeds 64, None is returned.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

let mut iter = wavelet_matrix.iter_u64().unwrap();
assert_eq!(iter.collect::<Vec<_>>(), vec![1, 4, 4, 1, 2, 7]);
Source

pub fn into_iter_u64(self) -> Option<WaveletNumIter>

Turn the encoded sequence into an iterator. The iterator yields u64 elements. If the number of bits per element exceeds 64, None is returned.

Source

pub fn iter_sorted(&self) -> WaveletSortedRefIter<'_>

Get an iterator over the sorted elements of the encoded sequence. The iterator yields BitVec elements.

See also [iter_sorted_u64] for an iterator that yields u64 elements.

Source

pub fn into_iter_sorted(self) -> WaveletSortedIter

Turn the encoded sequence into an iterator over the sorted sequence. The iterator yields BitVec elements.

Source

pub fn iter_sorted_u64(&self) -> Option<WaveletSortedNumRefIter<'_>>

Get an iterator over the sorted elements of the encoded sequence. The iterator yields u64 elements. If the number of bits per element exceeds 64, None is returned.

§Example
use vers_vecs::{BitVec, WaveletMatrix};

let bit_vec = BitVec::pack_sequence_u8(&[1, 4, 4, 1, 2, 7], 3);
let wavelet_matrix = WaveletMatrix::from_bit_vec(&bit_vec, 3);

let mut iter = wavelet_matrix.iter_sorted_u64().unwrap();
assert_eq!(iter.collect::<Vec<_>>(), vec![1, 1, 2, 4, 4, 7]);
Source

pub fn into_iter_sorted_u64(self) -> Option<WaveletSortedNumIter>

Turn the encoded sequence into an iterator over the sorted sequence. The iterator yields u64 elements. If the number of bits per element exceeds 64, None is returned.

Source

pub fn bits_per_element(&self) -> usize

Get the number of bits per element in the alphabet of the encoded sequence.

Source

pub fn bit_len(&self) -> u16

👎Deprecated since 1.5.1: please use bits_per_element instead

Get the number of bits per element in the alphabet of the encoded sequence.

Source

pub fn len(&self) -> usize

Get the number of elements stored in the encoded sequence.

Source

pub fn is_empty(&self) -> bool

Check if the wavelet matrix is empty.

Source

pub fn heap_size(&self) -> usize

Get the number of bytes allocated on the heap for the wavelet matrix. This does not include memory that is allocated but unused due to allocation policies of internal data structures.

Source§

impl WaveletMatrix

Source

pub fn iter(&self) -> WaveletRefIter<'_>

Returns an iterator over the elements of WaveletMatrix. The iterator returns BitVec elements.

Trait Implementations§

Source§

impl Clone for WaveletMatrix

Source§

fn clone(&self) -> WaveletMatrix

Returns a copy of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for WaveletMatrix

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<'de> Deserialize<'de> for WaveletMatrix

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl<'a> IntoIterator for &'a WaveletMatrix

Source§

type Item = BitVec

The type of the elements being iterated over.
Source§

type IntoIter = WaveletRefIter<'a>

Which kind of iterator are we turning this into?
Source§

fn into_iter(self) -> Self::IntoIter

Creates an iterator from a value. Read more
Source§

impl<'a> IntoIterator for &'a mut WaveletMatrix

Source§

type Item = BitVec

The type of the elements being iterated over.
Source§

type IntoIter = WaveletRefIter<'a>

Which kind of iterator are we turning this into?
Source§

fn into_iter(self) -> Self::IntoIter

Creates an iterator from a value. Read more
Source§

impl IntoIterator for WaveletMatrix

Source§

type Item = BitVec

The type of the elements being iterated over.
Source§

type IntoIter = WaveletIter

Which kind of iterator are we turning this into?
Source§

fn into_iter(self) -> Self::IntoIter

Creates an iterator from a value. Read more
Source§

impl Serialize for WaveletMatrix

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,