csvbinmatrix 0.5.0

Binary matrix Compressed Sparse Vector library.

🦀 Rust package for binary matrices represented in the Compressed Sparse Vector (CSV) format.

BinCSV is the main structure. It stores a binary matrix in the CSV format, which is well suited to sparse matrices.

Learn more in the References section.

Quick usage

# fn quick_usage() -> Result<(), Box<dyn std::error::Error>> {
use csvbinmatrix::bincsv::BinCSV;

let matrix = BinCSV::from_u8_rows(&vec![
    vec![0, 0, 0],
    vec![0, 0, 1],
    vec![0, 1, 1],
    vec![1, 1, 1],
]);

println!("Number of rows: {}", matrix.number_of_rows());
println!("Number of columns: {}", matrix.number_of_columns());

println!("Number of ones: {}", matrix.number_of_ones());
println!("Number of zeros: {}", matrix.number_of_zeros());

matrix.write_to_file("mymatrix.csvbm")?;
// Read the file back and check that it matches the in-memory matrix.
assert_eq!(BinCSV::read_from_file("mymatrix.csvbm")?, matrix);
std::fs::remove_file("mymatrix.csvbm")?;

# Ok(())
# }

You may be interested in the to_file method to write the corresponding CSVBM file. See also the CSVBM file format section.

Submatrices

Let's say we want to obtain the submatrix of matrix without its second row and its first column.

# use csvbinmatrix::bincsv::{BinCSV, SubBinCSV};
#
let matrix = BinCSV::from_u8_rows(&vec![
    vec![0, 0, 0],
    vec![0, 0, 1],
    vec![0, 1, 1],
    vec![1, 1, 1],
]);

let expected_sub_matrix = SubBinCSV::Normal(BinCSV::from_u8_rows(&vec![
    vec![0, 0],
    vec![1, 1],
    vec![1, 1],
]));

Easy way, but costly

use csvbinmatrix::bincsv::{BinCSV, SubBinCSV};
use csvbinmatrix::filters::ClosureDimensionFilter;
#
# let matrix = BinCSV::from_u8_rows(&vec![
#     vec![0, 0, 0],
#     vec![0, 0, 1],
#     vec![0, 1, 1],
#     vec![1, 1, 1],
# ]);
# let expected_sub_matrix = BinCSV::from_u8_rows(&vec![vec![0, 0], vec![1, 1], vec![1, 1]]);

let row_filter = ClosureDimensionFilter::new(|i| i != 1);
let column_filter = ClosureDimensionFilter::new(|j| j != 0);

// Generate the submatrix selected by the row and column filters.
// `matrix` is not consumed.
match matrix.submatrix(&row_filter, &column_filter) {
    SubBinCSV::Normal(sub_matrix) => assert_eq!(sub_matrix, expected_sub_matrix),
    _ => unreachable!("There must be one resulting sub matrix."),
}
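
To make the selection explicit, here is a minimal sketch of the same filtering on a dense `Vec<Vec<u8>>`, written in plain Rust without the crate. The `dense_submatrix` helper is hypothetical: it only illustrates which cells the two closures keep, and says nothing about how `submatrix` is implemented.

```rust
/// Keeps the cells whose row index passes `keep_row` and whose column index
/// passes `keep_column`.
/// Hypothetical helper on a dense matrix, for illustration only.
fn dense_submatrix(
    rows: &[Vec<u8>],
    keep_row: impl Fn(usize) -> bool,
    keep_column: impl Fn(usize) -> bool,
) -> Vec<Vec<u8>> {
    rows.iter()
        .enumerate()
        // Drop the rows rejected by the row filter.
        .filter(|(i, _)| keep_row(*i))
        .map(|(_, row)| {
            row.iter()
                .enumerate()
                // Drop the columns rejected by the column filter.
                .filter(|(j, _)| keep_column(*j))
                .map(|(_, &cell)| cell)
                .collect()
        })
        .collect()
}
```

With the example matrix and the closures `|i| i != 1` and `|j| j != 0`, this sketch returns the rows `[0, 0]`, `[1, 1]` and `[1, 1]`, which are the rows of the expected submatrix.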

Efficiently producing submatrices

Here we show how to produce submatrices efficiently with the to_reversed_sub_binmatrices method, which consumes the super binary matrix. The memory and time complexities are linear in the number of ones of the super binary matrix. This efficiency comes at the cost of reversing the rows and the columns of each submatrix. You can restore the original row and column orders with the reverse or to_reversed method.

use csvbinmatrix::bincsv::{BinCSV, SubBinCSV};
use csvbinmatrix::filters::{BoolVecDimensionFilter, ClosureDimensionFilter};
#
# let matrix = BinCSV::from_u8_rows(&vec![
#     vec![0, 0, 0],
#     vec![0, 0, 1],
#     vec![0, 1, 1],
#     vec![1, 1, 1],
# ]);
#
# let expected_sub_matrix = BinCSV::from_u8_rows(&vec![vec![0, 0], vec![1, 1], vec![1, 1]]);

// Filter with a boolean vector:
// * to represent complex truth states
// * costly
let row_filter =
    BoolVecDimensionFilter::new(vec![true, false, true, true], matrix.number_of_rows());

// Filter with a closure
// * to represent simple truth states
// * efficient
let column_filter = ClosureDimensionFilter::new(|j| j != 0);

// Generate one submatrix per pair of row and column filters.
// `matrix` is consumed.
let mut sub_matrices = matrix.to_reversed_sub_binmatrices(vec![(&row_filter, &column_filter)]);

// The rows and the columns of the submatrices are reversed compared to those of the super matrix.
match sub_matrices.pop() {
    Some(SubBinCSV::Normal(reversed_sub_matrix)) => {
        assert_eq!(reversed_sub_matrix.to_reversed(), expected_sub_matrix)
    }
    _ => unreachable!("There must be one resulting sub matrix."),
}
// We gave only one filter pair, so there is only one resulting submatrix.
assert!(sub_matrices.pop().is_none());

// You can drop the row filter and take ownership of its boolean vector.
let boolvec_row = row_filter.to_boolean_vector();
println!("Boolean vector for the rows: {boolvec_row:?}");
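
As an illustration of what reversing the rows and the columns means, the following plain Rust sketch reverses both orders on a dense matrix. It assumes that "reversed" means that the last row becomes the first one and the last column becomes the first one. The `reverse_rows_and_columns` helper is hypothetical and is not part of the crate.

```rust
/// Reverses the row order and the column order of a dense matrix.
/// Hypothetical helper, for illustration only.
fn reverse_rows_and_columns(rows: &[Vec<u8>]) -> Vec<Vec<u8>> {
    rows.iter()
        // Last row first.
        .rev()
        // Inside each row, last column first.
        .map(|row| row.iter().rev().copied().collect())
        .collect()
}
```

Applying the function twice gives back the original matrix. Under the assumption above, this is the dense counterpart of calling to_reversed on a reversed submatrix.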

The CSVBM file format for CSV binary matrices

The CSVBM format is as follows:

number_of_rows number_of_columns number_of_ones
distance
distance
...
distance
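
The layout can be read back with a few lines of plain Rust. The sketch below is not the crate's reader: `dense_rows_from_csvbm` is a hypothetical helper. It assumes that each distance is the gap, in row-major order, between two consecutive ones, with the first distance measured from position 0. This reading is consistent with the example file given in this section.

```rust
/// Parses CSVBM text into dense rows.
/// Assumption: each distance is the row-major gap between consecutive ones,
/// the first one being measured from position 0.
fn dense_rows_from_csvbm(text: &str) -> Result<Vec<Vec<u8>>, String> {
    let mut lines = text.lines().map(str::trim).filter(|line| !line.is_empty());

    // Header line: number_of_rows number_of_columns number_of_ones
    let header = lines.next().ok_or("missing header line")?;
    let fields = header
        .split_whitespace()
        .map(|field| field.parse::<usize>().map_err(|error| error.to_string()))
        .collect::<Result<Vec<usize>, String>>()?;
    if fields.len() != 3 {
        return Err("the header must hold exactly three integers".to_string());
    }
    let (number_of_rows, number_of_columns, number_of_ones) = (fields[0], fields[1], fields[2]);

    // Walk the row-major positions, jumping by one distance per line.
    let mut cells = vec![0u8; number_of_rows * number_of_columns];
    let mut position = 0usize;
    let mut ones_seen = 0usize;
    for line in lines {
        let distance = line.parse::<usize>().map_err(|error| error.to_string())?;
        position += distance;
        let cell = cells
            .get_mut(position)
            .ok_or("a distance points outside the matrix")?;
        *cell = 1;
        ones_seen += 1;
    }
    if ones_seen != number_of_ones {
        return Err("the number of distances differs from number_of_ones".to_string());
    }

    // Cut the flat vector back into rows.
    if number_of_columns == 0 {
        return Ok(vec![Vec::new(); number_of_rows]);
    }
    Ok(cells.chunks(number_of_columns).map(|row| row.to_vec()).collect())
}
```

For instance, the text `2 2 2`, `1`, `2` (one value per line after the header) places ones at row-major positions 1 and 3, so the sketch returns the rows `[0, 1]` and `[0, 1]`.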

For our example matrix, the file is:

4 3 6
5
2
1
1
1
1
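
The numbers above can be reproduced by hand. The following sketch is plain Rust and does not use the crate; `csvbm_text` is a hypothetical helper. It assumes that a distance is the gap, in row-major order, between a one and the previous one, with the first gap counted from position 0, and that all rows have the same length.

```rust
/// Builds CSVBM text from dense rows.
/// Assumption: distances are row-major gaps between consecutive ones,
/// the first one being measured from position 0.
fn csvbm_text(rows: &[Vec<u8>]) -> String {
    let number_of_columns = rows.first().map_or(0, Vec::len);

    // Scan the cells in row-major order and record the gap at each one.
    let mut distances = Vec::new();
    let mut previous = 0usize;
    for (position, &cell) in rows.iter().flatten().enumerate() {
        if cell != 0 {
            distances.push(position - previous);
            previous = position;
        }
    }

    // Header line, then one distance per line.
    let mut text = format!("{} {} {}", rows.len(), number_of_columns, distances.len());
    for distance in &distances {
        text.push('\n');
        text.push_str(&distance.to_string());
    }
    text
}
```

In the example matrix, the ones sit at row-major positions 5, 7, 8, 9, 10 and 11, which gives the distances 5, 2, 1, 1, 1 and 1 after the header `4 3 6`.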

Recent Changes

See CHANGELOG.md.

License

Dual-licensed to be compatible with the Rust project.

Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 or the MIT license http://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.

References

This package is based on the following paper, whose method is adapted here to binary matrices:

Farzaneh, Aiyoub, Hossein Kheırı, and Mehdi Abbaspour Shahmersı. "AN EFFICIENT STORAGE FORMAT FOR LARGE SPARSE MATRICES". Communications Faculty of Sciences University of Ankara Series A1 Mathematics and Statistics 58, no. 2 (1 August 2009): 1-10. https://doi.org/10.1501/Commua1_0000000648.