§LIBMF Rust
LIBMF - large-scale sparse matrix factorization - for Rust
Check out Disco for higher-level collaborative filtering
§Installation
Add this line to your application’s Cargo.toml under [dependencies]:
libmf = "0.3"
§Getting Started
Prep your data in the format row_index, column_index, value
let mut data = libmf::Matrix::new();
data.push(0, 0, 5.0);
data.push(0, 2, 3.5);
data.push(1, 1, 4.0);
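To make the triple format concrete, here is a minimal standard-library sketch that expands the same three entries into a dense grid. The `to_dense` helper is hypothetical and not part of libmf; it only illustrates that every cell without a triple is treated as unobserved rather than zero.

```rust
/// Expand (row_index, column_index, value) triples into a dense grid.
/// `None` marks a cell that has no observed value.
/// Hypothetical helper for illustration; not part of the libmf crate.
fn to_dense(triples: &[(usize, usize, f32)]) -> Vec<Vec<Option<f32>>> {
    // Infer the dimensions from the largest indices seen.
    let rows = triples.iter().map(|t| t.0 + 1).max().unwrap_or(0);
    let cols = triples.iter().map(|t| t.1 + 1).max().unwrap_or(0);
    let mut grid = vec![vec![None; cols]; rows];
    for &(r, c, v) in triples {
        grid[r][c] = Some(v);
    }
    grid
}

fn main() {
    // The same entries pushed in the Getting Started snippet.
    let triples: [(usize, usize, f32); 3] = [(0, 0, 5.0), (0, 2, 3.5), (1, 1, 4.0)];
    for row in to_dense(&triples) {
        println!("{:?}", row);
    }
}
```

The three triples describe a 2 x 3 matrix in which only three of the six cells are known; factorization is fitted on the known cells only.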
Fit a model
let model = libmf::Model::params().fit(&data).unwrap();
Make predictions
model.predict(row_index, column_index);
Get the latent factors (these approximate the training matrix)
model.p(row_index);
model.q(column_index);
// or
model.p_iter();
model.q_iter();
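As a rough illustration of how the latent factors approximate the training matrix, the sketch below computes the inner product of one row factor vector and one column factor vector. It assumes the approximation of an entry is the plain dot product of the two vectors; the `dot` helper and the factor values are made up for the example and are not libmf output.

```rust
/// Inner product of a row factor vector and a column factor vector.
/// Hypothetical helper for illustration; not part of the libmf crate.
fn dot(p: &[f32], q: &[f32]) -> f32 {
    p.iter().zip(q.iter()).map(|(a, b)| a * b).sum()
}

fn main() {
    // Made-up factor vectors with 2 latent factors each.
    let p_row = [1.0_f32, 2.0];
    let q_col = [0.5_f32, 1.5];
    println!("approximate entry: {}", dot(&p_row, &q_col));
}
```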
Get the bias (average of all elements in the training matrix)
model.bias();
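For intuition, the average described above can be sketched in plain Rust over the triples from the Getting Started data. `mean_value` is a hypothetical helper, not a libmf function, and the value a fitted model reports may differ slightly because of floating-point accumulation.

```rust
/// Average of the values in a list of (row, column, value) triples.
/// Hypothetical helper for illustration; not part of the libmf crate.
fn mean_value(triples: &[(i32, i32, f32)]) -> f32 {
    if triples.is_empty() {
        return 0.0;
    }
    let sum: f32 = triples.iter().map(|t| t.2).sum();
    sum / triples.len() as f32
}

fn main() {
    let triples = [(0, 0, 5.0_f32), (0, 2, 3.5), (1, 1, 4.0)];
    println!("mean of training values: {}", mean_value(&triples));
}
```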
Save the model to a file
model.save("model.txt").unwrap();
Load a model from a file
let model = libmf::Model::load("model.txt").unwrap();
Pass a validation set
let model = libmf::Model::params().fit_eval(&train_set, &eval_set).unwrap();
§Cross-Validation
Perform cross-validation
let avg_error = libmf::Model::params().cv(&data, 5).unwrap();
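The general idea behind k-fold cross-validation can be sketched as follows: the entries are split into k folds, each fold is held out once for evaluation while the rest are used for training, and the errors are averaged. The round-robin assignment in `k_folds` is an assumption made for simplicity; how LIBMF assigns entries to folds internally may differ.

```rust
/// Split entry indices 0..n into k folds by round-robin assignment.
/// Hypothetical helper for illustration; not part of the libmf crate.
fn k_folds(n: usize, k: usize) -> Vec<Vec<usize>> {
    assert!(k > 0, "k must be at least 1");
    let mut folds = vec![Vec::new(); k];
    for i in 0..n {
        folds[i % k].push(i);
    }
    folds
}

fn main() {
    let folds = k_folds(10, 5);
    for (held_out, fold) in folds.iter().enumerate() {
        // Train on every fold except `held_out`, then evaluate on `fold`.
        let train: Vec<usize> = folds
            .iter()
            .enumerate()
            .filter(|(j, _)| *j != held_out)
            .flat_map(|(_, f)| f.iter().copied())
            .collect();
        println!(
            "fold {}: validate on {:?}, train on {} entries",
            held_out,
            fold,
            train.len()
        );
    }
}
```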
§Parameters
Set parameters - default values below
libmf::Model::params()
    .loss(libmf::Loss::RealL2) // loss function
    .factors(8)                // number of latent factors
    .threads(12)               // number of threads
    .bins(25)                  // number of bins
    .iterations(20)            // number of iterations
    .lambda_p1(0.0)            // L1-regularization parameter for P
    .lambda_p2(0.1)            // L2-regularization parameter for P
    .lambda_q1(0.0)            // L1-regularization parameter for Q
    .lambda_q2(0.1)            // L2-regularization parameter for Q
    .learning_rate(0.1)        // learning rate
    .alpha(1.0)                // importance of negative entries
    .c(0.0001)                 // desired value of negative entries
    .nmf(false)                // perform non-negative MF (NMF)
    .quiet(false);             // no outputs to stdout
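To give a feel for what the learning rate and the L2-regularization parameters control, here is a textbook stochastic gradient step for squared error on a single observed entry. This is a simplified sketch, not LIBMF's actual update rule (which may use a different learning-rate schedule); `sgd_step` is hypothetical, constant factors are absorbed into the learning rate, and L1 regularization is left out.

```rust
/// One plain SGD step for squared error with L2 regularization on a single
/// observed entry `r`. Returns the prediction error measured before the update.
/// Hypothetical helper for illustration; not part of the libmf crate.
fn sgd_step(
    p: &mut [f32],
    q: &mut [f32],
    r: f32,
    learning_rate: f32,
    lambda_p2: f32,
    lambda_q2: f32,
) -> f32 {
    let pred: f32 = p.iter().zip(q.iter()).map(|(a, b)| a * b).sum();
    let err = r - pred;
    for (pi, qi) in p.iter_mut().zip(q.iter_mut()) {
        // Use the old values of both factors so the two updates are symmetric.
        let (p_old, q_old) = (*pi, *qi);
        *pi += learning_rate * (err * q_old - lambda_p2 * p_old);
        *qi += learning_rate * (err * p_old - lambda_q2 * q_old);
    }
    err
}

fn main() {
    // Made-up starting factors with 2 latent factors each.
    let mut p = [0.5_f32, 0.5];
    let mut q = [0.5_f32, 0.5];
    for step in 0..5 {
        let err = sgd_step(&mut p, &mut q, 4.0, 0.1, 0.1, 0.1);
        println!("step {}: error before update = {}", step, err);
    }
}
```

A larger learning rate moves the factors further per step; larger lambda values pull the factors toward zero, which limits overfitting at the cost of a looser fit.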
§Loss Functions
For real-valued matrix factorization
- Loss::RealL2 - squared error (L2-norm)
- Loss::RealL1 - absolute error (L1-norm)
- Loss::RealKL - generalized KL-divergence
For binary matrix factorization
- Loss::BinaryLog - logarithmic error
- Loss::BinaryL2 - squared hinge loss
- Loss::BinaryL1 - hinge loss
For one-class matrix factorization
- Loss::OneClassRow - row-oriented pair-wise logarithmic loss
- Loss::OneClassCol - column-oriented pair-wise logarithmic loss
- Loss::OneClassL2 - squared error (L2-norm)
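For reference, the per-entry forms of several of these losses can be sketched with their textbook definitions. The functions below are illustrative only: they assume binary labels are encoded as -1.0 or +1.0, they omit the KL and one-class losses, and LIBMF's exact scaling may differ.

```rust
// Textbook per-entry loss definitions; hypothetical helpers, not libmf API.

/// Squared error, the usual form behind an L2 loss on real values.
fn squared_error(r: f32, pred: f32) -> f32 {
    (r - pred).powi(2)
}

/// Absolute error, the usual form behind an L1 loss on real values.
fn absolute_error(r: f32, pred: f32) -> f32 {
    (r - pred).abs()
}

/// Logarithmic (logistic) loss for a label `y` in {-1.0, +1.0}.
fn log_loss(y: f32, pred: f32) -> f32 {
    (1.0 + (-y * pred).exp()).ln()
}

/// Hinge loss for a label `y` in {-1.0, +1.0}.
fn hinge(y: f32, pred: f32) -> f32 {
    (1.0 - y * pred).max(0.0)
}

/// Squared hinge loss for a label `y` in {-1.0, +1.0}.
fn squared_hinge(y: f32, pred: f32) -> f32 {
    hinge(y, pred).powi(2)
}

fn main() {
    println!("squared error:  {}", squared_error(5.0, 3.0));
    println!("absolute error: {}", absolute_error(5.0, 3.0));
    println!("log loss:       {}", log_loss(1.0, 0.0));
    println!("hinge:          {}", hinge(1.0, 0.5));
    println!("squared hinge:  {}", squared_hinge(1.0, 0.5));
}
```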
§Metrics
Calculate RMSE (for real-valued MF)
model.rmse(&data);
Calculate MAE (for real-valued MF)
model.mae(&data);
Calculate generalized KL-divergence (for non-negative real-valued MF)
model.gkl(&data);
Calculate logarithmic loss (for binary MF)
model.logloss(&data);
Calculate accuracy (for binary MF)
model.accuracy(&data);
Calculate MPR (for one-class MF)
model.mpr(&data, transpose);
Calculate AUC (for one-class MF)
model.auc(&data, transpose);
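As a reminder of what the two real-valued metrics measure, here is a standard-library sketch of RMSE and MAE over paired actual and predicted values. The `rmse` and `mae` functions below are standalone illustrations using the usual definitions, not the libmf methods of the same name, and they assume both slices are non-empty and of equal length.

```rust
/// Root mean squared error over paired values (standard definition).
fn rmse(actual: &[f32], predicted: &[f32]) -> f32 {
    let n = actual.len() as f32;
    let sum_sq: f32 = actual
        .iter()
        .zip(predicted)
        .map(|(a, p)| (a - p).powi(2))
        .sum();
    (sum_sq / n).sqrt()
}

/// Mean absolute error over paired values (standard definition).
fn mae(actual: &[f32], predicted: &[f32]) -> f32 {
    let n = actual.len() as f32;
    let sum_abs: f32 = actual
        .iter()
        .zip(predicted)
        .map(|(a, p)| (a - p).abs())
        .sum();
    sum_abs / n
}

fn main() {
    // Made-up values for illustration.
    let actual = [1.0_f32, 2.0, 5.0];
    let predicted = [1.0_f32, 2.0, 3.0];
    println!("RMSE: {}", rmse(&actual, &predicted));
    println!("MAE:  {}", mae(&actual, &predicted));
}
```

RMSE penalizes large individual errors more heavily than MAE, so the two can rank models differently.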
§Example
Download the MovieLens 100K dataset.
Add these lines to your application’s Cargo.toml under [dependencies]:
csv = "1"
serde = { version = "1", features = ["derive"] }
And use:
use csv::ReaderBuilder;
use serde::Deserialize;
use std::fs::File;

#[derive(Debug, Deserialize)]
struct Row {
    user_id: i32,
    item_id: i32,
    rating: f32,
    time: i32,
}

fn main() {
    let mut train_set = libmf::Matrix::new();
    let mut valid_set = libmf::Matrix::new();

    let file = File::open("u.data").unwrap();
    let mut rdr = ReaderBuilder::new()
        .has_headers(false)
        .delimiter(b'\t')
        .from_reader(file);
    for (i, record) in rdr.records().enumerate() {
        let row: Row = record.unwrap().deserialize(None).unwrap();
        // The first 80,000 records go to training, the rest to validation
        let matrix = if i < 80000 { &mut train_set } else { &mut valid_set };
        matrix.push(row.user_id, row.item_id, row.rating);
    }

    let model = libmf::Model::params().fit_eval(&train_set, &valid_set).unwrap();
    println!("RMSE: {:?}", model.rmse(&valid_set));
}
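If pulling in csv and serde is not desirable, the tab-separated records can also be parsed with the standard library alone. The sketch below assumes each line holds user id, item id, rating, and timestamp separated by tabs, matching the Row struct above; `parse_rating` is a hypothetical helper and the sample line is only meant to show that layout.

```rust
/// Parse one tab-separated line into (user_id, item_id, rating).
/// Returns `None` if a field is missing or malformed; the trailing
/// timestamp field is ignored.
/// Hypothetical helper for illustration; not part of the libmf crate.
fn parse_rating(line: &str) -> Option<(i32, i32, f32)> {
    let mut fields = line.split('\t');
    let user_id: i32 = fields.next()?.trim().parse().ok()?;
    let item_id: i32 = fields.next()?.trim().parse().ok()?;
    let rating: f32 = fields.next()?.trim().parse().ok()?;
    Some((user_id, item_id, rating))
}

fn main() {
    // Sample line in the assumed layout: user, item, rating, timestamp.
    let line = "196\t242\t3\t881250949";
    match parse_rating(line) {
        Some((user_id, item_id, rating)) => {
            println!("user {} rated item {} with {}", user_id, item_id, rating)
        }
        None => println!("skipping malformed line"),
    }
}
```

Each parsed triple could then be pushed into a matrix in the same way as in the example above.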
§Reference
Specify the initial capacity for a matrix
let mut data = libmf::Matrix::with_capacity(3);
§Resources
§History
View the changelog
§Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone --recursive https://github.com/ankane/libmf-rust.git
cd libmf-rust
cargo test
§Structs
- Matrix - A matrix.
- Model - A model.
- Params - A set of parameters.
§Enums
- Error - An error.
- Loss - Loss functions.