Support Vector Machines
Support Vector Machines are a major branch of machine learning models and offer classification or regression analysis of labeled datasets. They seek a discriminant that separates the data in an optimal way, e.g. one that has the fewest misclassifications and maximizes the margin between the positive and negative classes. A support vector contributes to the discriminant and is therefore important for the classification/regression task. The balance between the number of support vectors and model performance can be controlled with hyperparameters.
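To make the idea of a discriminant concrete, here is a minimal from-scratch sketch (not the linfa API; the weights and bias are assumed, hypothetical values): a linear discriminant f(x) = ⟨w, x⟩ + b classifies a point by the sign of f(x), and points with f(x) close to ±1 lie on the margin.

```rust
fn main() {
    // assumed, hypothetical learned parameters of a linear discriminant
    let (w, b) = (vec![1.0, -1.0], 0.5);
    let x = vec![2.0, 0.5];

    // f(x) = <w, x> + b
    let f: f64 = w.iter().zip(&x).map(|(wi, xi)| wi * xi).sum::<f64>() + b;
    let label = if f >= 0.0 { "positive" } else { "negative" };
    println!("f(x) = {f}, class = {label}");
}
```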
For supervised classification tasks either the C or the Nu value is used to control this balance. In fit_c the C value controls the penalty given to misclassifications and should lie in the interval (0, inf). In fit_nu the Nu value controls the number of support vectors and should lie in the interval (0, 1].
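The role of C can be illustrated with a small from-scratch sketch (not the linfa implementation; the weight vector and margins are assumed toy values): in the C-SVM objective 0.5·||w||² + C·Σᵢ max(0, 1 − yᵢ·f(xᵢ)), a larger C penalizes the same margin violations more heavily.

```rust
/// C-SVM primal objective for given margins m_i = y_i * f(x_i):
/// 0.5 * ||w||^2 + C * sum_i max(0, 1 - m_i)
fn objective(w: &[f64], c: f64, margins: &[f64]) -> f64 {
    let reg: f64 = 0.5 * w.iter().map(|wi| wi * wi).sum::<f64>();
    let hinge: f64 = margins.iter().map(|m| (1.0 - m).max(0.0)).sum();
    reg + c * hinge
}

fn main() {
    // assumed toy weight vector and per-sample margins y_i * f(x_i)
    let w = [1.0, 2.0];
    let margins = [1.5, 0.2, -0.3];

    // a larger C weights the misclassification penalty more strongly
    println!("C = 0.1:  {}", objective(&w, 0.1, &margins));
    println!("C = 10.0: {}", objective(&w, 10.0, &margins));
    assert!(objective(&w, 10.0, &margins) > objective(&w, 0.1, &margins));
}
```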
For supervised classification with just one class of data, a special classifier is available in fit_one_class. It also accepts a Nu value.
For support vector regression, two flavors are available. With fit_epsilon a regression task is learned while deviations larger than epsilon are penalized. In fit_nu the parameter epsilon is replaced with Nu, which again should lie in the interval (0, 1].
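The epsilon-insensitive loss used by epsilon-SVR can be sketched from scratch (this is illustrative, not the linfa implementation): deviations smaller than epsilon cost nothing, larger ones are penalized linearly.

```rust
/// Epsilon-insensitive loss: max(0, |pred - target| - eps)
fn eps_loss(pred: f64, target: f64, eps: f64) -> f64 {
    ((pred - target).abs() - eps).max(0.0)
}

fn main() {
    // a small deviation inside the epsilon tube costs nothing
    assert_eq!(eps_loss(1.05, 1.0, 0.1), 0.0);
    // a larger deviation is penalized linearly beyond eps
    println!("{}", eps_loss(1.5, 1.0, 0.1));
}
```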
Normally the resulting discriminant is linear, but with kernel methods non-linear relations between the input features can be learned in order to improve the performance of the model.
For example, to transform a dataset into a sparse RBF kernel with 10 non-zero distances, you can write:

```rust
use linfa_kernel::Kernel;

let dataset = ...;
let kernel = Kernel::gaussian_sparse(&dataset, 10);
```
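For intuition, a single Gaussian (RBF) kernel entry can be computed from scratch as k(a, b) = exp(−||a − b||² / (2σ²)); note the exact parameterization used by linfa_kernel may differ, so this is only a sketch.

```rust
/// Gaussian (RBF) kernel entry between two points.
fn rbf(a: &[f64], b: &[f64], sigma: f64) -> f64 {
    let sq_dist: f64 = a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum();
    (-sq_dist / (2.0 * sigma * sigma)).exp()
}

fn main() {
    // identical points have maximal similarity
    println!("{}", rbf(&[0.0, 0.0], &[0.0, 0.0], 1.0));
    // distant points have similarity near zero, which is why
    // sparsifying the kernel (keeping few non-zero distances) works well
    println!("{}", rbf(&[0.0, 0.0], &[3.0, 4.0], 1.0));
}
```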
This implementation uses Sequential Minimal Optimization (SMO), a widely used optimization technique for convex problems. It selects a pair of variables in each optimization step and updates them. Each step performs:
- Find a variable that violates the KKT conditions of the optimization problem
- Pick a second variable and create the pair (a1, a2)
- Optimize the pair (a1, a2)
After a sufficient number of such steps the solution converges to the optimum.
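The steps above can be sketched as a simplified SMO loop in plain Rust. This is an illustrative toy implementation with a linear kernel and a crude pseudo-random choice of the second variable, not the solver used by linfa; all names and the tiny dataset are assumptions.

```rust
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Decision function f(x) = sum_i alpha_i * y_i * <x_i, x> + b
fn decision(x: &[f64], xs: &[Vec<f64>], ys: &[f64], alphas: &[f64], b: f64) -> f64 {
    xs.iter().zip(ys).zip(alphas)
        .map(|((xi, yi), ai)| ai * yi * dot(xi, x))
        .sum::<f64>() + b
}

/// Simplified SMO: repeatedly find a KKT-violating variable, pair it with a
/// second one, and optimize the pair analytically.
fn smo_train(xs: &[Vec<f64>], ys: &[f64], c: f64, tol: f64, max_passes: usize) -> (Vec<f64>, f64) {
    let n = xs.len();
    let (mut alphas, mut b) = (vec![0.0; n], 0.0);
    let (mut passes, mut iters) = (0, 0);
    let mut rng: u64 = 42; // tiny deterministic pseudo-random state
    while passes < max_passes && iters < 10_000 {
        iters += 1;
        let mut changed = 0;
        for i in 0..n {
            // error of the current model on sample i: E_i = f(x_i) - y_i
            let e_i = decision(&xs[i], xs, ys, &alphas, b) - ys[i];
            // step 1: does variable i violate the KKT conditions?
            if (ys[i] * e_i < -tol && alphas[i] < c) || (ys[i] * e_i > tol && alphas[i] > 0.0) {
                // step 2: pick a second variable j != i to form the pair
                rng = rng.wrapping_mul(6364136223846793005).wrapping_add(1);
                let mut j = (rng % n as u64) as usize;
                if j == i { j = (j + 1) % n; }
                let e_j = decision(&xs[j], xs, ys, &alphas, b) - ys[j];
                let (a_i_old, a_j_old) = (alphas[i], alphas[j]);
                // feasible box for alpha_j keeping the sum constraint
                let (lo, hi) = if ys[i] != ys[j] {
                    ((a_j_old - a_i_old).max(0.0), (c + a_j_old - a_i_old).min(c))
                } else {
                    ((a_i_old + a_j_old - c).max(0.0), (a_i_old + a_j_old).min(c))
                };
                if lo >= hi { continue; }
                let eta = 2.0 * dot(&xs[i], &xs[j]) - dot(&xs[i], &xs[i]) - dot(&xs[j], &xs[j]);
                if eta >= 0.0 { continue; }
                // step 3: optimize the pair (alpha_i, alpha_j) analytically
                alphas[j] = (a_j_old - ys[j] * (e_i - e_j) / eta).clamp(lo, hi);
                if (alphas[j] - a_j_old).abs() < 1e-5 { alphas[j] = a_j_old; continue; }
                alphas[i] = a_i_old + ys[i] * ys[j] * (a_j_old - alphas[j]);
                // update the bias from whichever variable stayed strictly inside (0, C)
                let b1 = b - e_i - ys[i] * (alphas[i] - a_i_old) * dot(&xs[i], &xs[i])
                    - ys[j] * (alphas[j] - a_j_old) * dot(&xs[i], &xs[j]);
                let b2 = b - e_j - ys[i] * (alphas[i] - a_i_old) * dot(&xs[i], &xs[j])
                    - ys[j] * (alphas[j] - a_j_old) * dot(&xs[j], &xs[j]);
                b = if alphas[i] > 0.0 && alphas[i] < c { b1 }
                    else if alphas[j] > 0.0 && alphas[j] < c { b2 }
                    else { (b1 + b2) / 2.0 };
                changed += 1;
            }
        }
        if changed == 0 { passes += 1 } else { passes = 0 }
    }
    (alphas, b)
}

fn main() {
    // linearly separable toy data (an assumption for illustration)
    let xs = vec![vec![2.0, 2.0], vec![3.0, 3.0], vec![-2.0, -2.0], vec![-3.0, -1.0]];
    let ys = vec![1.0, 1.0, -1.0, -1.0];
    let (alphas, b) = smo_train(&xs, &ys, 1.0, 1e-3, 10);
    for (x, y) in xs.iter().zip(&ys) {
        // all training points should end up on the correct side
        assert!(decision(x, &xs, &ys, &alphas, b) * y > 0.0);
    }
    println!("alphas = {:?}, b = {:.3}", alphas, b);
}
```

Samples whose alpha stays at zero do not contribute to the decision function; the ones with non-zero alpha are exactly the support vectors the text refers to.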
The wine quality data consists of 11 features, like "acid", "sugar" and "sulfur dioxide", and rates the quality from worst (3) to best (8). These are unified into good (7-8) and bad (3-6) to obtain a binary classification task.
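The binarization described above can be sketched as follows (the function name is an assumption, not part of the example code):

```rust
/// Map a wine quality score (3..=8) to a binary label:
/// 7-8 -> "good", 3-6 -> "bad".
fn binarize(quality: u8) -> &'static str {
    if quality >= 7 { "good" } else { "bad" }
}

fn main() {
    let qualities = [3u8, 5, 6, 7, 8];
    let labels: Vec<_> = qualities.iter().map(|&q| binarize(q)).collect();
    println!("{:?}", labels);
}
```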
With an RBF kernel and C-Support Vector Classification, an accuracy of 98.8% is reached within 2911 iterations using 1248 support vectors.
```text
Fit SVM classifier with #1439 training points
Exited after 2911 iterations with obj = -248.51510322468084 and 1248 support vectors

classes | bad  | good
bad     | 1228 | 17
good    | 0    | 194

accuracy 0.98818624, MCC 0.9523008
```
The result of the SMO solver indicates why it exited: either a convergence threshold was reached or the maximum number of iterations was exceeded.