Module ml

§Machine Learning

The Machine Learning Library (MLL) is a set of classes and functions for statistical classification, regression, and clustering of data.

Most of the classification and regression algorithms are implemented as C++ classes. Because the algorithms support different feature sets (such as the ability to handle missing measurements or categorical input variables), the classes share only a small common interface. That interface is defined by the class cv::ml::StatModel, from which all the other ML classes are derived.

See detailed overview here: [ml_intro].

Modules§

prelude

Structs§

ANN_MLP
Artificial Neural Networks - Multi-Layer Perceptrons.
Boost
Boosted tree classifier derived from DTrees
DTrees
The class represents a single decision tree or a collection of decision trees.
DTrees_Node
The class represents a decision tree node.
DTrees_Split
The class represents split in a decision tree.
EM
The class implements the Expectation Maximization algorithm.
KNearest
The class implements K-Nearest Neighbors model
LogisticRegression
Implements Logistic Regression classifier.
NormalBayesClassifier
Bayes classifier for normally distributed data.
ParamGrid
The structure represents the logarithmic grid range of statmodel parameters.
RTrees
The class implements the random forest predictor.
SVM
Support Vector Machines.
SVMSGD
Stochastic Gradient Descent SVM classifier
SVM_Kernel
StatModel
Base class for statistical models in OpenCV ML.
TrainData
Class encapsulating training data.

Enums§

ANN_MLP_ActivationFunctions
possible activation functions
ANN_MLP_TrainFlags
Train options
ANN_MLP_TrainingMethods
Available training methods
Boost_Types
Boosting type. Gentle AdaBoost and Real AdaBoost are often the preferable choices.
DTrees_Flags
Predict options
EM_Types
Type of covariation matrices
ErrorTypes
Error types
KNearest_Types
Implementations of KNearest algorithm
LogisticRegression_Methods
Training methods
LogisticRegression_RegKinds
Regularization kinds
SVMSGD_MarginType
Margin type.
SVMSGD_SvmsgdType
SVMSGD type. ASGD is often the preferable choice.
SVM_KernelTypes
SVM kernel type
SVM_ParamTypes
SVM params type
SVM_Types
SVM type
SampleTypes
Sample types
StatModel_Flags
Predict options
VariableTypes
Variable types
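
The activation functions selected through ANN_MLP_ActivationFunctions can be illustrated with a small standalone sketch. It uses only the Rust standard library and does not call the opencv crate. The `Activation` enum and the `activate` function are hypothetical names introduced here for illustration, and the formulas follow the upstream OpenCV documentation as far as we can tell; treat it as a rough guide to the shapes of the functions rather than as the library's implementation.

```rust
/// Hypothetical enum mirroring the ANN_MLP_* activation constants.
/// It is NOT part of the opencv crate; it exists only for this sketch.
#[derive(Clone, Copy, Debug)]
enum Activation {
    Identity,
    SigmoidSym,
    Gaussian,
    Relu,
    LeakyRelu,
}

/// Evaluates one activation function at `x`.
/// `alpha` and `beta` are the free parameters of the formulas;
/// variants that do not use them simply ignore them.
fn activate(kind: Activation, x: f64, alpha: f64, beta: f64) -> f64 {
    match kind {
        // f(x) = x
        Activation::Identity => x,
        // f(x) = beta * (1 - e^(-alpha x)) / (1 + e^(-alpha x))
        Activation::SigmoidSym => {
            let e = (-alpha * x).exp();
            beta * (1.0 - e) / (1.0 + e)
        }
        // f(x) = beta * e^(-alpha x^2)
        Activation::Gaussian => beta * (-alpha * x * x).exp(),
        // f(x) = max(0, x)
        Activation::Relu => x.max(0.0),
        // f(x) = x for x > 0, alpha * x otherwise
        Activation::LeakyRelu => {
            if x > 0.0 {
                x
            } else {
                alpha * x
            }
        }
    }
}

fn main() {
    let kinds = [
        Activation::Identity,
        Activation::SigmoidSym,
        Activation::Gaussian,
        Activation::Relu,
        Activation::LeakyRelu,
    ];
    // Print each function at a negative and a positive input.
    for kind in kinds {
        let lo = activate(kind, -1.0, 0.1, 1.0);
        let hi = activate(kind, 1.0, 0.1, 1.0);
        println!("{:?}: f(-1) = {:.4}, f(1) = {:.4}", kind, lo, hi);
    }
}
```

With alpha = 1 and beta = 1 the symmetrical sigmoid coincides with tanh(x/2), which is a convenient sanity check on the formula.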

Constants§

ANN_MLP_ANNEAL
The simulated annealing algorithm. See Kirkpatrick83 for details.
ANN_MLP_BACKPROP
The back-propagation algorithm.
ANN_MLP_GAUSSIAN
Gaussian function: $f(x) = \beta e^{-\alpha x^2}$
ANN_MLP_IDENTITY
Identity function: $f(x) = x$
ANN_MLP_LEAKYRELU
Leaky ReLU function: $f(x) = x$ for $x > 0$ and $f(x) = \alpha x$ for $x \le 0$
ANN_MLP_NO_INPUT_SCALE
Do not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation equal to 1. If the network is expected to be updated frequently, the new training data may differ considerably from the original data. In this case, you should take care of proper normalization yourself.
ANN_MLP_NO_OUTPUT_SCALE
Do not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently, transforming it to a range that depends on the activation function used.
ANN_MLP_RELU
ReLU function: $f(x) = \max(0, x)$
ANN_MLP_RPROP
The RPROP algorithm. See RPROP93 for details.
ANN_MLP_SIGMOID_SYM
Symmetrical sigmoid: $f(x) = \beta (1 - e^{-\alpha x}) / (1 + e^{-\alpha x})$
ANN_MLP_UPDATE_WEIGHTS
Update the network weights, rather than compute them from scratch. In the latter case the weights are initialized using the Nguyen-Widrow algorithm.
Boost_DISCRETE
Discrete AdaBoost.
Boost_GENTLE
Gentle AdaBoost. It puts less weight on outlier data points and for that reason is often good with regression data.
Boost_LOGIT
LogitBoost. It can produce good regression fits.
Boost_REAL
Real AdaBoost. It is a technique that utilizes confidence-rated predictions and works well with categorical data.
COL_SAMPLE
each training sample occupies a column of samples
DTrees_PREDICT_AUTO
DTrees_PREDICT_MASK
DTrees_PREDICT_MAX_VOTE
DTrees_PREDICT_SUM
EM_COV_MAT_DEFAULT
A symmetric positive-definite matrix. The number of free parameters in each matrix is about $d^2/2$. This option is not recommended unless there is a fairly accurate initial estimate of the parameters and/or a huge number of training samples.
EM_COV_MAT_DIAGONAL
A diagonal matrix with positive diagonal elements. The number of free parameters is d for each matrix. This is the most commonly used option and yields good estimation results.
EM_COV_MAT_GENERIC
A symmetric positive-definite matrix. The number of free parameters in each matrix is about $d^2/2$. This option is not recommended unless there is a fairly accurate initial estimate of the parameters and/or a huge number of training samples.
EM_COV_MAT_SPHERICAL
A scaled identity matrix $\mu_k I$. The only parameter to be estimated for each matrix is $\mu_k$. The option may be used in special cases, when the constraint is relevant, or as a first step in the optimization (for example, when the data is preprocessed with PCA). The results of such a preliminary estimation may be passed again to the optimization procedure, this time with covMatType=EM::COV_MAT_DIAGONAL.
EM_DEFAULT_MAX_ITERS
EM_DEFAULT_NCLUSTERS
EM_START_AUTO_STEP
EM_START_E_STEP
EM_START_M_STEP
KNearest_BRUTE_FORCE
KNearest_KDTREE
LogisticRegression_BATCH
LogisticRegression_MINI_BATCH
Set MiniBatchSize to a positive integer when using this method.
LogisticRegression_REG_DISABLE
Regularization disabled
LogisticRegression_REG_L1
L1 norm
LogisticRegression_REG_L2
L2 norm
ROW_SAMPLE
each training sample is a row of samples
SVMSGD_ASGD
Average Stochastic Gradient Descent
SVMSGD_HARD_MARGIN
More accurate for the case of linearly separable sets.
SVMSGD_SGD
Stochastic Gradient Descent
SVMSGD_SOFT_MARGIN
General case: suitable for non-linearly separable sets and tolerates outliers.
SVM_C
SVM_CHI2
Exponential Chi2 kernel, similar to the RBF kernel: $K(x_i, x_j) = e^{-\gamma \chi^2(x_i, x_j)}$, $\chi^2(x_i, x_j) = (x_i - x_j)^2 / (x_i + x_j)$, $\gamma > 0$.
SVM_COEF
SVM_CUSTOM
Returned by SVM::getKernelType when a custom kernel has been set
SVM_C_SVC
C-Support Vector Classification. n-class classification (n $\geq$ 2), allows imperfect separation of classes with penalty multiplier C for outliers.
SVM_DEGREE
SVM_EPS_SVR
$\epsilon$-Support Vector Regression. The distance between feature vectors from the training set and the fitting hyper-plane must be less than p. For outliers the penalty multiplier C is used.
SVM_GAMMA
SVM_INTER
Histogram intersection kernel. A fast kernel. $K(x_i, x_j) = \min(x_i, x_j)$.
SVM_LINEAR
Linear kernel. No mapping is done, linear discrimination (or regression) is done in the original feature space. It is the fastest option. $K(x_i, x_j) = x_i^T x_j$.
SVM_NU
SVM_NU_SVC
$\nu$-Support Vector Classification. n-class classification with possible imperfect separation. Parameter $\nu$ (in the range 0..1, the larger the value, the smoother the decision boundary) is used instead of C.
SVM_NU_SVR
$\nu$-Support Vector Regression. $\nu$ is used instead of p. See LibSVM for details.
SVM_ONE_CLASS
Distribution Estimation (One-class SVM). All the training data are from the same class; the SVM builds a boundary that separates the class from the rest of the feature space.
SVM_P
SVM_POLY
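Polynomial kernel: $K(x_i, x_j) = (\gamma x_i^T x_j + coef_0)^{degree}$, $\gamma > 0$.
SVM_RBF
Radial basis function (RBF), a good choice in most cases. $K(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}$, $\gamma > 0$.
SVM_SIGMOID
Sigmoid kernel: $K(x_i, x_j) = \tanh(\gamma x_i^T x_j + coef_0)$.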
StatModel_COMPRESSED_INPUT
StatModel_PREPROCESSED_INPUT
StatModel_RAW_OUTPUT
makes the method return the raw results (the sum), not the class label
StatModel_UPDATE_MODEL
TEST_ERROR
TRAIN_ERROR
VAR_CATEGORICAL
categorical variables
VAR_NUMERICAL
same as VAR_ORDERED
VAR_ORDERED
ordered variables
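
To make the SVM kernel constants above more concrete, the following standalone sketch evaluates each kernel on plain `f64` slices. It relies only on the Rust standard library and does not call the opencv crate; every function name (`linear`, `poly`, `rbf`, `sigmoid`, `chi2`, `inter`) is hypothetical. It assumes that the per-component expressions for the Chi2 and histogram intersection kernels are summed over all vector components, and that components with a zero denominator are skipped in the Chi2 kernel. Both are assumptions of this sketch, not statements about the library's code.

```rust
/// Dot product of two equally sized feature vectors.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// SVM_LINEAR: K(a, b) = a^T b
fn linear(a: &[f64], b: &[f64]) -> f64 {
    dot(a, b)
}

/// SVM_POLY: K(a, b) = (gamma * a^T b + coef0)^degree
fn poly(a: &[f64], b: &[f64], gamma: f64, coef0: f64, degree: f64) -> f64 {
    (gamma * dot(a, b) + coef0).powf(degree)
}

/// SVM_RBF: K(a, b) = exp(-gamma * ||a - b||^2)
fn rbf(a: &[f64], b: &[f64], gamma: f64) -> f64 {
    let sq_dist: f64 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
    (-gamma * sq_dist).exp()
}

/// SVM_SIGMOID: K(a, b) = tanh(gamma * a^T b + coef0)
fn sigmoid(a: &[f64], b: &[f64], gamma: f64, coef0: f64) -> f64 {
    (gamma * dot(a, b) + coef0).tanh()
}

/// SVM_CHI2: K(a, b) = exp(-gamma * chi2(a, b)), where chi2 is taken here
/// as the sum of (a_k - b_k)^2 / (a_k + b_k) over components (assumption).
fn chi2(a: &[f64], b: &[f64], gamma: f64) -> f64 {
    let c: f64 = a
        .iter()
        .zip(b)
        .filter(|(x, y)| *x + *y != 0.0) // skip zero denominators (assumption)
        .map(|(x, y)| (x - y) * (x - y) / (x + y))
        .sum();
    (-gamma * c).exp()
}

/// SVM_INTER: sum of component-wise minima (summation is an assumption).
fn inter(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x.min(*y)).sum()
}

fn main() {
    let a = [1.0, 2.0];
    let b = [3.0, 4.0];
    println!("linear  = {}", linear(&a, &b));
    println!("poly    = {}", poly(&a, &b, 1.0, 1.0, 2.0));
    println!("rbf     = {}", rbf(&a, &b, 0.5));
    println!("sigmoid = {}", sigmoid(&a, &b, 0.1, 0.0));
    println!("chi2    = {}", chi2(&a, &b, 0.5));
    println!("inter   = {}", inter(&a, &b));
}
```

The RBF and Chi2 kernels both return 1 when the two vectors are identical and decay towards 0 as they move apart, which is why the Chi2 kernel is described as similar to RBF.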

Traits§

ANN_MLPTrait
Mutable methods for crate::ml::ANN_MLP
ANN_MLPTraitConst
Constant methods for crate::ml::ANN_MLP
BoostTrait
Mutable methods for crate::ml::Boost
BoostTraitConst
Constant methods for crate::ml::Boost
DTreesTrait
Mutable methods for crate::ml::DTrees
DTreesTraitConst
Constant methods for crate::ml::DTrees
DTrees_NodeTrait
Mutable methods for crate::ml::DTrees_Node
DTrees_NodeTraitConst
Constant methods for crate::ml::DTrees_Node
DTrees_SplitTrait
Mutable methods for crate::ml::DTrees_Split
DTrees_SplitTraitConst
Constant methods for crate::ml::DTrees_Split
EMTrait
Mutable methods for crate::ml::EM
EMTraitConst
Constant methods for crate::ml::EM
KNearestTrait
Mutable methods for crate::ml::KNearest
KNearestTraitConst
Constant methods for crate::ml::KNearest
LogisticRegressionTrait
Mutable methods for crate::ml::LogisticRegression
LogisticRegressionTraitConst
Constant methods for crate::ml::LogisticRegression
NormalBayesClassifierTrait
Mutable methods for crate::ml::NormalBayesClassifier
NormalBayesClassifierTraitConst
Constant methods for crate::ml::NormalBayesClassifier
ParamGridTrait
Mutable methods for crate::ml::ParamGrid
ParamGridTraitConst
Constant methods for crate::ml::ParamGrid
RTreesTrait
Mutable methods for crate::ml::RTrees
RTreesTraitConst
Constant methods for crate::ml::RTrees
SVMSGDTrait
Mutable methods for crate::ml::SVMSGD
SVMSGDTraitConst
Constant methods for crate::ml::SVMSGD
SVMTrait
Mutable methods for crate::ml::SVM
SVMTraitConst
Constant methods for crate::ml::SVM
SVM_KernelTrait
Mutable methods for crate::ml::SVM_Kernel
SVM_KernelTraitConst
Constant methods for crate::ml::SVM_Kernel
StatModelTrait
Mutable methods for crate::ml::StatModel
StatModelTraitConst
Constant methods for crate::ml::StatModel
TrainDataTrait
Mutable methods for crate::ml::TrainData
TrainDataTraitConst
Constant methods for crate::ml::TrainData

Functions§

create_concentric_spheres_test_set
Creates test set
rand_mv_normal
Generates sample from multivariate normal distribution

Type Aliases§

ANN_MLP_ANNEAL