Module opencv::ml


Machine Learning

The Machine Learning Library (MLL) is a set of classes and functions for statistical classification, regression, and clustering of data.

Most of the classification and regression algorithms are implemented as C++ classes. As the algorithms have different sets of features (like an ability to handle missing measurements or categorical input variables), there is a little common ground between the classes. This common ground is defined by the class cv::ml::StatModel that all the other ML classes are derived from.

See detailed overview here: [ml_intro].
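Because every model derives from StatModel, the training and prediction workflow is the same across the library: build the sample and response matrices, configure a model, train it, then predict. The sketch below is a minimal, hedged example using the Rust bindings; constant and method names follow the opencv crate's generated API, and the exact name of the training overload (shown here as `train` taking samples, layout and responses) can differ between crate versions.

```rust
use opencv::core::Mat;
use opencv::ml::{self, SVM};
use opencv::prelude::*; // brings SVMTrait, StatModelTrait and friends into scope

fn main() -> opencv::Result<()> {
    // Four 2-D samples, one per row, with an integer class label each.
    let samples = Mat::from_slice_2d(&[
        [0.0f32, 0.0],
        [0.1, 0.1],
        [0.9, 0.9],
        [1.0, 1.0],
    ])?;
    let responses = Mat::from_slice_2d(&[[0i32], [0], [1], [1]])?;

    // Every StatModel-derived class is configured and trained the same way.
    let mut svm = SVM::create()?;
    svm.set_type(ml::SVM_C_SVC)?;
    svm.set_kernel(ml::SVM_LINEAR)?;
    // Samples/layout/responses training overload; its exact name may differ
    // between opencv crate versions (e.g. a suffixed variant of `train`).
    svm.train(&samples, ml::ROW_SAMPLE, &responses)?;

    // Predict the class of an unseen sample; for a single sample the
    // returned float is the predicted label.
    let query = Mat::from_slice_2d(&[[0.95f32, 0.9]])?;
    let mut result = Mat::default();
    let label = svm.predict(&query, &mut result, 0)?;
    println!("predicted label: {label}");
    Ok(())
}
```

The same pattern applies to the other StatModel-derived classes listed below.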

Modules

Structs

  • Artificial Neural Networks - Multi-Layer Perceptrons.
  • Boosted tree classifier derived from DTrees
  • The class represents a single decision tree or a collection of decision trees.
  • The class represents a decision tree node.
  • The class represents a split in a decision tree.
  • The class implements the Expectation Maximization algorithm.
  • The class implements the K-Nearest Neighbors model.
  • Implements Logistic Regression classifier.
  • Bayes classifier for normally distributed data.
  • The structure represents the logarithmic grid range of StatModel parameters.
  • The class implements the random forest predictor.
  • Support Vector Machines.
  • Stochastic Gradient Descent SVM classifier.
  • Base class for statistical models in OpenCV ML.
  • Class encapsulating training data (see the sketch after this list).
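As noted for TrainData above, the samples, responses and the optional per-variable/per-sample metadata are usually bundled into a single TrainData object before being handed to a model. A rough sketch, assuming the full-argument TrainData::create generated by the bindings, with empty no_array() placeholders standing in for the optional index, weight and type arrays:

```rust
use opencv::core::{no_array, Mat};
use opencv::ml::{self, TrainData};
use opencv::prelude::*;

fn build_train_data() -> opencv::Result<()> {
    // One sample per row (ROW_SAMPLE); responses hold one label per sample.
    let samples = Mat::from_slice_2d(&[
        [0.0f32, 0.0],
        [0.1, 0.2],
        [0.9, 0.8],
        [1.0, 1.0],
    ])?;
    let responses = Mat::from_slice_2d(&[[0i32], [0], [1], [1]])?;

    // The optional variable/sample index, weight and type arrays are left empty.
    let mut data = TrainData::create(
        &samples,
        ml::ROW_SAMPLE,
        &responses,
        &no_array(), // var_idx
        &no_array(), // sample_idx
        &no_array(), // sample_weights
        &no_array(), // var_type
    )?;

    // Reserve 25% of the samples as a shuffled test set.
    data.set_train_test_split_ratio(0.25, true)?;
    Ok(())
}
```

The resulting Ptr<TrainData> can then be passed to the training method of any StatModel-derived class.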

Enums

Constants

  • The simulated annealing algorithm. See Kirkpatrick83 for details.
  • The back-propagation algorithm (see the configuration sketch after this list).
  • Gaussian function: f(x) = β·e^(−α·x²)
  • Identity function: f(x) = x
  • Leaky ReLU function: f(x) = x for x > 0 and f(x) = α·x for x ≤ 0
  • Do not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation equal to 1. If the network is assumed to be updated frequently, the new training data could be much different from the original one. In this case, you should take care of proper normalization.
  • Do not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently, transforming it to a certain range depending on the activation function used.
  • ReLU function: f(x) = max(0, x)
  • The RPROP algorithm. See RPROP93 for details.
  • Symmetrical sigmoid: f(x) = β·(1 − e^(−α·x)) / (1 + e^(−α·x))
  • Update the network weights, rather than compute them from scratch. In the latter case the weights are initialized using the Nguyen-Widrow algorithm.
  • Discrete AdaBoost.
  • Gentle AdaBoost. It puts less weight on outlier data points and for that reason is often good with regression data.
  • LogitBoost. It can produce good regression fits.
  • Real AdaBoost. It is a technique that utilizes confidence-rated predictions and works well with categorical data.
  • Each training sample occupies a column of the samples matrix.
  • A symmetric positive-definite matrix. The number of free parameters in each matrix is about d²/2. It is not recommended to use this option unless there is a fairly accurate initial estimation of the parameters and/or a huge number of training samples.
  • A diagonal matrix with positive diagonal elements. The number of free parameters is d for each matrix. This is the most commonly used option, yielding good estimation results.
  • A symmetric positive-definite matrix. The number of free parameters in each matrix is about d²/2. It is not recommended to use this option unless there is a fairly accurate initial estimation of the parameters and/or a huge number of training samples.
  • A scaled identity matrix μ_k·I. There is only one parameter, μ_k, to be estimated for each matrix. The option may be used in special cases, when the constraint is relevant, or as a first step in the optimization (for example, when the data is preprocessed with PCA). The results of such preliminary estimation may be passed again to the optimization procedure, this time with covMatType=EM::COV_MAT_DIAGONAL.
  • Set MiniBatchSize to a positive integer when using this method.
  • Regularization disabled
  • L1 norm
  • L2 norm
  • Each training sample is a row of the samples matrix.
  • Average Stochastic Gradient Descent
  • More accurate for the case of linearly separable sets.
  • Stochastic Gradient Descent
  • General case; suits non-linearly separable sets and allows outliers.
  • Exponential Chi2 kernel, similar to the RBF kernel: K(x_i, x_j) = e^(−γ·χ²(x_i, x_j)), where χ²(x_i, x_j) = (x_i − x_j)²/(x_i + x_j) and γ > 0.
  • Returned by SVM::getKernelType when a custom kernel has been set.
  • C-Support Vector Classification. n-class classification (n ≥ 2), allows imperfect separation of classes with penalty multiplier C for outliers.
  • ε-Support Vector Regression. The distance between feature vectors from the training set and the fitting hyper-plane must be less than p. For outliers the penalty multiplier C is used.
  • Histogram intersection kernel. A fast kernel. K(x_i, x_j) = min(x_i, x_j).
  • Linear kernel. No mapping is done, linear discrimination (or regression) is done in the original feature space. It is the fastest option. K(x_i, x_j) = x_i^T·x_j.
  • ν-Support Vector Classification. n-class classification with possible imperfect separation. Parameter ν (in the range 0..1, the larger the value, the smoother the decision boundary) is used instead of C.
  • ν-Support Vector Regression. ν is used instead of p. See LibSVM for details.
  • Distribution Estimation (One-class SVM). All the training data are from the same class, SVM builds a boundary that separates the class from the rest of the feature space.
  • Polynomial kernel: K(x_i, x_j) = (γ·x_i^T·x_j + coef0)^degree, γ > 0.
  • Radial basis function (RBF), a good choice in most cases. K(x_i, x_j) = e^(−γ·‖x_i − x_j‖²), γ > 0.
  • Sigmoid kernel: K(x_i, x_j) = tanh(γ·x_i^T·x_j + coef0).
  • Makes the method return the raw results (the sum), not the class label.
  • Categorical variables.
  • Same as VAR_ORDERED.
  • Ordered variables.
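Many of these constants are consumed as plain i32 values by the model setters. As an illustration, the sketch below configures an ANN_MLP with the symmetrical sigmoid activation and the back-propagation training method referenced in this list. The flattened constant names (ml::ANN_MLP_SIGMOID_SYM, ml::ANN_MLP_BACKPROP) and the setter signatures are assumptions based on the bindings' naming convention and may differ between crate versions.

```rust
use opencv::core::Mat;
use opencv::ml::{self, ANN_MLP};
use opencv::prelude::*;

fn build_mlp() -> opencv::Result<()> {
    let mut mlp = ANN_MLP::create()?;
    // Topology: 2 inputs, one hidden layer of 4 neurons, 1 output.
    let layer_sizes = Mat::from_slice_2d(&[[2i32, 4, 1]])?;
    mlp.set_layer_sizes(&layer_sizes)?;
    // Symmetrical sigmoid activation: f(x) = β·(1 − e^(−α·x)) / (1 + e^(−α·x)).
    // Passing 0.0 for α and β selects the library's default parameter values.
    mlp.set_activation_function(ml::ANN_MLP_SIGMOID_SYM, 0.0, 0.0)?;
    // Back-propagation with weight-gradient scale 0.001 and momentum scale 0.1.
    mlp.set_train_method(ml::ANN_MLP_BACKPROP, 0.001, 0.1)?;
    Ok(())
}
```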

Traits

Functions

Type Aliases