pub struct TrainData { /* private fields */ }
Class encapsulating training data.
Please note that the class only specifies the interface of training data, not the implementation. All the statistical model classes in the ml module accept Ptr<TrainData> as a parameter. In other words, you can create your own class derived from TrainData and pass a smart pointer to an instance of this class into StatModel::train.
§See also
[ml_intro_data]
Implementations§
impl TrainData
pub fn missing_value() -> Result<f32>
pub fn get_sub_vector(
    vec: &impl MatTraitConst,
    idx: &impl MatTraitConst,
) -> Result<Mat>
Extracts the elements of a 1D vector specified by the passed indices.
§Parameters
- vec: input vector (supported types: CV_32S, CV_32F, CV_64F)
- idx: 1D index vector
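The index-based extraction described above can be pictured with a plain-Rust sketch on slices. It does not call OpenCV; `sub_vector` is a hypothetical helper, and returning `None` for an out-of-range index is a choice made for this sketch, not documented behaviour of get_sub_vector.

```rust
/// Hypothetical model of index-based extraction on a plain slice:
/// the result holds vec[idx[0]], vec[idx[1]], ... in that order.
fn sub_vector(vec: &[f32], idx: &[usize]) -> Option<Vec<f32>> {
    // `get` yields None for an out-of-range index, which makes the
    // whole collection step return None (a choice of this sketch).
    idx.iter().map(|&i| vec.get(i).copied()).collect()
}
```

For instance, sub_vector(&[10.0, 20.0, 30.0, 40.0], &[3, 0, 0]) yields Some(vec![40.0, 10.0, 10.0]) in this model.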
pub fn get_sub_matrix(
    matrix: &impl MatTraitConst,
    idx: &impl MatTraitConst,
    layout: i32,
) -> Result<Mat>
Extracts the rows or columns of a matrix specified by the passed indices.
§Parameters
- matrix: input matrix (supported types: CV_32S, CV_32F, CV_64F)
- idx: 1D index vector
- layout: specifies to extract rows (cv::ml::ROW_SAMPLES) or to extract columns (cv::ml::COL_SAMPLES)
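How the layout argument switches between row and column extraction can be sketched in plain Rust on a vector of rows. This is only a model of the documented semantics: `sub_matrix` is a hypothetical helper, and the `None` results for bad indices or an unknown layout are choices of the sketch. The two constants mirror the values of cv::ml::SampleTypes.

```rust
/// Layout values as defined by cv::ml::SampleTypes.
const ROW_SAMPLES: i32 = 0;
const COL_SAMPLES: i32 = 1;

/// Hypothetical model: `matrix` is a slice of rows of equal length.
fn sub_matrix(matrix: &[Vec<f32>], idx: &[usize], layout: i32) -> Option<Vec<Vec<f32>>> {
    match layout {
        // Samples are rows: keep the rows named by `idx`.
        ROW_SAMPLES => idx.iter().map(|&r| matrix.get(r).cloned()).collect(),
        // Samples are columns: keep the columns named by `idx` in every row.
        COL_SAMPLES => matrix
            .iter()
            .map(|row| {
                idx.iter()
                    .map(|&c| row.get(c).copied())
                    .collect::<Option<Vec<f32>>>()
            })
            .collect(),
        // Any other layout value is rejected by this sketch.
        _ => None,
    }
}
```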
pub fn load_from_csv(
    filename: &str,
    header_line_count: i32,
    response_start_idx: i32,
    response_end_idx: i32,
    var_type_spec: &str,
    delimiter: char,
    missch: char,
) -> Result<Ptr<TrainData>>
Reads the dataset from a .csv file and returns the ready-to-use training data.
§Parameters
- filename: The input file name
- headerLineCount: The number of lines in the beginning to skip; besides the header, the function also skips empty lines and lines starting with #
- responseStartIdx: Index of the first output variable. If -1, the function considers the last variable as the response
- responseEndIdx: Index of the last output variable + 1. If -1, then there is a single response variable at responseStartIdx.
- varTypeSpec: The optional text string that specifies the variables’ types. It has the format ord[n1-n2,n3,n4-n5,...]cat[n6,n7-n8,...]. That is, variables from n1 to n2 (inclusive range), n3, n4 to n5 … are considered ordered and n6, n7 to n8 … are considered categorical. The range [n1..n2] + [n3] + [n4..n5] + ... + [n6] + [n7..n8] should cover all the variables. If varTypeSpec is not specified, then the algorithm uses the following rules:
  - all input variables are considered ordered by default. If some column contains non-numerical values, e.g. ‘apple’, ‘pear’, ‘apple’, ‘apple’, ‘mango’, the corresponding variable is considered categorical.
  - if there are several output variables, they are all considered as ordered. An error is reported when non-numerical values are used.
  - if there is a single output variable, then if its values are non-numerical or are all integers, it is considered categorical. Otherwise, it is considered ordered.
- delimiter: The character used to separate values in each line.
- missch: The character used to specify missing measurements. It should not be a digit. Although it is a non-numerical value, it does not affect the decision of whether the variable is ordered or categorical.
Note: If the dataset only contains input variables and no responses, use responseStartIdx = -2 and responseEndIdx = 0. The output variables vector will just contain zeros.
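To make the varTypeSpec format concrete, here is a minimal plain-Rust sketch that expands a string such as ord[0-2,4]cat[3] into one kind per variable and rejects specs that leave a variable uncovered. `VarKind` and `parse_var_type_spec` are hypothetical names; the sketch assumes 0-based indices and is stricter or looser than OpenCV's own parser in ways this page does not specify (whitespace handling, for example).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum VarKind {
    Ordered,
    Categorical,
}

/// Hypothetical parser for strings such as "ord[0-2,4]cat[3]".
/// Returns one kind per variable, or None if the spec is malformed
/// or does not cover all `n_vars` variables.
fn parse_var_type_spec(spec: &str, n_vars: usize) -> Option<Vec<VarKind>> {
    let mut kinds: Vec<Option<VarKind>> = vec![None; n_vars];
    let mut rest = spec.trim();
    while !rest.is_empty() {
        // Each section starts with "ord[" or "cat[" and ends at the next ']'.
        let kind = if rest.starts_with("ord[") {
            VarKind::Ordered
        } else if rest.starts_with("cat[") {
            VarKind::Categorical
        } else {
            return None;
        };
        let close = rest.find(']')?;
        for item in rest[4..close].split(',') {
            // An item is a single index "n" or an inclusive range "a-b".
            let (lo, hi) = match item.split_once('-') {
                Some((a, b)) => (
                    a.trim().parse::<usize>().ok()?,
                    b.trim().parse::<usize>().ok()?,
                ),
                None => {
                    let n = item.trim().parse::<usize>().ok()?;
                    (n, n)
                }
            };
            if lo > hi || hi >= n_vars {
                return None;
            }
            for slot in &mut kinds[lo..=hi] {
                *slot = Some(kind);
            }
        }
        rest = rest[close + 1..].trim_start();
    }
    // Every variable must have been assigned a kind.
    kinds.into_iter().collect()
}
```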
§C++ default parameters
- response_start_idx: -1
- response_end_idx: -1
- var_type_spec: String()
- delimiter: ‘,’
- missch: ‘?’
pub fn load_from_csv_def(
    filename: &str,
    header_line_count: i32,
) -> Result<Ptr<TrainData>>
Reads the dataset from a .csv file and returns the ready-to-use training data.
§Parameters
- filename: The input file name
- headerLineCount: The number of lines in the beginning to skip; besides the header, the function also skips empty lines and lines starting with #
- responseStartIdx: Index of the first output variable. If -1, the function considers the last variable as the response
- responseEndIdx: Index of the last output variable + 1. If -1, then there is a single response variable at responseStartIdx.
- varTypeSpec: The optional text string that specifies the variables’ types. It has the format ord[n1-n2,n3,n4-n5,...]cat[n6,n7-n8,...]. That is, variables from n1 to n2 (inclusive range), n3, n4 to n5 … are considered ordered and n6, n7 to n8 … are considered categorical. The range [n1..n2] + [n3] + [n4..n5] + ... + [n6] + [n7..n8] should cover all the variables. If varTypeSpec is not specified, then the algorithm uses the following rules:
  - all input variables are considered ordered by default. If some column contains non-numerical values, e.g. ‘apple’, ‘pear’, ‘apple’, ‘apple’, ‘mango’, the corresponding variable is considered categorical.
  - if there are several output variables, they are all considered as ordered. An error is reported when non-numerical values are used.
  - if there is a single output variable, then if its values are non-numerical or are all integers, it is considered categorical. Otherwise, it is considered ordered.
- delimiter: The character used to separate values in each line.
- missch: The character used to specify missing measurements. It should not be a digit. Although it is a non-numerical value, it does not affect the decision of whether the variable is ordered or categorical.
Note: If the dataset only contains input variables and no responses, use responseStartIdx = -2 and responseEndIdx = 0. The output variables vector will just contain zeros.
§Note
This alternative version of TrainData::load_from_csv function uses the following default values for its arguments:
- response_start_idx: -1
- response_end_idx: -1
- var_type_spec: String()
- delimiter: ‘,’
- missch: ‘?’
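With the default indices (-1, -1) the last column of the file becomes the single response. The index rules stated in the parameter list can be modelled by the following plain-Rust sketch; `response_columns` is a hypothetical helper that only encodes what this page documents, and how OpenCV treats other negative values is not covered here.

```rust
use std::ops::Range;

/// Hypothetical helper: for a table with `n_vars` columns, returns the
/// half-open range of response columns. An empty range means "no responses".
fn response_columns(n_vars: usize, start: i32, end: i32) -> Option<Range<usize>> {
    // Documented special case: start = -2 and end = 0 mean inputs only.
    if start == -2 && end == 0 {
        return Some(0..0);
    }
    // start = -1: the last variable is the response.
    let first = match start {
        -1 => n_vars.checked_sub(1)?,
        s if s >= 0 => s as usize,
        _ => return None, // other negative values are rejected by this sketch
    };
    // end = -1: a single response at `first`; otherwise `end` is last index + 1.
    let past_last = match end {
        -1 => first + 1,
        e if e >= 0 => e as usize,
        _ => return None,
    };
    if first < past_last && past_last <= n_vars {
        Some(first..past_last)
    } else {
        None
    }
}
```

In this model, a five-column file loaded with the defaults gives response_columns(5, -1, -1) == Some(4..5).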
pub fn create(
    samples: &impl ToInputArray,
    layout: i32,
    responses: &impl ToInputArray,
    var_idx: &impl ToInputArray,
    sample_idx: &impl ToInputArray,
    sample_weights: &impl ToInputArray,
    var_type: &impl ToInputArray,
) -> Result<Ptr<TrainData>>
Creates training data from in-memory arrays.
§Parameters
- samples: matrix of samples. It should have CV_32F type.
- layout: see ml::SampleTypes.
- responses: matrix of responses. If the responses are scalar, they should be stored as a single row or as a single column. The matrix should have type CV_32F or CV_32S (in the former case the responses are considered as ordered by default; in the latter case - as categorical)
- varIdx: vector specifying which variables to use for training. It can be an integer vector (CV_32S) containing 0-based variable indices or byte vector (CV_8U) containing a mask of active variables.
- sampleIdx: vector specifying which samples to use for training. It can be an integer vector (CV_32S) containing 0-based sample indices or byte vector (CV_8U) containing a mask of training samples.
- sampleWeights: optional vector with weights for each sample. It should have CV_32F type.
- varType: optional vector of type CV_8U and size <number_of_variables_in_samples> + <number_of_variables_in_responses>, containing types of each input and output variable. See ml::VariableTypes.
§C++ default parameters
- var_idx: noArray()
- sample_idx: noArray()
- sample_weights: noArray()
- var_type: noArray()
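The varIdx and sampleIdx parameters each accept two encodings of the same selection: a list of 0-based indices or a byte mask. A plain-Rust sketch of how the two relate is shown below. `Selector` and `active_indices` are hypothetical names, and treating any nonzero mask byte as "active" is an assumption of this sketch.

```rust
/// Hypothetical model of the two accepted selector encodings.
enum Selector<'a> {
    /// CV_32S-style vector: explicit 0-based indices.
    Indices(&'a [usize]),
    /// CV_8U-style vector: one byte per item, nonzero marks an active item
    /// (an assumption of this sketch).
    Mask(&'a [u8]),
}

/// Normalizes either encoding to a list of 0-based indices.
fn active_indices(sel: &Selector) -> Vec<usize> {
    match sel {
        Selector::Indices(idx) => idx.to_vec(),
        Selector::Mask(mask) => mask
            .iter()
            .enumerate()
            .filter(|(_, m)| **m != 0)
            .map(|(i, _)| i)
            .collect(),
    }
}
```

Under these assumptions, the mask [1, 0, 1, 1] and the index list [0, 2, 3] select the same variables.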
pub fn create_def(
    samples: &impl ToInputArray,
    layout: i32,
    responses: &impl ToInputArray,
) -> Result<Ptr<TrainData>>
Creates training data from in-memory arrays.
§Parameters
- samples: matrix of samples. It should have CV_32F type.
- layout: see ml::SampleTypes.
- responses: matrix of responses. If the responses are scalar, they should be stored as a single row or as a single column. The matrix should have type CV_32F or CV_32S (in the former case the responses are considered as ordered by default; in the latter case - as categorical)
- varIdx: vector specifying which variables to use for training. It can be an integer vector (CV_32S) containing 0-based variable indices or byte vector (CV_8U) containing a mask of active variables.
- sampleIdx: vector specifying which samples to use for training. It can be an integer vector (CV_32S) containing 0-based sample indices or byte vector (CV_8U) containing a mask of training samples.
- sampleWeights: optional vector with weights for each sample. It should have CV_32F type.
- varType: optional vector of type CV_8U and size <number_of_variables_in_samples> + <number_of_variables_in_responses>, containing types of each input and output variable. See ml::VariableTypes.
§Note
This alternative version of TrainData::create function uses the following default values for its arguments:
- var_idx: noArray()
- sample_idx: noArray()
- sample_weights: noArray()
- var_type: noArray()
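When var_type is left as noArray(), the response type is decided from the element type of the responses matrix, as described in the responses parameter. The sketch below builds the equivalent varType-style vector for a single scalar response. `default_var_type` is a hypothetical helper; that inputs default to ordered is an assumption of this sketch, and the two constants mirror the values of cv::ml::VariableTypes.

```rust
/// Values of cv::ml::VariableTypes.
const VAR_ORDERED: u8 = 0;
const VAR_CATEGORICAL: u8 = 1;

/// Hypothetical helper: builds a varType-style vector for `n_inputs`
/// input variables followed by a single scalar response.
fn default_var_type(n_inputs: usize, response_is_integer: bool) -> Vec<u8> {
    // Assumption of this sketch: every input variable defaults to ordered.
    let mut var_type = vec![VAR_ORDERED; n_inputs];
    // CV_32S responses are categorical by default, CV_32F responses ordered.
    var_type.push(if response_is_integer {
        VAR_CATEGORICAL
    } else {
        VAR_ORDERED
    });
    var_type
}
```

The resulting length is the number of input variables plus the number of response variables, matching the size required for varType.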
Trait Implementations§
impl Boxed for TrainData
unsafe fn from_raw(ptr: <TrainData as OpenCVFromExtern>::ExternReceive) -> Self
fn into_raw(self) -> <TrainData as OpenCVTypeExternContainer>::ExternSendMut
fn as_raw(&self) -> <TrainData as OpenCVTypeExternContainer>::ExternSend
fn as_raw_mut(&mut self) -> <TrainData as OpenCVTypeExternContainer>::ExternSendMut
impl TrainDataTrait for TrainData
fn as_raw_mut_TrainData(&mut self) -> *mut c_void
fn set_train_test_split(&mut self, count: i32, shuffle: bool) -> Result<()>
fn set_train_test_split_def(&mut self, count: i32) -> Result<()>
fn set_train_test_split_ratio(&mut self, ratio: f64, shuffle: bool) -> Result<()>
fn set_train_test_split_ratio_def(&mut self, ratio: f64) -> Result<()>
fn shuffle_train_test(&mut self) -> Result<()>
impl TrainDataTraitConst for TrainData
fn as_raw_TrainData(&self) -> *const c_void
fn get_layout(&self) -> Result<i32>
fn get_n_train_samples(&self) -> Result<i32>
fn get_n_test_samples(&self) -> Result<i32>
fn get_n_samples(&self) -> Result<i32>
fn get_n_vars(&self) -> Result<i32>
fn get_n_all_vars(&self) -> Result<i32>
fn get_sample(&self, var_idx: &impl ToInputArray, sidx: i32, buf: &mut f32) -> Result<()>
fn get_samples(&self) -> Result<Mat>
fn get_missing(&self) -> Result<Mat>
fn get_train_samples(&self, layout: i32, compress_samples: bool, compress_vars: bool) -> Result<Mat>
fn get_train_norm_cat_responses(&self) -> Result<Mat>
fn get_test_responses(&self) -> Result<Mat>
fn get_test_norm_cat_responses(&self) -> Result<Mat>
fn get_responses(&self) -> Result<Mat>
fn get_norm_cat_responses(&self) -> Result<Mat>
fn get_sample_weights(&self) -> Result<Mat>
fn get_train_sample_weights(&self) -> Result<Mat>
fn get_test_sample_weights(&self) -> Result<Mat>
fn get_var_idx(&self) -> Result<Mat>
fn get_var_type(&self) -> Result<Mat>
fn get_var_symbol_flags(&self) -> Result<Mat>
fn get_response_type(&self) -> Result<i32>
fn get_train_sample_idx(&self) -> Result<Mat>
fn get_test_sample_idx(&self) -> Result<Mat>
fn get_values(&self, vi: i32, sidx: &impl ToInputArray, values: &mut f32) -> Result<()>
fn get_norm_cat_values(&self, vi: i32, sidx: &impl ToInputArray, values: &mut i32) -> Result<()>
fn get_default_subst_values(&self) -> Result<Mat>
fn get_cat_count(&self, vi: i32) -> Result<i32>
fn get_cat_ofs(&self) -> Result<Mat>
fn get_cat_map(&self) -> Result<Mat>
fn get_test_samples(&self) -> Result<Mat>
impl Send for TrainData
Auto Trait Implementations§
impl Freeze for TrainData
impl RefUnwindSafe for TrainData
impl !Sync for TrainData
impl Unpin for TrainData
impl UnwindSafe for TrainData
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<Mat> ModifyInplace for Mat
where
    Mat: Boxed,
unsafe fn modify_inplace<Res>(&mut self, f: impl FnOnce(&Mat, &mut Mat) -> Res) -> Res
Helper function to call OpenCV functions that allow in-place modification of a Mat or another similar object. By passing a mutable reference to the Mat to this function, your closure will get called with a read reference and a write reference to the same Mat. This is unsafe in the general case as it leads to having non-exclusive mutable access to the internal data, but it can be useful for some performance-sensitive operations. One example of an OpenCV function that allows such in-place modification is imgproc::threshold.