Struct forust_ml::gradientbooster::GradientBooster
pub struct GradientBooster {
pub objective_type: ObjectiveType,
pub iterations: usize,
pub learning_rate: f32,
pub max_depth: usize,
pub max_leaves: usize,
pub l2: f32,
pub gamma: f32,
pub min_leaf_weight: f32,
pub base_score: f64,
pub nbins: u16,
pub parallel: bool,
pub allow_missing_splits: bool,
pub impute_missing: bool,
pub monotone_constraints: Option<ConstraintMap>,
pub trees: Vec<Tree>,
}
Gradient Booster object
- objective_type - The objective function used to optimize. Valid options include "LogLoss" to use logistic loss as the objective function, or "SquaredLoss" to use squared error as the objective function.
- iterations - Total number of trees to train in the ensemble.
- learning_rate - Step size to use at each iteration. Each leaf weight is multiplied by this number. The smaller the value, the more conservative the weights will be.
- max_depth - Maximum depth of an individual tree. Valid values are 0 to infinity.
- max_leaves - Maximum number of leaves allowed on a tree. Valid values are 0 to infinity. This is the total number of final nodes.
- l2 - L2 regularization term applied to the weights of the tree. Valid values are 0 to infinity.
- gamma - The minimum amount of loss required to further split a node. Valid values are 0 to infinity.
- min_leaf_weight - Minimum sum of the hessian values of the loss function required to be in a node.
- base_score - The initial prediction value of the model.
- nbins - Number of bins used to partition the data. Setting this to a smaller number will result in faster training time, while potentially sacrificing accuracy. If there are more bins than unique values in a column, all unique values will be used.
- parallel - Whether the booster should be trained in parallel.
- allow_missing_splits - Allow splits to be made such that all missing values go down one branch and all non-missing values go down the other, if this results in the greatest reduction of loss. If this is false, splits will only be made on non-missing values.
- impute_missing - Automatically impute missing values, such that at every split the model learns the best direction to send missing values. If this is false, all missing values will default to the right branch.
- monotone_constraints - Constraints used to enforce a specific relationship between the training features and the target variable.
Fields
objective_type: ObjectiveType
iterations: usize
learning_rate: f32
max_depth: usize
max_leaves: usize
l2: f32
gamma: f32
min_leaf_weight: f32
base_score: f64
nbins: u16
parallel: bool
allow_missing_splits: bool
impute_missing: bool
monotone_constraints: Option<ConstraintMap>
trees: Vec<Tree>

Implementations
impl GradientBooster
pub fn new(
objective_type: ObjectiveType,
iterations: usize,
learning_rate: f32,
max_depth: usize,
max_leaves: usize,
l2: f32,
gamma: f32,
min_leaf_weight: f32,
base_score: f64,
nbins: u16,
parallel: bool,
allow_missing_splits: bool,
impute_missing: bool,
monotone_constraints: Option<ConstraintMap>
) -> Self
Create a new Gradient Booster object.
- objective_type - The objective function used to optimize. Valid options include "LogLoss" to use logistic loss as the objective function, or "SquaredLoss" to use squared error as the objective function.
- iterations - Total number of trees to train in the ensemble.
- learning_rate - Step size to use at each iteration. Each leaf weight is multiplied by this number. The smaller the value, the more conservative the weights will be.
- max_depth - Maximum depth of an individual tree. Valid values are 0 to infinity.
- max_leaves - Maximum number of leaves allowed on a tree. Valid values are 0 to infinity. This is the total number of final nodes.
- l2 - L2 regularization term applied to the weights of the tree. Valid values are 0 to infinity.
- gamma - The minimum amount of loss required to further split a node. Valid values are 0 to infinity.
- min_leaf_weight - Minimum sum of the hessian values of the loss function required to be in a node.
- base_score - The initial prediction value of the model.
- nbins - Number of bins used to partition the data. Setting this to a smaller number will result in faster training time, while potentially sacrificing accuracy. If there are more bins than unique values in a column, all unique values will be used.
- parallel - Whether the booster should be trained in parallel.
- allow_missing_splits - Allow splits to be made such that all missing values go down one branch and all non-missing values go down the other, if this results in the greatest reduction of loss. If this is false, splits will only be made on non-missing values.
- impute_missing - Automatically impute missing values, such that at every split the model learns the best direction to send missing values. If this is false, all missing values will default to the right branch.
- monotone_constraints - Constraints used to enforce a specific relationship between the training features and the target variable.
pub fn fit(
&mut self,
data: &Matrix<'_, f64>,
y: &[f64],
sample_weight: &[f64]
) -> Result<(), ForustError>
Fit the gradient booster on a provided dataset.
- data - The training data, as a Matrix of f64 values.
- y - The target values, as a slice of f64.
- sample_weight - Instance weights to use when training the model. To use a weight of 1 for every record, see fit_unweighted.
pub fn fit_unweighted(
&mut self,
data: &Matrix<'_, f64>,
y: &[f64]
) -> Result<(), ForustError>
Fit the gradient booster on a provided dataset without any weights.
- data - The training data, as a Matrix of f64 values.
- y - The target values, as a slice of f64.
pub fn predict(&self, data: &Matrix<'_, f64>, parallel: bool) -> Vec<f64>
Generate predictions on data using the gradient booster.
- data - The data to predict on, as a Matrix of f64 values.
- parallel - Whether to generate the predictions in parallel.
pub fn value_partial_dependence(&self, feature: usize, value: f64) -> f64
Given a feature and a value, return the partial dependence of the model for that feature at that value.
- feature - The index of the feature.
- value - The value for which to calculate the partial dependence.
pub fn save_booster(&self, path: &str) -> Result<(), ForustError>
Save a booster as a JSON object to a file.
- path - Path at which to save the booster.
pub fn from_json(json_str: &str) -> Result<Self, ForustError>
Load a booster from a JSON string.
- json_str - A JSON string from which the booster can be deserialized.
pub fn load_booster(path: &str) -> Result<Self, ForustError>
Load a booster from a path to a JSON booster object.
- path - Path to load the booster from.
pub fn set_objective_type(self, objective_type: ObjectiveType) -> Self
Set the objective_type on the booster.
- objective_type - The objective type of the booster.
pub fn set_iterations(self, iterations: usize) -> Self
Set the iterations on the booster.
- iterations - The number of iterations of the booster.
pub fn set_learning_rate(self, learning_rate: f32) -> Self
Set the learning_rate on the booster.
- learning_rate - The learning rate of the booster.
pub fn set_max_depth(self, max_depth: usize) -> Self
Set the max_depth on the booster.
- max_depth - The maximum tree depth of the booster.
pub fn set_max_leaves(self, max_leaves: usize) -> Self
Set the max_leaves on the booster.
- max_leaves - The maximum number of leaves of the booster.
pub fn set_l2(self, l2: f32) -> Self
Set the l2 on the booster.
- l2 - The L2 regularization term of the booster.
pub fn set_gamma(self, gamma: f32) -> Self
Set the gamma on the booster.
- gamma - The gamma value of the booster.
pub fn set_min_leaf_weight(self, min_leaf_weight: f32) -> Self
Set the min_leaf_weight on the booster.
- min_leaf_weight - The minimum sum of the hessian values required to be in a node of a tree of the booster.
pub fn set_base_score(self, base_score: f64) -> Self
Set the base_score on the booster.
- base_score - The base score of the booster.
pub fn set_nbins(self, nbins: u16) -> Self
Set the nbins on the booster.
- nbins - The number of bins used for partitioning the data of the booster.
pub fn set_parallel(self, parallel: bool) -> Self
Set the parallel on the booster.
- parallel - Whether the booster should be trained in parallel.
pub fn set_allow_missing_splits(self, allow_missing_splits: bool) -> Self
Set the allow_missing_splits on the booster.
- allow_missing_splits - Whether missing-value splits are allowed for the booster.
pub fn set_impute_missing(self, impute_missing: bool) -> Self
Set the impute_missing on the booster.
- impute_missing - Whether missing values should be imputed when training the booster.
pub fn set_monotone_constraints(
self,
monotone_constraints: Option<ConstraintMap>
) -> Self
Set the monotone_constraints on the booster.
- monotone_constraints - The monotone constraints of the booster.