Struct TreeBoosterParametersBuilder

pub struct TreeBoosterParametersBuilder { /* private fields */ }

Builder for TreeBoosterParameters.

Implementations

impl TreeBoosterParametersBuilder

pub fn eta(&mut self, value: f32) -> &mut Self

Step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks these feature weights to make the boosting process more conservative.

range: [0.0, 1.0]
default: 0.3
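
As a rough illustration of the shrinkage described above, the sketch below scales one tree's leaf output by eta before adding it to the running prediction. The function name `boosted_prediction` is hypothetical and is not part of this crate; it only shows the arithmetic.

```rust
/// Adds one tree's raw leaf output to a running prediction, scaled by `eta`.
/// Hypothetical helper for illustration; not part of the xgboost crate.
fn boosted_prediction(previous: f32, leaf_weight: f32, eta: f32) -> f32 {
    // With eta = 1.0 the full leaf weight is applied; smaller values
    // apply only a fraction of it, so each tree changes the model less.
    previous + eta * leaf_weight
}
```

With the default eta of 0.3, a leaf weight of 2.0 moves the prediction by 0.6 rather than 2.0.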

pub fn gamma(&mut self, value: f32) -> &mut Self

Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger the value, the more conservative the algorithm will be.

range: [0, ∞)
default: 0
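
To make "minimum loss reduction" concrete, here is a minimal sketch that computes the loss reduction of a candidate split and compares it with gamma. It assumes the textbook second-order formulation with gradient sums G, hessian sums H, and an L2 term lambda; the constant factor and other details inside XGBoost may differ. All function names are hypothetical.

```rust
/// Structure score of a node: G^2 / (H + lambda).
fn node_score(grad_sum: f64, hess_sum: f64, lambda: f64) -> f64 {
    grad_sum * grad_sum / (hess_sum + lambda)
}

/// Loss reduction from splitting a parent into a left and a right child
/// (assumed textbook form, not taken from this crate).
fn loss_reduction(gl: f64, hl: f64, gr: f64, hr: f64, lambda: f64) -> f64 {
    let children = node_score(gl, hl, lambda) + node_score(gr, hr, lambda);
    let parent = node_score(gl + gr, hl + hr, lambda);
    0.5 * (children - parent)
}

/// A split is kept only if its loss reduction exceeds `gamma`.
fn passes_gamma(reduction: f64, gamma: f64) -> bool {
    reduction > gamma
}
```

Raising gamma rejects more candidate splits, which is why larger values give a more conservative model.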

pub fn max_depth(&mut self, value: u32) -> &mut Self

Maximum depth of a tree. Increasing this value makes the model more complex and more likely to overfit. 0 indicates no limit; a limit is required for the depth-wise grow policy.

range: [0, ∞)
default: 6

pub fn min_child_weight(&mut self, value: f32) -> &mut Self

Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node whose sum of instance weight is less than min_child_weight, the building process gives up further partitioning. In linear regression mode, this simply corresponds to the minimum number of instances needed in each node. The larger the value, the more conservative the algorithm will be.

range: [0, ∞)
default: 1
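
The check described above can be sketched as a sum over per-instance hessians. The helpers below are hypothetical and only illustrate the rule: with squared error each hessian is 1, so the sum is just the instance count, while with logistic loss the hessian is p(1 - p) and confident predictions contribute very little.

```rust
/// Hessian of the logistic loss for a predicted probability `p`.
fn logistic_hessian(p: f32) -> f32 {
    p * (1.0 - p)
}

/// Returns true if a child holding these instances satisfies the threshold.
/// Hypothetical helper; not part of the xgboost crate.
fn child_is_allowed(hessians: &[f32], min_child_weight: f32) -> bool {
    let hessian_sum: f32 = hessians.iter().sum();
    hessian_sum >= min_child_weight
}
```

Ten instances predicted at p = 0.99 sum to roughly 0.1, so such a child would be rejected at the default min_child_weight of 1 even though it contains ten rows.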

pub fn max_delta_step(&mut self, value: f32) -> &mut Self

Maximum delta step allowed for each tree’s weight estimation. If the value is set to 0, there is no constraint. If it is set to a positive value, it can help make the update step more conservative. Usually this parameter is not needed, but it might help in logistic regression when the classes are extremely imbalanced. Setting it to a value of 1-10 might help control the update.

range: [0, ∞)
default: 0
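
One way to picture this constraint is as a cap on the magnitude of a leaf weight. The sketch below is an assumption about the behaviour described above, not the crate's implementation, and `cap_leaf_weight` is a hypothetical name.

```rust
/// Limits a leaf weight to [-max_delta_step, max_delta_step].
/// A value of 0 means "no constraint", matching the description above.
fn cap_leaf_weight(weight: f32, max_delta_step: f32) -> f32 {
    if max_delta_step > 0.0 {
        weight.clamp(-max_delta_step, max_delta_step)
    } else {
        weight
    }
}
```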

pub fn subsample(&mut self, value: f32) -> &mut Self

Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly samples half of the data instances to grow trees, which helps prevent overfitting.

range: (0, 1]
default: 1.0

pub fn colsample_bytree(&mut self, value: f32) -> &mut Self

Subsample ratio of columns when constructing each tree.

range: (0.0, 1.0]
default: 1.0

pub fn colsample_bylevel(&mut self, value: f32) -> &mut Self

Subsample ratio of columns for each split, in each level.

range: (0.0, 1.0]
default: 1.0

pub fn colsample_bynode(&mut self, value: f32) -> &mut Self

Subsample ratio of columns for each node.

range: (0.0, 1.0]
default: 1.0
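
The three colsample_* ratios interact: the upstream XGBoost documentation describes them as cumulative, so each stage samples from the columns left by the previous one. The sketch below estimates how many columns remain available at a split. The truncate-and-keep-at-least-one rounding is an assumption, and `columns_per_split` is a hypothetical helper.

```rust
/// Applies one sampling stage, keeping at least one column (assumed rounding).
fn sample_stage(n_columns: usize, ratio: f32) -> usize {
    ((n_columns as f32 * ratio) as usize).max(1)
}

/// Estimates the columns available at each split when the per-tree,
/// per-level, and per-node ratios are applied in sequence.
fn columns_per_split(n_features: usize, bytree: f32, bylevel: f32, bynode: f32) -> usize {
    let per_tree = sample_stage(n_features, bytree);
    let per_level = sample_stage(per_tree, bylevel);
    sample_stage(per_level, bynode)
}
```

For example, 64 features with all three ratios set to 0.5 leave 8 columns to choose from at each split.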

pub fn lambda(&mut self, value: f32) -> &mut Self

L2 regularization term on weights. Increasing this value makes the model more conservative.

default: 1

pub fn alpha(&mut self, value: f32) -> &mut Self

L1 regularization term on weights. Increasing this value makes the model more conservative.

default: 0
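
To show how lambda and alpha make the model more conservative, the sketch below computes a leaf weight from a gradient sum G and hessian sum H using the standard closed form under a second-order approximation: alpha soft-thresholds G, and lambda enlarges the denominator. This formulation is an assumption for illustration, and the function names are hypothetical.

```rust
/// Shrinks `g` toward zero by `alpha`, returning 0 when |g| <= alpha.
fn soft_threshold(g: f64, alpha: f64) -> f64 {
    if g > alpha {
        g - alpha
    } else if g < -alpha {
        g + alpha
    } else {
        0.0
    }
}

/// Leaf weight with L1 (`alpha`) and L2 (`lambda`) regularization
/// (assumed closed form, not taken from this crate).
fn leaf_weight(grad_sum: f64, hess_sum: f64, lambda: f64, alpha: f64) -> f64 {
    -soft_threshold(grad_sum, alpha) / (hess_sum + lambda)
}
```

With G = -4 and H = 3, the weight is 1.0 at lambda = 1, alpha = 0, drops to 0.75 at alpha = 1, and becomes zero once alpha reaches 4.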

pub fn tree_method(&mut self, value: TreeMethod) -> &mut Self

The tree construction algorithm used in XGBoost.

pub fn sketch_eps(&mut self, value: f32) -> &mut Self

This is only used for the approximate greedy algorithm. It roughly translates into O(1 / sketch_eps) bins. Compared to directly selecting the number of bins, this comes with a theoretical guarantee on sketch accuracy. Usually the user does not have to tune this, but consider setting it to a lower number for more accurate enumeration.

range: (0.0, 1.0)
default: 0.03
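
As a back-of-the-envelope reading of the O(1 / sketch_eps) relationship, the hypothetical helper below treats the bin count as simply 1 / sketch_eps. The real count depends on the sketch algorithm and its constant factor, so this is only an order-of-magnitude guide.

```rust
/// Rough bin-count estimate, assuming a constant factor of 1.
fn approx_bin_count(sketch_eps: f32) -> u32 {
    (1.0 / sketch_eps).round() as u32
}
```

Under that assumption the default of 0.03 corresponds to about 33 bins, and lowering sketch_eps to 0.01 raises the estimate to about 100.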

pub fn scale_pos_weight(&mut self, value: f32) -> &mut Self

Control the balance of positive and negative weights, useful for unbalanced classes. A typical value to consider: sum(negative cases) / sum(positive cases).

default: 1.0
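
The typical value mentioned above can be computed directly from the training labels. This is a minimal sketch that assumes binary labels encoded as 0 and 1 and unit instance weights; `suggested_scale_pos_weight` is a hypothetical helper, not part of this crate.

```rust
/// Returns (number of negatives) / (number of positives), or `None`
/// when there are no positive labels and the ratio is undefined.
fn suggested_scale_pos_weight(labels: &[u8]) -> Option<f32> {
    let positives = labels.iter().filter(|&&y| y == 1).count();
    let negatives = labels.len() - positives;
    if positives == 0 {
        None
    } else {
        Some(negatives as f32 / positives as f32)
    }
}
```

The resulting value can then be passed to scale_pos_weight on the builder.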

pub fn updater(&mut self, value: Vec<TreeUpdater>) -> &mut Self

Sequence of tree updaters to run, providing a modular way to construct and to modify the trees.

default: vec![]

pub fn refresh_leaf(&mut self, value: bool) -> &mut Self

This is a parameter of the ‘refresh’ updater plugin. When this flag is true, tree leaves as well as tree nodes’ stats are updated. When it is false, only node stats are updated.

default: true

pub fn process_type(&mut self, value: ProcessType) -> &mut Self

A type of boosting process to run.

default: ProcessType::Default

pub fn grow_policy(&mut self, value: GrowPolicy) -> &mut Self

Controls the way new nodes are added to the tree. Currently supported only if tree_method is set to ‘hist’.

pub fn max_leaves(&mut self, value: u32) -> &mut Self

Maximum number of nodes to be added. Only relevant for the GrowPolicy::LossGuide grow policy.

default: 0

pub fn max_bin(&mut self, value: u32) -> &mut Self

This is only used if ‘hist’ is specified as tree_method. Maximum number of discrete bins to bucket continuous features. Increasing this number improves the optimality of splits at the cost of higher computation time.