Module smartcore::tree::decision_tree_regressor
Regression tree for dependent variables that take continuous or ordered discrete values.
Decision Tree Regressor
The process of building a decision tree can be simplified to these two steps:
- Divide the predictor space \(X\) into K distinct and non-overlapping regions, \(R_1, R_2, ..., R_K\).
- For every observation that falls into the region \(R_k\), we make the same prediction, which is simply the mean of the response values for the training observations in \(R_k\).
Regions \(R_1, R_2, ..., R_K\) are built in such a way that minimizes the residual sum of squares (RSS), given by
\[RSS = \sum_{k=1}^K\sum_{i \in R_k} (y_i - \hat{y}_{R_k})^2\]
where \(\hat{y}_{R_k}\) is the mean response for the training observations within region \(R_k\).
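For instance, given a fixed assignment of training observations to regions, the RSS above can be computed by taking each region's mean response as its prediction and summing the squared deviations. The following is a minimal sketch with made-up region assignments, not SmartCore's internal code:

// Compute RSS for a fixed partition: regions[i] is the region index of observation i.
fn rss(y: &[f64], regions: &[usize], n_regions: usize) -> f64 {
    let mut sums = vec![0.0; n_regions];
    let mut counts = vec![0usize; n_regions];
    for (&yi, &k) in y.iter().zip(regions) {
        sums[k] += yi;
        counts[k] += 1;
    }
    // Mean response within each region, i.e. \hat{y}_{R_k}
    let means: Vec<f64> = sums.iter().zip(&counts)
        .map(|(s, &c)| if c > 0 { *s / c as f64 } else { 0.0 })
        .collect();
    y.iter().zip(regions).map(|(&yi, &k)| (yi - means[k]).powi(2)).sum()
}

let y = vec![83.0, 88.5, 88.2, 96.2, 98.1];
let regions = vec![0, 0, 0, 1, 1]; // two regions, K = 2
let total_rss = rss(&y, &regions, 2);
assert!(total_rss > 0.0);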
SmartCore uses a recursive binary splitting approach to build the regions \(R_1, R_2, ..., R_K\). The approach begins at the top of the tree and successively splits the predictor space, one predictor at a time. At each step of the tree-building process, the best split is made for that particular step, rather than looking ahead and picking a split that would lead to a better tree at some future step.
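The greedy step at each node can be pictured as follows: for one predictor, every observed value is tried as a split threshold, and the threshold that minimizes the combined RSS of the two resulting halves is kept. This is a simplified, single-feature sketch of that step, not SmartCore's actual implementation (which also iterates over all predictors and then recurses into the two children):

// Simplified greedy step: for a single predictor, try every observed value as a
// split threshold and keep the one that minimizes the combined RSS of both halves.
fn sum_sq_around_mean(ys: &[f64]) -> f64 {
    if ys.is_empty() {
        return 0.0;
    }
    let mean = ys.iter().sum::<f64>() / ys.len() as f64;
    ys.iter().map(|yi| (yi - mean).powi(2)).sum()
}

fn best_split(feature: &[f64], y: &[f64]) -> (f64, f64) {
    let mut best = (f64::NAN, f64::INFINITY); // (threshold, rss)
    for &t in feature {
        let (mut left, mut right) = (Vec::new(), Vec::new());
        for (&x, &yi) in feature.iter().zip(y) {
            if x <= t { left.push(yi); } else { right.push(yi); }
        }
        let rss = sum_sq_around_mean(&left) + sum_sq_around_mean(&right);
        if rss < best.1 {
            best = (t, rss);
        }
    }
    best
}

let year = vec![1947., 1950., 1953., 1956., 1959., 1962.];
let y = vec![83.0, 89.5, 99.0, 104.6, 112.6, 116.9];
let (threshold, rss) = best_split(&year, &y);
// The node splits on `year <= threshold`; the same search is then repeated
// recursively within each of the two resulting regions.
assert!(threshold.is_finite() && rss.is_finite());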
Example:
use smartcore::linalg::naive::dense_matrix::*;
use smartcore::tree::decision_tree_regressor::*;

// Longley dataset (https://www.statsmodels.org/stable/datasets/generated/longley.html)
let x = DenseMatrix::from_2d_array(&[
    &[234.289, 235.6, 159., 107.608, 1947., 60.323],
    &[259.426, 232.5, 145.6, 108.632, 1948., 61.122],
    &[258.054, 368.2, 161.6, 109.773, 1949., 60.171],
    &[284.599, 335.1, 165., 110.929, 1950., 61.187],
    &[328.975, 209.9, 309.9, 112.075, 1951., 63.221],
    &[346.999, 193.2, 359.4, 113.27, 1952., 63.639],
    &[365.385, 187., 354.7, 115.094, 1953., 64.989],
    &[363.112, 357.8, 335., 116.219, 1954., 63.761],
    &[397.469, 290.4, 304.8, 117.388, 1955., 66.019],
    &[419.18, 282.2, 285.7, 118.734, 1956., 67.857],
    &[442.769, 293.6, 279.8, 120.445, 1957., 68.169],
    &[444.546, 468.1, 263.7, 121.95, 1958., 66.513],
    &[482.704, 381.3, 255.2, 123.366, 1959., 68.655],
    &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
    &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
    &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
]);
let y: Vec<f64> = vec![
    83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0,
    101.2, 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9,
];

let tree = DecisionTreeRegressor::fit(&x, &y, Default::default()).unwrap();
let y_hat = tree.predict(&x).unwrap(); // use the same data for prediction
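A natural follow-up is to score the in-sample predictions. Assuming the mean_squared_error helper in smartcore::metrics is available in the version you are using, the example can continue like this:

use smartcore::metrics::mean_squared_error;

// Compare the predictions from the example above against the true responses.
let mse = mean_squared_error(&y, &y_hat);
println!("training MSE: {}", mse);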
Structs
- DecisionTreeRegressor: Regression Tree
- DecisionTreeRegressorParameters: Parameters of the Regression Tree (see the sketch below)
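DecisionTreeRegressorParameters controls how far the recursive splitting goes before it stops. The following is a hedged sketch: the field names (max_depth, min_samples_leaf) are assumptions based on the parameters such trees typically expose, so check the struct's documentation for the exact set in your version:

use smartcore::tree::decision_tree_regressor::*;

// Field names below are assumptions; consult the DecisionTreeRegressorParameters docs.
let params = DecisionTreeRegressorParameters {
    max_depth: Some(3),  // stop splitting below depth 3
    min_samples_leaf: 2, // require at least 2 training observations per region
    ..Default::default()
};
// The customized parameters are then passed as the third argument to `fit`:
// let tree = DecisionTreeRegressor::fit(&x, &y, params).unwrap();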