pub struct DMatrix { /* private fields */ }
Data matrix used throughout XGBoost for training and predicting with Booster models.
It’s used as a container for both features (i.e. a row for every instance) and an optional true label for each instance (as an f32 value).
Can be created from files, or from dense or sparse (CSR or CSC) matrices.
§Examples
§Load from file
Load matrix from file in LIBSVM or binary format.
use xgboost::DMatrix;
let dmat = DMatrix::load("somefile.txt").unwrap();
§Create from dense array
use xgboost::DMatrix;
let data = &[1.0, 0.5, 0.2, 0.2,
             0.7, 1.0, 0.1, 0.1,
             0.2, 0.0, 0.0, 1.0];
let num_rows = 3;
let mut dmat = DMatrix::from_dense(data, num_rows).unwrap();
assert_eq!(dmat.shape(), (3, 4));
// set true labels for each row
dmat.set_labels(&[1.0, 0.0, 1.0]).unwrap();
§Create from sparse CSR matrix
Create from sparse representation of
[[1.0, 0.0, 2.0],
[0.0, 0.0, 3.0],
[4.0, 5.0, 6.0]]
use xgboost::DMatrix;
let indptr = &[0, 2, 3, 6];
let indices = &[0, 2, 2, 0, 1, 2];
let data = &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let dmat = DMatrix::from_csr(indptr, indices, data, None).unwrap();
assert_eq!(dmat.shape(), (3, 3));
Implementations§
impl DMatrix
pub fn from_dense(data: &[f32], num_rows: usize) -> XGBResult<Self>
Create a new DMatrix from a dense array in row-major order.
E.g. the matrix
[[1.0, 2.0],
[3.0, 4.0],
[5.0, 6.0]]
would be converted into a DMatrix with
use xgboost::DMatrix;
let data = &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let num_rows = 3;
let dmat = DMatrix::from_dense(data, num_rows).unwrap();
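When the source data is held as a collection of rows rather than one flat buffer, it has to be flattened into row-major order first. The following is a minimal sketch using only the standard library; `flatten_row_major` is a hypothetical helper, not part of the xgboost crate. It returns the flat buffer together with the row count, which are the two values `from_dense` takes.

```rust
/// Flatten a matrix given as a slice of rows into one row-major buffer.
/// Returns the buffer and the number of rows, or `None` if the rows
/// do not all have the same length.
/// (Hypothetical helper for illustration; not part of the xgboost crate.)
fn flatten_row_major(rows: &[Vec<f32>]) -> Option<(Vec<f32>, usize)> {
    // Every row must have as many columns as the first one.
    let num_cols = rows.first().map_or(0, |row| row.len());
    if rows.iter().any(|row| row.len() != num_cols) {
        return None;
    }
    // Concatenate the rows one after another: row 0, then row 1, ...
    let data: Vec<f32> = rows.iter().flatten().copied().collect();
    Some((data, rows.len()))
}
```

The resulting pair could then be passed on as `DMatrix::from_dense(&data, num_rows)`.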
pub fn from_csr(
    indptr: &[usize],
    indices: &[usize],
    data: &[f32],
    num_cols: Option<usize>,
) -> XGBResult<Self>
Create a new DMatrix from a sparse CSR matrix.
Uses the standard CSR representation, where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
If num_cols is set to None, the number of columns will be inferred from the given data.
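To make the CSR layout concrete, here is a minimal standard-library sketch that builds the three arrays from a row-major dense matrix and reads one row back. Both `dense_to_csr` and `csr_row` are hypothetical helpers, not part of the xgboost crate, and treating zeros as absent entries is an assumption made for illustration.

```rust
/// Convert a row-major dense matrix into CSR arrays `(indptr, indices, data)`.
/// Zero entries are treated as absent (an assumption for this sketch).
/// (Hypothetical helper; not part of the xgboost crate.)
fn dense_to_csr(
    dense: &[f32],
    num_rows: usize,
    num_cols: usize,
) -> (Vec<usize>, Vec<usize>, Vec<f32>) {
    assert_eq!(dense.len(), num_rows * num_cols);
    let mut indptr: Vec<usize> = vec![0];
    let mut indices: Vec<usize> = Vec::new();
    let mut data: Vec<f32> = Vec::new();
    for row in 0..num_rows {
        for col in 0..num_cols {
            let value = dense[row * num_cols + col];
            if value != 0.0 {
                indices.push(col); // column index of this stored entry
                data.push(value);  // value of this stored entry
            }
        }
        // indptr[row + 1] marks where this row's entries end.
        indptr.push(indices.len());
    }
    (indptr, indices, data)
}

/// Return the column indices and values stored for row `i`.
fn csr_row<'a>(
    indptr: &[usize],
    indices: &'a [usize],
    data: &'a [f32],
    i: usize,
) -> (&'a [usize], &'a [f32]) {
    let (start, end) = (indptr[i], indptr[i + 1]);
    (&indices[start..end], &data[start..end])
}
```

Applied to the 3x3 matrix `[[1.0, 0.0, 2.0], [0.0, 0.0, 3.0], [4.0, 5.0, 6.0]]`, this yields `indptr = [0, 2, 3, 6]`, `indices = [0, 2, 2, 0, 1, 2]` and `data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]`.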
pub fn from_csc(
    indptr: &[usize],
    indices: &[usize],
    data: &[f32],
    num_rows: Option<usize>,
) -> XGBResult<Self>
Create a new DMatrix from a sparse CSC matrix.
Uses the standard CSC representation, where the row indices for column i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
If num_rows is set to None, the number of rows will be inferred from the given data.
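The CSC layout stores entries column by column instead of row by row. The sketch below, using only the standard library, builds the three arrays from a row-major dense matrix; `dense_to_csc` is a hypothetical helper, not part of the xgboost crate, and treating zeros as absent entries is an assumption made for illustration.

```rust
/// Convert a row-major dense matrix into CSC arrays `(indptr, indices, data)`.
/// Zero entries are treated as absent (an assumption for this sketch).
/// (Hypothetical helper; not part of the xgboost crate.)
fn dense_to_csc(
    dense: &[f32],
    num_rows: usize,
    num_cols: usize,
) -> (Vec<usize>, Vec<usize>, Vec<f32>) {
    assert_eq!(dense.len(), num_rows * num_cols);
    let mut indptr: Vec<usize> = vec![0];
    let mut indices: Vec<usize> = Vec::new();
    let mut data: Vec<f32> = Vec::new();
    // Walk the matrix column by column, top to bottom.
    for col in 0..num_cols {
        for row in 0..num_rows {
            let value = dense[row * num_cols + col];
            if value != 0.0 {
                indices.push(row); // row index of this stored entry
                data.push(value);  // value of this stored entry
            }
        }
        // indptr[col + 1] marks where this column's entries end.
        indptr.push(indices.len());
    }
    (indptr, indices, data)
}
```

For the matrix `[[1.0, 0.0, 2.0], [0.0, 0.0, 3.0], [4.0, 5.0, 6.0]]`, this gives `indptr = [0, 2, 3, 6]`, `indices = [0, 2, 2, 0, 1, 2]` and `data = [1.0, 4.0, 5.0, 2.0, 3.0, 6.0]`.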
pub fn load<P: AsRef<Path>>(path: P) -> XGBResult<Self>
Create a new DMatrix from the given file.
Supports text files in LIBSVM or CSV format, and binary files written either by save or by another XGBoost library.
For more details on accepted formats, see the XGBoost input format documentation.
§LIBSVM format
Specifies data in a sparse format as:
<label> <index>:<value> [<index>:<value> ...]
E.g.
0 1:1 9:0 11:0
1 9:1 11:0.375 15:1
0 1:0 8:0.22 11:1
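To show how a line in this format breaks down, here is a minimal standard-library sketch that splits one line into its label and `(index, value)` pairs. `parse_libsvm_line` is a hypothetical helper, not part of the xgboost crate, and it is not how `load` reads files internally; it only illustrates the layout.

```rust
/// Parse one LIBSVM line into `(label, [(index, value), ...])`.
/// Returns `None` if the line is empty or any token is malformed.
/// (Hypothetical helper for illustration; not part of the xgboost crate.)
fn parse_libsvm_line(line: &str) -> Option<(f32, Vec<(usize, f32)>)> {
    let mut tokens = line.split_whitespace();
    // The first token is the label.
    let label = tokens.next()?.parse::<f32>().ok()?;
    // Every remaining token has the form `<index>:<value>`.
    let mut features: Vec<(usize, f32)> = Vec::new();
    for token in tokens {
        let (index, value) = token.split_once(':')?;
        features.push((index.parse::<usize>().ok()?, value.parse::<f32>().ok()?));
    }
    Some((label, features))
}
```

For the line `1 9:1 11:0.375 15:1`, this returns the label `1.0` and the pairs `(9, 1.0)`, `(11, 0.375)` and `(15, 1.0)`.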
pub fn save<P: AsRef<Path>>(&self, path: P) -> XGBResult<()>
Serialise this DMatrix as a binary file to the given path.
pub fn slice(&self, indices: &[usize]) -> XGBResult<DMatrix>
Get a new DMatrix containing only the given indices.
pub fn get_root_index(&self) -> XGBResult<&[u32]>
Gets the specified root index of each instance. This can be used in a multi-task setting.
See the XGBoost documentation for more information.
pub fn set_root_index(&mut self, array: &[u32]) -> XGBResult<()>
Sets the specified root index of each instance. This can be used in a multi-task setting.
See the XGBoost documentation for more information.
pub fn get_labels(&self) -> XGBResult<&[f32]>
Get ground truth labels for each row of this matrix.
pub fn set_labels(&mut self, array: &[f32]) -> XGBResult<()>
Set ground truth labels for each row of this matrix.
pub fn get_weights(&self) -> XGBResult<&[f32]>
Get weights of each instance.
pub fn set_weights(&mut self, array: &[f32]) -> XGBResult<()>
Set weights of each instance.
pub fn get_base_margin(&self) -> XGBResult<&[f32]>
Get base margin.
pub fn set_base_margin(&mut self, array: &[f32]) -> XGBResult<()>
Set base margin.
If specified, XGBoost will start from this margin. This can be used to specify an initial prediction to boost from.