Struct DMatrix

pub struct DMatrix { /* private fields */ }

Data matrix used throughout XGBoost for training/predicting Booster models.

It’s used as a container for both features (i.e. a row for every instance), and an optional true label for that instance (as an f32 value).

Can be created from files, or from dense or sparse (CSR or CSC) matrices.

§Examples

§Load from file

Load matrix from file in LIBSVM or binary format.

use xgboost::DMatrix;

let dmat = DMatrix::load("somefile.txt").unwrap();

§Create from dense array

use xgboost::DMatrix;

let data = &[1.0, 0.5, 0.2, 0.2,
             0.7, 1.0, 0.1, 0.1,
             0.2, 0.0, 0.0, 1.0];
let num_rows = 3;
let mut dmat = DMatrix::from_dense(data, num_rows).unwrap();
assert_eq!(dmat.shape(), (3, 4));

// set true labels for each row
dmat.set_labels(&[1.0, 0.0, 1.0]).unwrap();

§Create from sparse CSR matrix

Create from sparse representation of

[[1.0, 0.0, 2.0],
 [0.0, 0.0, 3.0],
 [4.0, 5.0, 6.0]]

use xgboost::DMatrix;

let indptr = &[0, 2, 3, 6];
let indices = &[0, 2, 2, 0, 1, 2];
let data = &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let dmat = DMatrix::from_csr(indptr, indices, data, None).unwrap();
assert_eq!(dmat.shape(), (3, 3));
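
The index arrays in the CSR example above can be derived mechanically from the dense form. The following is a minimal sketch using only the standard library. `dense_to_csr` is a hypothetical helper, not part of the xgboost crate, and its choice to skip zero entries is an assumption of the sketch.

```rust
/// Convert a dense row-major matrix into CSR arrays `(indptr, indices, data)`.
///
/// Hypothetical helper for illustration only. Zero entries are skipped, and
/// `num_cols` must be non-zero and evenly divide `dense.len()`.
fn dense_to_csr(dense: &[f32], num_cols: usize) -> (Vec<usize>, Vec<usize>, Vec<f32>) {
    assert!(num_cols > 0 && dense.len() % num_cols == 0);
    let mut indptr = vec![0];
    let mut indices = Vec::new();
    let mut data = Vec::new();
    for row in dense.chunks(num_cols) {
        for (col, &value) in row.iter().enumerate() {
            if value != 0.0 {
                indices.push(col);
                data.push(value);
            }
        }
        // Each row ends where the next one begins.
        indptr.push(indices.len());
    }
    (indptr, indices, data)
}
```

Applied to the 3 x 3 matrix above, this should yield the same `indptr`, `indices` and `data` arrays that are passed to `from_csr` in the example.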

Implementations§

impl DMatrix

pub fn from_dense(data: &[f32], num_rows: usize) -> XGBResult<Self>

Create a new DMatrix from dense array in row-major order.

E.g. the matrix

[[1.0, 2.0],
 [3.0, 4.0],
 [5.0, 6.0]]

would be converted into a DMatrix with

use xgboost::DMatrix;

let data = &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let num_rows = 3;
let dmat = DMatrix::from_dense(data, num_rows).unwrap();
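
To make the row-major layout concrete, here is a small standard-library sketch of how a flat buffer maps to rows and columns. Both `dense_num_cols` and `element_at` are hypothetical helpers written for this illustration. How `from_dense` itself validates the buffer length is not stated in these docs.

```rust
/// Number of columns implied by a row-major buffer with `num_rows` rows.
/// Returns `None` if `num_rows` is zero or the buffer does not divide evenly.
/// (Hypothetical helper, not part of the xgboost crate.)
fn dense_num_cols(data: &[f32], num_rows: usize) -> Option<usize> {
    if num_rows == 0 || data.len() % num_rows != 0 {
        return None;
    }
    Some(data.len() / num_rows)
}

/// Element at (`row`, `col`) of a row-major buffer with `num_cols` columns.
/// Panics if the position lies outside the buffer.
fn element_at(data: &[f32], num_cols: usize, row: usize, col: usize) -> f32 {
    data[row * num_cols + col]
}
```

For the six-element buffer above with `num_rows = 3`, the implied column count is 2, and the element at row 1, column 0 sits at flat index `1 * 2 + 0`.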

pub fn from_csr( indptr: &[usize], indices: &[usize], data: &[f32], num_cols: Option<usize>, ) -> XGBResult<Self>

Create a new DMatrix from a sparse CSR matrix.

Uses standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].

If num_cols is set to None, number of columns will be inferred from given data.
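
The slicing rule above can be written out directly. This is a sketch using only the standard library, and both function names are hypothetical. In particular, `csr_num_cols` shows one plausible inference rule (largest column index plus one). That rule is an assumption, not a statement of what the library does internally.

```rust
/// Return the `(column, value)` pairs stored for row `i` of a CSR matrix.
/// (Hypothetical helper, not part of the xgboost crate.)
fn csr_row(indptr: &[usize], indices: &[usize], data: &[f32], i: usize) -> Vec<(usize, f32)> {
    // Column indices and values for row `i` share the same range.
    let (start, end) = (indptr[i], indptr[i + 1]);
    indices[start..end]
        .iter()
        .copied()
        .zip(data[start..end].iter().copied())
        .collect()
}

/// Assumed inference rule: largest column index plus one (0 if empty).
fn csr_num_cols(indices: &[usize]) -> usize {
    indices.iter().max().map_or(0, |&m| m + 1)
}
```

The number of rows needs no inference in CSR, since it is always `indptr.len() - 1`.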

pub fn from_csc( indptr: &[usize], indices: &[usize], data: &[f32], num_rows: Option<usize>, ) -> XGBResult<Self>

Create a new DMatrix from a sparse CSC matrix.

Uses standard CSC representation where the row indices for column i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].

If num_rows is set to None, number of rows will be inferred from given data.
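
As a rough illustration of the CSC layout, the sketch below expands CSC arrays back into a dense row-major buffer. `csc_to_dense` is a hypothetical standard-library helper, and filling unstored entries with 0.0 is an assumption made for readability.

```rust
/// Expand a CSC matrix into a dense row-major buffer with `num_rows` rows.
/// Entries that are not stored are filled with 0.0 (an assumption of this sketch).
/// (Hypothetical helper, not part of the xgboost crate.)
fn csc_to_dense(indptr: &[usize], indices: &[usize], data: &[f32], num_rows: usize) -> Vec<f32> {
    let num_cols = indptr.len().saturating_sub(1);
    let mut dense = vec![0.0; num_rows * num_cols];
    for col in 0..num_cols {
        // Row indices and values for this column share the same range.
        for k in indptr[col]..indptr[col + 1] {
            dense[indices[k] * num_cols + col] = data[k];
        }
    }
    dense
}
```

For the matrix `[[1.0, 0.0, 2.0], [0.0, 0.0, 3.0], [4.0, 5.0, 6.0]]`, the CSC arrays would be `indptr = [0, 2, 3, 6]`, `indices = [0, 2, 2, 0, 1, 2]` and `data = [1.0, 4.0, 5.0, 2.0, 3.0, 6.0]`. The values are now ordered column by column.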

pub fn load<P: AsRef<Path>>(path: P) -> XGBResult<Self>

Create a new DMatrix from given file.

Supports text files in LIBSVM or CSV format, and binary files written either by save or by another XGBoost library.

For more details on accepted formats, see the XGBoost input format documentation.

§LIBSVM format

Specifies data in a sparse format as:

<label> <index>:<value> [<index>:<value> ...]

E.g.

0 1:1 9:0 11:0
1 9:1 11:0.375 15:1
0 1:0 8:0.22 11:1
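
To show how a line in this format breaks down, here is a minimal parser sketch using only the standard library. `parse_libsvm_line` is a hypothetical helper and is not how `load` is implemented. It handles only the plain `<label> <index>:<value>` form shown above.

```rust
/// Parse one LIBSVM line into its label and `(index, value)` pairs.
/// Returns `None` if the label is missing or any token is malformed.
/// (Hypothetical helper, not part of the xgboost crate.)
fn parse_libsvm_line(line: &str) -> Option<(f32, Vec<(usize, f32)>)> {
    let mut tokens = line.split_whitespace();
    // The first token is the label.
    let label = tokens.next()?.parse::<f32>().ok()?;
    let mut features = Vec::new();
    // Every remaining token is an `index:value` pair.
    for token in tokens {
        let (index, value) = token.split_once(':')?;
        features.push((index.parse::<usize>().ok()?, value.parse::<f32>().ok()?));
    }
    Some((label, features))
}
```

Indices that do not appear on a line are simply absent from the returned pairs, which is what makes the format sparse.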

pub fn save<P: AsRef<Path>>(&self, path: P) -> XGBResult<()>

Serialise this DMatrix as a binary file to given path.

pub fn num_rows(&self) -> usize

Get the number of rows in this matrix.

pub fn num_cols(&self) -> usize

Get the number of columns in this matrix.

pub fn shape(&self) -> (usize, usize)

Get the shape (rows x columns) of this matrix.

pub fn slice(&self, indices: &[usize]) -> XGBResult<DMatrix>

Get a new DMatrix containing only the given indices.
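
Assuming the indices refer to rows (these docs do not say so explicitly), the effect can be pictured on a dense row-major buffer. `select_rows` below is a hypothetical standard-library sketch of that idea, not the crate's implementation.

```rust
/// Select the given rows from a dense row-major buffer, in the order requested.
/// Returns `None` if any requested row is out of range.
/// (Hypothetical helper, not part of the xgboost crate.)
fn select_rows(data: &[f32], num_cols: usize, rows: &[usize]) -> Option<Vec<f32>> {
    let mut out = Vec::with_capacity(rows.len() * num_cols);
    for &row in rows {
        // Checked arithmetic avoids overflow on very large indices.
        let start = row.checked_mul(num_cols)?;
        let end = start.checked_add(num_cols)?;
        out.extend_from_slice(data.get(start..end)?);
    }
    Some(out)
}
```

Under this reading, selecting rows `[0, 2]` of a 3 x 2 matrix gives a 2 x 2 matrix made of the first and last rows.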

pub fn get_root_index(&self) -> XGBResult<&[u32]>

Gets the specified root index of each instance. Can be used in a multi-task setting.

See the XGBoost documentation for more information.

pub fn set_root_index(&mut self, array: &[u32]) -> XGBResult<()>

Sets the specified root index of each instance. Can be used in a multi-task setting.

See the XGBoost documentation for more information.

pub fn get_labels(&self) -> XGBResult<&[f32]>

Get ground truth labels for each row of this matrix.

pub fn set_labels(&mut self, array: &[f32]) -> XGBResult<()>

Set ground truth labels for each row of this matrix.

pub fn get_weights(&self) -> XGBResult<&[f32]>

Get weights of each instance.

pub fn set_weights(&mut self, array: &[f32]) -> XGBResult<()>

Set weights of each instance.

pub fn get_base_margin(&self) -> XGBResult<&[f32]>

Get base margin.

pub fn set_base_margin(&mut self, array: &[f32]) -> XGBResult<()>

Set base margin.

If specified, XGBoost will start from this margin. Can be used to specify an initial prediction to boost from.
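
A margin is a raw score rather than a transformed prediction. As a sketch, assuming a logistic objective where margins are log-odds (the text above does not name an objective, so this is an assumption), an initial probability could be converted to a margin as follows. Both helpers are hypothetical and use only the standard library.

```rust
/// Convert a probability in (0, 1) into a log-odds margin.
/// Assumes a logistic objective. (Hypothetical helper, not part of the xgboost crate.)
fn probability_to_margin(p: f32) -> f32 {
    (p / (1.0 - p)).ln()
}

/// Inverse of `probability_to_margin`: the logistic sigmoid.
fn margin_to_probability(margin: f32) -> f32 {
    1.0 / (1.0 + (-margin).exp())
}
```

Under that assumption, a prior probability of 0.5 corresponds to a margin of 0.0, and one margin per row would be collected into the slice passed to `set_base_margin`.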

pub fn set_group(&mut self, group: &[u32]) -> XGBResult<()>

Set the index for the beginning and end of a group.

Needed when the learning task is ranking.

See the XGBoost documentation for more information.
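
Groups are often described either by their sizes or by their begin and end row offsets, and the text above does not say which form `set_group` expects, so consult the XGBoost documentation before relying on either. The hypothetical standard-library helper below only illustrates how the two forms relate.

```rust
/// Turn per-group sizes into boundary offsets, so that group `g` spans rows
/// `offsets[g]..offsets[g + 1]`.
/// (Hypothetical helper, not part of the xgboost crate.)
fn group_offsets(sizes: &[u32]) -> Vec<u32> {
    let mut offsets = Vec::with_capacity(sizes.len() + 1);
    let mut total = 0u32;
    offsets.push(total);
    for &size in sizes {
        // Running total marks where this group ends and the next begins.
        total += size;
        offsets.push(total);
    }
    offsets
}
```

For example, three query groups of sizes 3, 2 and 4 cover rows 0..3, 3..5 and 5..9 of the matrix.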

Trait Implementations§

impl Drop for DMatrix

fn drop(&mut self)

Executes the destructor for this type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.