Dataframe

Struct Dataframe 

pub struct Dataframe { /* private fields */ }

Dataframe that represents a collection of columns of different data types.

Used for managing data in an efficient way.

Implementations§

impl Dataframe

pub fn from_csv(path: String) -> Result<Self, ()>

Reads data from a CSV file, using a semicolon as the delimiter, and creates a Dataframe.

§Examples
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
§Arguments:
  • path: The path to the CSV file to read, as a String.
§Errors:
  • Returns Err(()) when the file is not found, for example because the path is incorrect.
§Returns:

A Result that contains either the new Dataframe (Self) or, on failure, the unit value ().
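
Because the error type is the unit type, a caller can learn that loading failed but not why. The following is a minimal sketch of handling a Result of that shape without unwrap. The functions read_source and describe_load are hypothetical stand-ins built on the standard library; they are not part of rustic_ml and only mirror the Result<_, ()> signature.

```rust
use std::fs;

/// Hypothetical stand-in for a loader that reports failure as `Err(())`.
/// It is not part of rustic_ml; it only mirrors the `Result<_, ()>` shape.
fn read_source(path: &str) -> Result<String, ()> {
    // Any I/O error (missing file, bad permissions) collapses into `()`.
    fs::read_to_string(path).map_err(|_| ())
}

/// Describe the outcome of a load attempt without panicking.
fn describe_load(path: &str) -> String {
    match read_source(path) {
        Ok(text) => format!("loaded {} bytes", text.len()),
        Err(()) => format!("could not read {}", path),
    }
}
```

Matching on the Result keeps the program running when the path is wrong, at the cost of losing the underlying I/O error detail.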

pub fn from_file(path: String, delimiter: char) -> Result<Self, ()>

Reads data from a file using the given delimiter, and creates a Dataframe

§Examples
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.txt");
let dataframe = Dataframe::from_file(path, ' ').unwrap();
§Arguments:
  • path: The path to the file to read, as a String.
  • delimiter: The character that separates the values in the file.
§Errors:
  • Returns Err(()) when the file is not found, for example because the path is incorrect.
§Returns:

A Result that contains either the new Dataframe (Self) or, on failure, the unit value ().
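
How a delimiter splits text into columns can be sketched as follows. This is an illustration only, not the rustic_ml parser: the helper parse_delimited, the assumption that the first line holds the column names, and the choice to store empty cells as None are all hypothetical.

```rust
/// Hypothetical helper, not the rustic_ml parser: split delimited text into
/// a header row and column-major cells. Empty cells become `None`.
fn parse_delimited(text: &str, delimiter: char) -> (Vec<String>, Vec<Vec<Option<String>>>) {
    let mut lines = text.lines();
    // Assumption: the first line holds the column names.
    let header: Vec<String> = match lines.next() {
        Some(line) => line.split(delimiter).map(|s| s.trim().to_string()).collect(),
        None => return (Vec::new(), Vec::new()),
    };
    let mut columns: Vec<Vec<Option<String>>> = vec![Vec::new(); header.len()];
    for line in lines {
        let mut cells = line.split(delimiter);
        for column in columns.iter_mut() {
            // Missing or empty cells are stored as `None`.
            let cell = cells.next().map(str::trim).filter(|s| !s.is_empty());
            column.push(cell.map(String::from));
        }
    }
    (header, columns)
}
```

Storing the data column by column matches the description of a Dataframe as a collection of columns, and makes per-column operations such as counting None values straightforward.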

pub fn to_csv(&self, _path: String) -> Result<(), ()>

pub fn column_names(&self) -> Vec<String>

Get all the column names for the Dataframe

§Examples
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();

let columns = dataframe.column_names();
assert!(columns == vec![
    "Barcelona", "Belgrade", "Berlin", "Brussels", "Bucharest", "Budapest",
    "Copenhagen", "Dublin", "Hamburg", "Istanbul", "Kyiv", "London",
    "Madrid", "Milan", "Moscow", "Munich", "Paris", "Prague",
    "Rome", "Saint Petersburg", "Sofia", "Stockholm", "Vienna", "Warsaw",
]);

pub fn print(&self)

pub fn print_full_table(&self)

pub fn rename_column(&mut self, index: usize, column_name: &str)

Rename the column at the given index to a new column name.

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let mut dataframe = Dataframe::from_csv(path).unwrap();

assert!(dataframe.has_column("Barcelona"));
assert!(!dataframe.has_column("Oslo"));

dataframe.rename_column(0, "Oslo");
assert!(dataframe.has_column("Oslo"));
§Errors

This method does not return an error. If there is no column at the given index, it does nothing. If the index is valid, the column is renamed.
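
The silent behavior for an invalid index can be sketched with a plain list of names. The functions rename_at and try_rename_at are hypothetical and are not the rustic_ml implementation; the second one shows how a caller could check the index first when it needs to know whether a rename took place.

```rust
/// Hypothetical sketch of a silent rename: out-of-range indexes are ignored.
fn rename_at(names: &mut [String], index: usize, new_name: &str) {
    // `get_mut` returns `None` for an invalid index, so nothing happens then.
    if let Some(name) = names.get_mut(index) {
        *name = new_name.to_string();
    }
}

/// Variant that reports whether a rename took place.
fn try_rename_at(names: &mut [String], index: usize, new_name: &str) -> bool {
    let in_range = index < names.len();
    rename_at(names, index, new_name);
    in_range
}
```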

pub fn head(&self)

Print the first 5 rows of the Dataframe.

If the Dataframe has fewer than 5 rows, the whole Dataframe is printed. Note that the current implementation does not take the terminal width into account.

§Examples
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();

dataframe.head();
§Errors

Does not return an error. If the Dataframe is empty, an informational message is printed instead.

pub fn tail(&self)

Print the last 5 rows of the Dataframe.

If the Dataframe has fewer than 5 rows, the whole Dataframe is printed.
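
The row selection that head and tail describe, including the case of fewer than 5 rows, can be sketched on a slice. The helpers first_rows and last_rows are hypothetical and only illustrate the bounds handling; they do no printing and are not taken from rustic_ml.

```rust
/// Hypothetical helper: the rows that a `head`-style preview would show.
fn first_rows<T>(rows: &[T], limit: usize) -> &[T] {
    // `min` keeps the slice in bounds when there are fewer rows than `limit`.
    &rows[..rows.len().min(limit)]
}

/// Companion helper for a `tail`-style preview.
fn last_rows<T>(rows: &[T], limit: usize) -> &[T] {
    // `saturating_sub` avoids underflow for short inputs.
    &rows[rows.len().saturating_sub(limit)..]
}
```

Both helpers return the whole input when it is shorter than the limit, and an empty slice when the input is empty.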

pub fn info(&self)

Prints information about each column in the Dataframe. For each column, it prints:

  • the column name
  • the column type
  • the count of None values
  • the count of Some values
  • the total number of rows
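
The counts in the list above can be computed in a single pass over a column. The sketch below assumes a column of Option<f32> values; the struct ColumnSummary and the function summarize are hypothetical names, and the column type is left out because how rustic_ml represents it is not shown here.

```rust
/// Hypothetical per-column summary; the field names are illustrative only.
#[derive(Debug, PartialEq)]
struct ColumnSummary {
    name: String,
    none_count: usize,
    some_count: usize,
    total: usize,
}

/// Count missing and present values in one column.
fn summarize(name: &str, values: &[Option<f32>]) -> ColumnSummary {
    let none_count = values.iter().filter(|v| v.is_none()).count();
    ColumnSummary {
        name: name.to_string(),
        none_count,
        // Every value is either `None` or `Some`, so the rest are `Some`.
        some_count: values.len() - none_count,
        total: values.len(),
    }
}
```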

pub fn memory_usage(&self) -> usize

Calculate the total memory used for the Dataframe

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.memory_usage() == 4608);
§Returns:

The total memory usage of all columns in the Dataframe in bytes.

pub fn has_rows(&self) -> bool

Check if the Dataframe has rows.

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();

assert!(dataframe.has_rows());
§Returns

Returns true if the Dataframe has at least one row. A row counts even if it contains None values.

pub fn has_records(&self) -> bool

Check if the Dataframe has any records.

A record is a row with no None values.

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();

assert!(dataframe.has_records());
§Returns

Returns true if there is at least one record, that is, a row with no None values.
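
The difference between a row and a record can be sketched as follows, reading "record" as a row with no None values. The Row alias and both functions are hypothetical free functions written for this illustration; they share names with the methods above but are not the rustic_ml implementation.

```rust
/// A row here is one `Option<f32>` per column (hypothetical representation).
type Row = Vec<Option<f32>>;

/// True if there is at least one row, even one containing `None`.
fn has_rows(rows: &[Row]) -> bool {
    !rows.is_empty()
}

/// True if at least one non-empty row has a value in every column.
fn has_records(rows: &[Row]) -> bool {
    rows.iter()
        .any(|row| !row.is_empty() && row.iter().all(|cell| cell.is_some()))
}
```

Under this reading, a table whose every row has a gap has rows but no records.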

pub fn has_columns(&self) -> bool

Check if the Dataframe has columns defined.

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_columns());

§Returns

Returns true if there is at least one DataColumn.

pub fn has_column(&self, column_name: &str) -> bool

Check if a column with given column name exists in the Dataframe

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();

assert!(dataframe.has_column("Barcelona"));
assert!(!dataframe.has_column("Oslo"));
§Returns

Returns true if a column with the given name exists.

pub fn drop_column(&mut self, column_name: &str)

Drop the column with the given column name

This method gives no feedback. After the call, assume that the column was either removed or never existed.

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let mut dataframe = Dataframe::from_csv(path).unwrap();

assert!(dataframe.has_column("Barcelona"));

dataframe.drop_column("Barcelona");
assert!(!dataframe.has_column("Barcelona"));

pub fn add_column<T: ToString>(&mut self, list: Vec<T>, column_name: &str)

Add a new column to the Dataframe

§Example
use rustic_ml::data_utils::dataframe::Dataframe;

let path = String::from("./datasets/european_cities.csv");
let mut dataframe = Dataframe::from_csv(path).unwrap();

dataframe.add_column(vec![1, 2, 3, 4], "custom_index_column");

pub fn add_record(&self)

pub fn get_column_type(&self, column_name: &str) -> Option<ColumnType>

Get the ColumnType for a given column.

§Example
use rustic_ml::data_utils::dataframe::Dataframe;
use rustic_ml::data_utils::dataframe::ColumnType;

let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_column("Barcelona"));
assert!(!dataframe.has_column("Oslo"));

assert!(dataframe.get_column_type("Barcelona") == Some(ColumnType::Float));
assert!(dataframe.get_column_type("Oslo") == None);        
§Returns

Returns None if no column has the given name, otherwise the ColumnType of the column with that name.

pub fn float_feature(&self, column_name: &str) -> Option<Vec<Option<f32>>>

Extract a single feature of floats into a Vec<Option<f32>>

Creates a clone of the column. Values within the vector might be None. Use the column name to identify the column that will be extracted.

pub fn float_features(&self, first_column_name: &str, second_column_name: &str) -> Option<Vec<Option<(f32, f32)>>>

Extract two sets of features into a single vector of tuples (Vec<Option<(f32, f32)>>).

Creates a clone of both columns. Values within the vector might be None: a row in the vector is None if the value in either column is None. Use the column names to identify the columns to extract.

§Returns

Returns a Vec<Option<(f32, f32)>> created from the two features. Returns None if the two feature vectors are not the same length or if either column does not exist.
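
The pairing rules can be sketched on two plain slices. The function pair_features is hypothetical and not the rustic_ml implementation; a missing column is modelled here as an outer None argument, which is an assumption made for the example.

```rust
/// Hypothetical sketch: pair two float columns row by row.
fn pair_features(
    first: Option<&[Option<f32>]>,
    second: Option<&[Option<f32>]>,
) -> Option<Vec<Option<(f32, f32)>>> {
    // `?` returns `None` when either column is missing.
    let (first, second) = (first?, second?);
    // Columns of different lengths cannot be paired row by row.
    if first.len() != second.len() {
        return None;
    }
    let pairs = first
        .iter()
        .zip(second)
        // `Option::zip` yields `Some` only when both cells hold a value.
        .map(|(&a, &b)| a.zip(b))
        .collect();
    Some(pairs)
}
```

The outer Option reports whether the pairing was possible at all, while each inner Option reports whether that particular row had both values.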

pub fn at_str(&self, column_name: &str, row_index: usize) -> Option<String>

Get the value at given column and given row index.

§Returns

The value as a String or None if:

  • there was no column with that name
  • the given row index was out of bounds
  • the value at that entry was None
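
The three None cases listed above can be sketched as a chain of lookups. The Columns alias and the function cell_at are hypothetical; storing columns in a HashMap keyed by name is an assumption made for the example, not a statement about how rustic_ml stores its data.

```rust
use std::collections::HashMap;

/// Hypothetical storage: column name -> cells, where a cell may be missing.
type Columns = HashMap<String, Vec<Option<String>>>;

/// Look up one cell; every failure mode collapses into `None`.
fn cell_at(columns: &Columns, column_name: &str, row_index: usize) -> Option<String> {
    columns
        .get(column_name)? // no column with that name
        .get(row_index)? // row index out of bounds
        .clone() // `None` if the cell itself holds no value
}
```

Since all three cases return None, a caller that needs to tell them apart has to check for the column and the row count separately.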

pub fn at_index_str(&self) -> Option<&str>

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V