pub struct Dataframe { /* private fields */ }
Dataframe that represents a collection of columns of different data types.
Used for managing data in an efficient way.
Implementations§
impl Dataframe
pub fn from_csv(path: String) -> Result<Self, ()>
Reads data from a CSV file using a semicolon as the delimiter, and creates a Dataframe
§Examples
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
§Arguments:
path: The path parameter is a String that represents the file path to the CSV file that you want to read from.
§Errors:
- Returns an error when the file is not found, for example because the path is incorrect.
§Returns:
The from_csv function returns a Result containing either a Dataframe (represented by Self)
on success or an empty tuple () on failure.
pub fn from_file(path: String, delimiter: char) -> Result<Self, ()>
Reads data from a file using the given delimiter, and creates a Dataframe
§Examples
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.txt");
let dataframe = Dataframe::from_file(path, ' ').unwrap();
§Arguments:
path: The path parameter is a String that represents the file path to the file that you want to read from.
delimiter: The character that separates the fields of each record.
§Errors:
- Returns an error when the file is not found, for example because the path is incorrect.
§Returns:
The from_file function returns a Result containing either a Dataframe (represented by Self)
on success or an empty tuple () on failure.
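To illustrate what the delimiter argument controls, here is a minimal, self-contained sketch of splitting delimited text into rows of fields. The helper name `split_records` is hypothetical and not part of rustic_ml; trimming fields and skipping blank lines are assumptions made for this sketch, and the crate's actual parsing may differ.

```rust
/// Splits delimited text into rows of fields.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn split_records(text: &str, delimiter: char) -> Vec<Vec<String>> {
    text.lines()
        // Assumption: blank lines carry no record and are skipped.
        .filter(|line| !line.trim().is_empty())
        // Split each remaining line on the delimiter and trim every field.
        .map(|line| {
            line.split(delimiter)
                .map(|field| field.trim().to_string())
                .collect()
        })
        .collect()
}
```

With ';' as the delimiter this mirrors the behaviour described for from_csv, while any other character, such as ' ', corresponds to the from_file case.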
pub fn to_csv(&self, _path: String) -> Result<(), ()>
pub fn column_names(&self) -> Vec<String>
Get all the column names for the Dataframe
§Examples
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
let columns = dataframe.column_names();
assert!(columns == vec!["Barcelona","Belgrade","Berlin","Brussels", "Bucharest","Budapest","Copenhagen","Dublin","Hamburg","Istanbul","Kyiv","London","Madrid","Milan","Moscow","Munich","Paris","Prague","Rome","Saint Petersburg","Sofia","Stockholm","Vienna","Warsaw"])
pub fn print(&self)
pub fn print_full_table(&self)
pub fn rename_column(&mut self, index: usize, column_name: &str)
Rename the column at given index to a new column name
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let mut dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_column("Barcelona"));
assert!(!dataframe.has_column("Oslo"));
dataframe.rename_column(0, "Oslo");
assert!(dataframe.has_column("Oslo"));
§Errors
This method does not return an error. If there is no column at the given index, it does nothing. If a valid index is given, you can assume that the column was renamed.
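The silent no-op on an invalid index can be sketched on a plain list of names. This is only an illustration of the documented behaviour; `rename_at` is a hypothetical helper, and the crate's internal column storage is not shown in this documentation.

```rust
/// Renames the entry at `index`; does nothing when `index` is out of range.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn rename_at(names: &mut [String], index: usize, new_name: &str) {
    // `get_mut` returns None for an invalid index, so no panic can occur.
    if let Some(name) = names.get_mut(index) {
        *name = new_name.to_string();
    }
}
```

Because nothing is reported back, a caller who needs certainty can check has_column with the new name after the call.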
pub fn head(&self)
Print the first 5 rows of the Dataframe.
If the Dataframe has fewer than 5 rows, it prints the whole Dataframe.
Note that the current implementation does not take the terminal width into account.
§Examples
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
dataframe.head();
§Errors
Does not return an error. If the Dataframe is empty, it prints an informational message instead.
pub fn tail(&self)
Print the last 5 rows of the Dataframe.
If the Dataframe has fewer than 5 rows, it prints the whole Dataframe.
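The "first 5 / last 5, or everything if shorter" rule used by head and tail can be sketched with slices. The helpers `first_rows` and `last_rows` are hypothetical and operate on a generic slice of rows; they are not the crate's implementation.

```rust
/// Returns the first `n` rows, or all rows when there are fewer than `n`.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn first_rows<T>(rows: &[T], n: usize) -> &[T] {
    // Clamp the end index so short inputs never cause an out-of-range slice.
    &rows[..rows.len().min(n)]
}

/// Returns the last `n` rows, or all rows when there are fewer than `n`.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn last_rows<T>(rows: &[T], n: usize) -> &[T] {
    // `saturating_sub` yields 0 instead of underflowing when n > len.
    &rows[rows.len().saturating_sub(n)..]
}
```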
pub fn info(&self)
Prints information about the columns in the Dataframe.
For each column, it prints the following information:
- column name
- type
- count of None values
- count of Some values
- total number of rows
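The None and Some counts listed above can be computed from a column of optional values as in the following sketch. `count_none_some` is a hypothetical helper written for this illustration; how info gathers these numbers internally is not documented here.

```rust
/// Returns `(none_count, some_count)` for a column of optional values.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn count_none_some<T>(column: &[Option<T>]) -> (usize, usize) {
    let none_count = column.iter().filter(|value| value.is_none()).count();
    // Every entry is either None or Some, so the rest are Some values.
    (none_count, column.len() - none_count)
}
```

The two counts always add up to the total number of rows in the column.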
pub fn memory_usage(&self) -> usize
Calculate the total memory used for the Dataframe
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.memory_usage() == 4608);
§Returns:
The total memory usage of all columns in the Dataframe in bytes.
pub fn has_records(&self) -> bool
Check if the Dataframe has any records.
A record is a row with no None values.
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_records());
§Returns
Returns true if the Dataframe has rows; those rows may contain None values.
pub fn has_columns(&self) -> bool
Check if the Dataframe has columns defined.
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_columns());
Returns true if there is at least one DataColumn.
pub fn has_column(&self, column_name: &str) -> bool
Check if a column with given column name exists in the Dataframe
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_column("Barcelona"));
assert!(!dataframe.has_column("Oslo"));
§Returns
True if there is a column that has the given column name.
pub fn drop_column(&mut self, column_name: &str)
Drop the column with the given column name
The method reports nothing back, so after the call you can only assume that the column was removed or that it never existed.
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let mut dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_column("Barcelona"));
dataframe.drop_column("Barcelona");
assert!(!dataframe.has_column("Barcelona"));
pub fn add_column<T: ToString>(&mut self, list: Vec<T>, column_name: &str)
Add a new column to the Dataframe
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
let path = String::from("./datasets/european_cities.csv");
let mut dataframe = Dataframe::from_csv(path).unwrap();
dataframe.add_column(vec![1, 2, 3, 4], "custom_index_column");
pub fn add_record(&self)
pub fn get_column_type(&self, column_name: &str) -> Option<ColumnType>
Get the ColumnType for a given column.
§Example
use rustic_ml::data_utils::dataframe::Dataframe;
use rustic_ml::data_utils::dataframe::ColumnType;
let path = String::from("./datasets/european_cities.csv");
let dataframe = Dataframe::from_csv(path).unwrap();
assert!(dataframe.has_column("Barcelona"));
assert!(!dataframe.has_column("Oslo"));
assert!(dataframe.get_column_type("Barcelona") == Some(ColumnType::Float));
assert!(dataframe.get_column_type("Oslo") == None);
§Returns
Returns None if no column has the given name, otherwise the ColumnType of the column with the given name.
pub fn float_feature(&self, column_name: &str) -> Option<Vec<Option<f32>>>
Extract a single feature of floats into a Vec<Option<f32>>
Creates a clone of the column. Values within the vector might be None. Use the column name to identify the column that will be extracted.
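Since the extracted vector may contain None entries, callers typically need to skip them before doing arithmetic. The sketch below shows one way to do that on a plain `Vec<Option<f32>>`; `mean_of_present` is a hypothetical helper and not a function offered by rustic_ml.

```rust
/// Averages the values that are present, ignoring None entries.
/// Returns None when the column holds no values at all.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn mean_of_present(values: &[Option<f32>]) -> Option<f32> {
    // `flatten` on an iterator of `&Option<f32>` yields only the Some values.
    let present: Vec<f32> = values.iter().flatten().copied().collect();
    if present.is_empty() {
        None
    } else {
        Some(present.iter().sum::<f32>() / present.len() as f32)
    }
}
```

The slice passed in would be the inner vector obtained after unwrapping the outer Option returned by float_feature.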
pub fn float_features(
    &self,
    first_column_name: &str,
    second_column_name: &str,
) -> Option<Vec<Option<(f32, f32)>>>
Extract two sets of features into a single vector of tuples (Vec<Option<(f32, f32)>>).
Creates a clone of the columns. Values within the vector might be None.
An entry in the vector is None if either of the two values in that row is None.
Use the column names to identify the columns that will be extracted.
§Returns
Returns a Vec<Option<(f32, f32)>> created from the two features.
Returns None if the two feature vectors are not the same length or if either vector does not exist.
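The pairing rule described above can be sketched on two already extracted columns. `pair_features` is a hypothetical stand-in written for this illustration; it follows the documented behaviour (None on a length mismatch, a None entry when either side is missing) but is not the crate's own code.

```rust
/// Pairs two columns row by row.
/// Returns None when the columns differ in length; an entry is None
/// when either value in that row is None.
/// Hypothetical helper for illustration only; not part of rustic_ml.
fn pair_features(
    first: &[Option<f32>],
    second: &[Option<f32>],
) -> Option<Vec<Option<(f32, f32)>>> {
    if first.len() != second.len() {
        return None;
    }
    let pairs = first
        .iter()
        .zip(second)
        .map(|(a, b)| match (a, b) {
            // Only rows with both values present produce a tuple.
            (Some(x), Some(y)) => Some((*x, *y)),
            _ => None,
        })
        .collect();
    Some(pairs)
}
```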
pub fn at_str(&self, column_name: &str, row_index: usize) -> Option<String>
Get the value at given column and given row index.
§Returns
The value as a String or None if:
- there was no column with that name
- the given row index was out of bounds
- the value at that entry was None
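The three None cases listed above collapse naturally with the `?` operator. The sketch below assumes, purely for illustration, that columns are stored as a map from name to a vector of optional strings; the real Dataframe keeps its fields private, and `value_at` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Looks up a value by column name and row index.
/// Hypothetical helper over an assumed storage layout; not part of rustic_ml.
fn value_at(
    columns: &HashMap<String, Vec<Option<String>>>,
    column_name: &str,
    row_index: usize,
) -> Option<String> {
    columns
        .get(column_name)? // None: there is no column with that name
        .get(row_index)? // None: the row index is out of bounds
        .clone() // None: the entry itself is None
}
```

A caller cannot tell the three cases apart from the return value alone, so has_column can be used first when the distinction matters.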