Struct MeanShift

pub struct MeanShift { /* private fields */ }

Mean Shift clustering algorithm implementation.

Mean Shift is a centroid-based clustering algorithm that works by iteratively shifting data points towards areas of higher density. Each data point moves in the direction of the mean of points within its current window until convergence. The algorithm does not require specifying the number of clusters in advance.
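The shifting step described above can be sketched with the standard library alone. This is an illustrative sketch, not the crate's implementation: it assumes a flat kernel (every point within `bandwidth` counts equally), and the helper names `sq_dist` and `shift_point` are hypothetical.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical helper: repeatedly moves `start` to the mean of all points
/// within `bandwidth` of it (flat kernel, an assumption of this sketch).
/// Stops when the point moves less than `tol` or after `max_iter` iterations.
/// Returns the final position and the number of iterations performed.
fn shift_point(
    start: &[f64],
    data: &[Vec<f64>],
    bandwidth: f64,
    max_iter: usize,
    tol: f64,
) -> (Vec<f64>, usize) {
    let mut current = start.to_vec();
    for iter in 0..max_iter {
        // Points inside the current window.
        let neighbors: Vec<&Vec<f64>> = data
            .iter()
            .filter(|p| sq_dist(p, &current) <= bandwidth * bandwidth)
            .collect();
        if neighbors.is_empty() {
            return (current, iter);
        }
        // Mean of the points inside the window.
        let mut mean = vec![0.0; current.len()];
        for p in &neighbors {
            for (m, v) in mean.iter_mut().zip(p.iter()) {
                *m += v;
            }
        }
        for m in mean.iter_mut() {
            *m /= neighbors.len() as f64;
        }
        // Distance moved in this step decides convergence.
        let moved = sq_dist(&mean, &current).sqrt();
        current = mean;
        if moved < tol {
            return (current, iter + 1);
        }
    }
    (current, max_iter)
}
```

Running this from every seed and merging end points that land close together yields the cluster centers; the merging step is omitted here.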

§Fields

  • bandwidth - The kernel bandwidth parameter that determines the search radius. Larger values lead to fewer clusters.
  • max_iter - Maximum number of iterations to prevent infinite loops.
  • tol - Convergence tolerance threshold. Points are considered converged when they move less than this value.
  • bin_seeding - Whether to use the bin seeding strategy to speed up execution.
  • cluster_all - Whether to assign all points to clusters, including potential noise.
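To make the bin_seeding field more concrete, here is a minimal sketch of one common bin seeding scheme: snap each point to a grid and start the search from one seed per occupied cell instead of from every point. Whether this crate bins in exactly this way is an assumption, and `bin_seeds` is a hypothetical helper name.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: returns one seed per occupied grid cell.
/// Each coordinate is rounded to the nearest multiple of `bin_size`
/// (typically the bandwidth). Returns no seeds if `bin_size` is not positive.
fn bin_seeds(data: &[Vec<f64>], bin_size: f64) -> Vec<Vec<f64>> {
    if !(bin_size > 0.0) {
        return Vec::new();
    }
    // Integer cell coordinates; the set removes duplicate cells.
    let mut cells: BTreeSet<Vec<i64>> = BTreeSet::new();
    for p in data {
        let cell: Vec<i64> = p.iter().map(|v| (v / bin_size).round() as i64).collect();
        cells.insert(cell);
    }
    // Convert each cell back to a coordinate in data space.
    cells
        .into_iter()
        .map(|cell| cell.iter().map(|&c| c as f64 * bin_size).collect::<Vec<f64>>())
        .collect()
}
```

With tightly grouped data, the number of seeds is much smaller than the number of points, which is where the speedup comes from.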

§Examples

use rustyml::machine_learning::meanshift::MeanShift;
use ndarray::Array2;

// Create a 2D dataset
let data = Array2::<f64>::from_shape_vec((10, 2),
    vec![1.0, 2.0, 1.1, 2.2, 0.9, 1.9, 1.0, 2.1,
         10.0, 10.0, 10.2, 9.9, 10.1, 10.0, 9.9, 9.8,
         5.0, 5.0, 5.1, 4.9]).unwrap();

// Create a MeanShift instance with default parameters
let mut ms = MeanShift::default();

// Fit the model and predict cluster labels
let labels = ms.fit_predict(&data).unwrap();

// Get the cluster centers
let centers = ms.get_cluster_centers().unwrap();

§Notes

  • If unsure about an appropriate bandwidth value, use the estimate_bandwidth function.
  • The bandwidth parameter strongly affects the clustering result (larger values lead to fewer clusters) and should be chosen carefully based on data characteristics.
  • For large datasets, setting bin_seeding = true can improve performance.
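As a rough illustration of how a bandwidth can be derived from the data, the sketch below averages each point's distance to its k-th nearest neighbor. This is a generic heuristic written for illustration; it is not the crate's estimate_bandwidth function, whose signature and method are not shown on this page. The name `knn_bandwidth` and the `quantile` parameter are hypothetical.

```rust
/// Hypothetical helper (not the crate's `estimate_bandwidth`): averages, over
/// all points, the distance to the k-th nearest neighbor, where k is
/// `quantile` times the number of samples, clamped to at least 1.
/// Returns `None` for fewer than two points or a quantile outside [0, 1].
fn knn_bandwidth(data: &[Vec<f64>], quantile: f64) -> Option<f64> {
    let n = data.len();
    if n < 2 || !(0.0..=1.0).contains(&quantile) {
        return None;
    }
    // Neighbor rank; rank 0 would be the point itself.
    let k = ((n as f64 * quantile) as usize).clamp(1, n - 1);
    let mut total = 0.0;
    for p in data {
        // Distances from `p` to every point, including itself.
        let mut dists: Vec<f64> = data
            .iter()
            .map(|q| p.iter().zip(q).map(|(a, b)| (a - b).powi(2)).sum::<f64>().sqrt())
            .collect();
        dists.sort_by(|a, b| a.total_cmp(b));
        total += dists[k];
    }
    Some(total / n as f64)
}
```

This brute-force version is quadratic in the number of points, so it is only suitable for small datasets or a subsample.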

Implementations§

impl MeanShift

pub fn new( bandwidth: f64, max_iter: Option<usize>, tol: Option<f64>, bin_seeding: Option<bool>, cluster_all: Option<bool>, ) -> Self

Creates a new MeanShift instance with the specified parameters.

§Parameters
  • bandwidth - The bandwidth parameter that determines the size of the kernel.
  • max_iter - The maximum number of iterations for the mean shift algorithm.
  • tol - The convergence threshold for the algorithm.
  • bin_seeding - Whether to use bin seeding for initialization.
  • cluster_all - Whether to assign all points to clusters, even those far from any centroid.
§Returns
  • Self - A new MeanShift instance.

pub fn get_cluster_centers(&self) -> Result<Array2<f64>, ModelError>

Gets the cluster centers found by the algorithm.

§Returns
  • Ok(Array2<f64>) - The cluster centers as an ndarray Array2<f64>
  • Err(ModelError::NotFitted) - If the model has not been fitted yet
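The getters that return a Result all follow the same "not fitted" pattern. The sketch below shows that pattern with stand-in types. `SketchModel` and the local `ModelError` enum are hypothetical and only mimic the names on this page; they are not the crate's definitions.

```rust
/// Stand-in for the crate's error type; only the variant named above.
#[derive(Debug, PartialEq)]
enum ModelError {
    NotFitted,
}

/// Stand-in model: `centers` stays `None` until fitting has run.
struct SketchModel {
    centers: Option<Vec<Vec<f64>>>,
}

impl SketchModel {
    /// Returns a copy of the centers, or `NotFitted` if none are stored yet.
    fn get_cluster_centers(&self) -> Result<Vec<Vec<f64>>, ModelError> {
        self.centers.clone().ok_or(ModelError::NotFitted)
    }
}
```

Callers can then use `match` or the `?` operator on the result instead of calling `unwrap()`, which would panic on an unfitted model.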

pub fn get_labels(&self) -> Result<Array1<usize>, ModelError>

Gets the cluster labels assigned to each data point.

§Returns
  • Ok(Array1<usize>) - The cluster labels as an ndarray Array1<usize>
  • Err(ModelError::NotFitted) - If the model has not been fitted yet

pub fn get_n_iter(&self) -> Result<usize, ModelError>

Gets the number of iterations the algorithm performed.

§Returns
  • Ok(usize) - The number of iterations performed
  • Err(ModelError::NotFitted) - If the model has not been fitted yet

pub fn get_n_samples_per_center(&self) -> Result<Array1<usize>, ModelError>

Gets the number of samples per cluster center.

§Returns
  • Ok(Array1<usize>) - The number of samples per center as an ndarray Array1<usize>
  • Err(ModelError::NotFitted) - If the model has not been fitted yet

pub fn get_bandwidth(&self) -> f64

Gets the bandwidth parameter value.

§Returns
  • f64 - The bandwidth value.

pub fn get_max_iter(&self) -> usize

Gets the maximum number of iterations.

§Returns
  • usize - The maximum number of iterations.

pub fn get_tol(&self) -> f64

Gets the convergence tolerance.

§Returns
  • f64 - The tolerance value.

pub fn get_bin_seeding(&self) -> bool

Gets the bin seeding setting.

§Returns
  • bool - A boolean indicating whether bin seeding is enabled.

pub fn get_cluster_all(&self) -> bool

Gets the cluster_all setting.

§Returns
  • bool - A boolean indicating whether all points are assigned to clusters.

pub fn fit(&mut self, x: &Array2<f64>) -> Result<&mut Self, ModelError>

Fits the MeanShift clustering model to the input data.

§Parameters
  • x - The input data as an ndarray Array2<f64> where each row is a sample.
§Returns
  • Ok(&mut Self) - A mutable reference to the fitted model
  • Err(ModelError::InputValidationError(&str)) - If the input data fails validation

pub fn predict(&self, x: &Array2<f64>) -> Result<Array1<usize>, ModelError>

Predicts cluster labels for the input data.

§Parameters
  • x - The input data as an ndarray Array2<f64> where each row is a sample.
§Returns
  • Ok(Array1<usize>) - The predicted cluster labels
  • Err(ModelError::NotFitted) - If the model has not been fitted yet
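Prediction for a centroid-based model is typically a nearest-center lookup. The sketch below shows that idea with the standard library only. It assumes every point is assigned to its closest center (the behavior suggested by cluster_all = true); how the crate treats points far from any center is not documented here. The helper names `nearest_center` and `assign_labels` are hypothetical.

```rust
/// Hypothetical helper: index of the center closest to `point`
/// (squared Euclidean distance), or `None` if `centers` is empty.
fn nearest_center(point: &[f64], centers: &[Vec<f64>]) -> Option<usize> {
    centers
        .iter()
        .enumerate()
        .map(|(i, c)| {
            let d: f64 = point.iter().zip(c).map(|(a, b)| (a - b).powi(2)).sum();
            (i, d)
        })
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}

/// Labels every row of `data` with the index of its nearest center.
/// Returns `None` if a row cannot be labeled because there are no centers.
fn assign_labels(data: &[Vec<f64>], centers: &[Vec<f64>]) -> Option<Vec<usize>> {
    data.iter().map(|p| nearest_center(p, centers)).collect()
}
```

The returned indices refer to rows of the centers array, matching the idea that each label identifies one cluster center.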

pub fn fit_predict( &mut self, x: &Array2<f64>, ) -> Result<Array1<usize>, ModelError>

Fits the model to the input data and predicts cluster labels.

§Parameters
  • x - The input data as an ndarray Array2<f64> where each row is a sample.
§Returns
  • Ok(Array1<usize>) - The predicted cluster labels
  • Err(ModelError::InputValidationError(&str)) - If the input data fails validation

Trait Implementations§

impl Clone for MeanShift

fn clone(&self) -> MeanShift

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for MeanShift

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for MeanShift

fn default() -> Self

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V