pub struct MeanShift { /* private fields */ }
Mean Shift clustering algorithm implementation.
Mean Shift is a centroid-based clustering algorithm that works by iteratively shifting data points towards areas of higher density. Each data point moves in the direction of the mean of points within its current window until convergence. The algorithm does not require specifying the number of clusters in advance.
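The shifting procedure described above can be sketched with the standard library alone. This is a minimal illustration, not rustyml's implementation: it assumes a flat kernel (every point within `bandwidth` counts equally) and merges converged points that land within one bandwidth of each other. The helper names `distance`, `mean_shift_modes`, and `merge_modes` are hypothetical.

```rust
/// Euclidean distance between two points of equal dimension.
fn distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Shifts each point to the mean of its neighbours (flat kernel, assumed)
/// until it moves less than `tol` or `max_iter` iterations have run.
fn mean_shift_modes(data: &[Vec<f64>], bandwidth: f64, max_iter: usize, tol: f64) -> Vec<Vec<f64>> {
    data.iter()
        .map(|start| {
            let mut current = start.clone();
            for _ in 0..max_iter {
                // All data points inside the current window.
                let neighbours: Vec<&Vec<f64>> = data
                    .iter()
                    .filter(|p| distance(p.as_slice(), &current) <= bandwidth)
                    .collect();
                if neighbours.is_empty() {
                    break;
                }
                // Mean of the window becomes the new position.
                let mut mean = vec![0.0; current.len()];
                for p in &neighbours {
                    for (m, v) in mean.iter_mut().zip(p.iter()) {
                        *m += *v;
                    }
                }
                for m in mean.iter_mut() {
                    *m /= neighbours.len() as f64;
                }
                let moved = distance(&mean, &current);
                current = mean;
                if moved < tol {
                    break;
                }
            }
            current
        })
        .collect()
}

/// Groups converged positions closer than `bandwidth` into one cluster
/// (an assumed merge rule) and returns the centers plus one label per point.
fn merge_modes(modes: &[Vec<f64>], bandwidth: f64) -> (Vec<Vec<f64>>, Vec<usize>) {
    let mut centers: Vec<Vec<f64>> = Vec::new();
    let mut labels = Vec::with_capacity(modes.len());
    for mode in modes {
        match centers.iter().position(|c| distance(c, mode) < bandwidth) {
            Some(idx) => labels.push(idx),
            None => {
                centers.push(mode.clone());
                labels.push(centers.len() - 1);
            }
        }
    }
    (centers, labels)
}

fn main() {
    let data = vec![
        vec![1.0, 2.0], vec![1.1, 2.2], vec![0.9, 1.9], vec![1.0, 2.1],
        vec![10.0, 10.0], vec![10.2, 9.9], vec![10.1, 10.0], vec![9.9, 9.8],
        vec![5.0, 5.0], vec![5.1, 4.9],
    ];
    let modes = mean_shift_modes(&data, 2.0, 300, 1e-5);
    let (centers, labels) = merge_modes(&modes, 2.0);
    println!("centers: {:?}", centers);
    println!("labels: {:?}", labels);
}
```

The number of clusters is never passed in; it falls out of how many distinct positions the points converge to, which is why the bandwidth choice matters so much.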
§Fields
- bandwidth - The kernel bandwidth parameter that determines the search radius. Larger values lead to fewer clusters.
- max_iter - Maximum number of iterations to prevent infinite loops.
- tol - Convergence tolerance threshold. Points are considered converged when they move less than this value.
- bin_seeding - Whether to use bin seeding strategy for faster algorithm execution.
- cluster_all - Whether to assign all points to clusters, including potential noise.
§Examples
use rustyml::machine_learning::meanshift::MeanShift;
use ndarray::Array2;
// Create a 2D dataset
let data = Array2::<f64>::from_shape_vec((10, 2),
vec![1.0, 2.0, 1.1, 2.2, 0.9, 1.9, 1.0, 2.1,
10.0, 10.0, 10.2, 9.9, 10.1, 10.0, 9.9, 9.8,
5.0, 5.0, 5.1, 4.9]).unwrap();
// Create a MeanShift instance with default parameters
let mut ms = MeanShift::default();
// Fit the model and predict cluster labels
let labels = ms.fit_predict(&data).unwrap();
// Get the cluster centers
let centers = ms.get_cluster_centers().unwrap();
§Notes
- If unsure about an appropriate bandwidth value, use the estimate_bandwidth function.
- The bandwidth parameter significantly affects algorithm performance and should be chosen carefully based on data characteristics.
- For large datasets, setting bin_seeding = true can improve performance.
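To illustrate why bin seeding can help on large datasets, here is a minimal sketch of one common seeding scheme: snap every point to a coarse grid and start the shifting procedure only from occupied grid cells rather than from every point. Whether rustyml bins points exactly this way is an assumption; the function `bin_seeds` and its `bin_size` and `min_bin_freq` parameters are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Snaps each point to a grid with cell size `bin_size` and returns one seed
/// per grid cell that contains at least `min_bin_freq` points.
fn bin_seeds(data: &[Vec<f64>], bin_size: f64, min_bin_freq: usize) -> Vec<Vec<f64>> {
    // Count how many points fall into each grid cell.
    let mut counts: HashMap<Vec<i64>, usize> = HashMap::new();
    for point in data {
        let key: Vec<i64> = point.iter().map(|v| (v / bin_size).round() as i64).collect();
        *counts.entry(key).or_insert(0) += 1;
    }

    // Keep sufficiently populated cells, sorted for a deterministic order.
    let mut keys: Vec<Vec<i64>> = counts
        .into_iter()
        .filter(|(_, n)| *n >= min_bin_freq)
        .map(|(key, _)| key)
        .collect();
    keys.sort();

    // Convert grid coordinates back to positions in data space.
    keys.iter()
        .map(|key| key.iter().map(|k| *k as f64 * bin_size).collect())
        .collect()
}

fn main() {
    let data = vec![
        vec![0.1, 0.2], vec![0.2, -0.1], vec![-0.3, 0.1],
        vec![5.1, 4.9], vec![4.8, 5.2],
    ];
    let seeds = bin_seeds(&data, 1.0, 1);
    println!("{} points reduced to {} seeds: {:?}", data.len(), seeds.len(), seeds);
}
```

With many points per cell, the number of starting positions drops well below the number of samples, so far fewer shift trajectories need to be computed.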
Implementations§
impl MeanShift
pub fn new(
    bandwidth: f64,
    max_iter: Option<usize>,
    tol: Option<f64>,
    bin_seeding: Option<bool>,
    cluster_all: Option<bool>,
) -> Self
Creates a new MeanShift instance with the specified parameters.
§Parameters
- bandwidth - The bandwidth parameter that determines the size of the kernel.
- max_iter - The maximum number of iterations for the mean shift algorithm.
- tol - The convergence threshold for the algorithm.
- bin_seeding - Whether to use bin seeding for initialization.
- cluster_all - Whether to assign all points to clusters, even those far from any centroid.
§Returns
- Self - A new MeanShift instance.
pub fn get_cluster_centers(&self) -> Result<Array2<f64>, ModelError>
Gets the cluster centers found by the algorithm.
§Returns
- Ok(Array2<f64>) - The cluster centers as an ndarray Array2<f64>.
- Err(ModelError::NotFitted) - If the model has not been fitted yet.
pub fn get_labels(&self) -> Result<Array1<usize>, ModelError>
Gets the cluster labels assigned to each data point.
§Returns
- Ok(Array1<usize>) - The cluster labels as an ndarray Array1<usize>.
- Err(ModelError::NotFitted) - If the model has not been fitted yet.
pub fn get_n_iter(&self) -> Result<usize, ModelError>
Gets the number of iterations the algorithm performed.
§Returns
- Ok(usize) - The number of iterations performed.
- Err(ModelError::NotFitted) - If the model has not been fitted yet.
pub fn get_n_samples_per_center(&self) -> Result<Array1<usize>, ModelError>
Gets the number of samples per cluster center.
§Returns
- Ok(Array1<usize>) - The number of samples per center as an ndarray Array1<usize>.
- Err(ModelError::NotFitted) - If the model has not been fitted yet.
pub fn get_bandwidth(&self) -> f64
pub fn get_max_iter(&self) -> usize
pub fn get_bin_seeding(&self) -> bool
pub fn get_cluster_all(&self) -> bool
Gets the cluster_all setting.
§Returns
- bool - A boolean indicating whether all points are assigned to clusters.
pub fn fit_predict(
    &mut self,
    x: &Array2<f64>,
) -> Result<Array1<usize>, ModelError>
Trait Implementations§
Auto Trait Implementations§
impl Freeze for MeanShift
impl RefUnwindSafe for MeanShift
impl Send for MeanShift
impl Sync for MeanShift
impl Unpin for MeanShift
impl UnwindSafe for MeanShift
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.