Struct hdbscan::HyperParamBuilder
pub struct HyperParamBuilder { /* private fields */ }
Builder object to set custom hyper parameters.
Implementations§
impl HyperParamBuilder
pub fn min_cluster_size(self, min_cluster_size: usize) -> HyperParamBuilder
Sets the minimum cluster size - the minimum number of samples for a group of data points to be considered a cluster. If a grouping of data points has fewer members than this, then they will be considered noise. This should be considered the main hyper parameter for changing the results of clustering. Defaults to 5.
§Parameters
- min_cluster_size - the minimum cluster size
§Returns
- the hyper parameter configuration builder
pub fn max_cluster_size(self, max_cluster_size: usize) -> HyperParamBuilder
Sets the maximum cluster size - the maximum number of samples for a group of data points to be considered a cluster. A grouping of data points with more members than this will not be considered a cluster. By default, this value is not considered in clustering.
§Parameters
- max_cluster_size - the maximum cluster size
§Returns
- the hyper parameter configuration builder
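The two size bounds can be read together as a simple range check. The sketch below is illustrative only: `within_size_bounds` is a hypothetical helper, not part of the crate, and it assumes both bounds are inclusive and that an unset maximum is modelled as `None`. The real algorithm applies these limits while processing the cluster hierarchy, which this sketch does not attempt to reproduce.

```rust
/// Hypothetical helper (not part of the hdbscan crate): reports whether a
/// candidate group of `size` points lies inside the configured size bounds.
fn within_size_bounds(
    size: usize,
    min_cluster_size: usize,
    max_cluster_size: Option<usize>,
) -> bool {
    // Groups with fewer members than the minimum are treated as noise.
    if size < min_cluster_size {
        return false;
    }
    // `None` models the default, where the maximum is not considered at all.
    match max_cluster_size {
        Some(max) => size <= max,
        None => true,
    }
}
```

With the default minimum of 5 and no maximum, a group of 4 points is rejected and a group of any larger size is accepted.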
pub fn allow_single_cluster( self, allow_single_cluster: bool ) -> HyperParamBuilder
pub fn min_samples(self, min_samples: usize) -> HyperParamBuilder
Sets min samples. HDBSCAN calculates the core distance of each point as a first step in clustering. The core distance of a point is the distance to its k-th nearest neighbour, found using a nearest neighbours algorithm, where k = min_samples. Defaults to min_cluster_size.
§Parameters
- min_samples - the number of neighbourhood points considered in distances
§Returns
- the hyper parameter configuration builder
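To make the role of min_samples concrete, here is a minimal sketch of a core distance computation. It is not the crate's implementation: `euclidean` and `core_distances` are hypothetical names, the metric is fixed to Euclidean, and the sketch assumes a point is not counted as its own neighbour (implementations differ on this).

```rust
/// Euclidean distance between two points of equal dimensionality.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// Core distance of every point: the distance to its k-th nearest neighbour.
/// Assumption: the point itself is excluded from its own neighbour list.
fn core_distances(data: &[Vec<f64>], k: usize) -> Vec<f64> {
    // k is 1-based and must leave at least k other points to choose from.
    assert!(k >= 1 && k < data.len(), "k must be in 1..data.len()");
    data.iter()
        .enumerate()
        .map(|(i, p)| {
            // Distances from point i to every other point.
            let mut dists: Vec<f64> = data
                .iter()
                .enumerate()
                .filter(|(j, _)| *j != i)
                .map(|(_, q)| euclidean(p, q))
                .collect();
            dists.sort_by(|a, b| a.total_cmp(b));
            dists[k - 1]
        })
        .collect()
}
```

A larger k looks further out from each point, so core distances grow and sparse regions are more readily treated as noise.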
pub fn dist_metric(self, dist_metric: DistanceMetric) -> HyperParamBuilder
pub fn nn_algorithm(self, nn_algorithm: NnAlgorithm) -> HyperParamBuilder
Sets the nearest neighbour algorithm. Internally, HDBSCAN calculates a density measure called core distances, which is defined as the distance of a data point to its k-th (min_samples-th) neighbour. The primary reason for changing this parameter is performance. For example, using BruteForce involves computing a distance matrix between all data points. This works fine on small datasets, but scales poorly to larger ones. Defaults to Auto, whereby the nearest neighbour algorithm will be chosen internally based on the size and dimensionality of the input data.
§Returns
- the hyper parameter configuration builder
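The scaling concern with a brute-force approach can be seen in a small sketch of a full pairwise distance matrix. This is an illustration under stated assumptions, not the crate's code: `distance_matrix` is a hypothetical name and the metric is fixed to Euclidean.

```rust
/// Full pairwise Euclidean distance matrix: n rows of n entries each.
/// Both time and memory grow quadratically with the number of points.
fn distance_matrix(data: &[Vec<f64>]) -> Vec<Vec<f64>> {
    data.iter()
        .map(|p| {
            data.iter()
                .map(|q| {
                    p.iter()
                        .zip(q)
                        .map(|(x, y)| (x - y).powi(2))
                        .sum::<f64>()
                        .sqrt()
                })
                .collect()
        })
        .collect()
}
```

For n points the matrix holds n * n values, so 100,000 points stored as `f64` would already need roughly 80 GB, which is why a tree-based neighbour search is usually preferred on larger inputs.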
pub fn build(self) -> HdbscanHyperParams
Finishes the building of the hyper parameter configuration. A call to this method is required to exit the builder pattern and complete the construction of the hyper parameters.
§Returns
- The completed HDBSCAN hyper parameter configuration.
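Every setter above takes `self` by value and returns the builder, which is what allows calls to be chained and ended with `build`. The following self-contained sketch mirrors that pattern with stand-in types. `SketchBuilder` and `SketchParams` are hypothetical and cover only three of the parameters; the defaults shown (minimum cluster size of 5, no maximum, min samples falling back to the minimum cluster size) follow the descriptions on this page, while resolving the min samples default at `build` time is an assumption.

```rust
/// Stand-in for the finished configuration (hypothetical, simplified).
#[derive(Debug, PartialEq)]
struct SketchParams {
    min_cluster_size: usize,
    max_cluster_size: Option<usize>,
    min_samples: usize,
}

/// Stand-in for the builder (hypothetical, simplified).
struct SketchBuilder {
    min_cluster_size: usize,
    max_cluster_size: Option<usize>,
    min_samples: Option<usize>,
}

impl SketchBuilder {
    /// Starts from the documented defaults.
    fn new() -> Self {
        SketchBuilder {
            min_cluster_size: 5,
            max_cluster_size: None,
            min_samples: None,
        }
    }

    /// Each setter consumes the builder and hands it back for chaining.
    fn min_cluster_size(mut self, value: usize) -> Self {
        self.min_cluster_size = value;
        self
    }

    fn max_cluster_size(mut self, value: usize) -> Self {
        self.max_cluster_size = Some(value);
        self
    }

    fn min_samples(mut self, value: usize) -> Self {
        self.min_samples = Some(value);
        self
    }

    /// Consumes the builder and resolves any defaults that were left unset.
    fn build(self) -> SketchParams {
        SketchParams {
            min_cluster_size: self.min_cluster_size,
            max_cluster_size: self.max_cluster_size,
            // Assumption: an unset min_samples falls back at build time.
            min_samples: self.min_samples.unwrap_or(self.min_cluster_size),
        }
    }
}
```

A chain such as `SketchBuilder::new().min_cluster_size(10).build()` yields a configuration whose `min_samples` is also 10, because it was never set explicitly.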