[−][src]Struct linfa_clustering::AppxDbscan

pub struct AppxDbscan;

DBSCAN (Density-based Spatial Clustering of Applications with Noise) clusters together neighbouring points, while points in sparse regions are labelled as noise. Since points may be part of a cluster or noise the transform method returns Array1<Option<usize>>. It should be noted that some "border" points may technically belong to more than one cluster but, since the transform function returns only one label per point (if any), then only one cluster is chosen arbitrarily for those points.

As DBSCAN groups together points in dense regions, the number of clusters is determined by the dataset and distance tolerance not the user.

This is an implementation of the approximated version of DBSCAN with complexity O(N). Additional information can be found in this paper. Beware of the hidden constant O((1/slack)^dimensionality)of the complexity for very small values ofslackand high dimensionalities. The approximated version of the DBSCAN converges to the exact DBSCAN result for aslackthat goes to zero. Since only thetranformmethod is provided and border points are not assigned deterministically, it may happen that the two results still differ (in terms of border and noise points) for very small values ofslack`.

The algorithm

Let d be the the number of features of each point:

The d-dimensional space is divided in a grid of cells, each containing the points of the dataset that fall inside of them.
All points that have at least min_points points in their neighbourhood of size tolerance are labeled as "core points". Every cell containing at least one "core point" is labeled as a "core-cell".
All "core cells" are set as vertexes of a graph and an arc is built between any two "core cells" according to the following rules:
- If there are two points in the two cells that have distance between them at most tolerance then the arc is added to the graph;
- If there are two points in the two cells that have distance between them that is between tolerance and tolerance * (1 + slack) then the arc is added arbitrarily to the graph;
- Otherwise the arc is not added;
Every connected component of the graph is a cluster and every core point in the CC is given the label of that cluster.
For all the non-core points in the cells we search for any neighbouring core point (with the same arbirtary rule used to create the arcs) belonging to a cluster and, if at least one is found, then the non-core point is given the label of the cluster of the core-point. If no such core point is found then the point is a "noise" point and it is given a label of None.

Tutorial

Let's do a walkthrough of an example running the approximated DBSCAN on some data.

use linfa_clustering::{AppxDbscan, generate_blobs};
use linfa::traits::Transformer;
use ndarray::{Axis, array, s};
use ndarray_rand::rand::SeedableRng;
use rand_isaac::Isaac64Rng;
use approx::assert_abs_diff_eq;

// Our random number generator, seeded for reproducibility
let seed = 42;
let mut rng = Isaac64Rng::seed_from_u64(seed);

// `expected_centroids` has shape `(n_centroids, n_features)`
// i.e. three points in the 2-dimensional plane
let expected_centroids = array![[0., 1.], [-10., 20.], [-1., 10.]];
// Let's generate a synthetic dataset: three blobs of observations
// (100 points each) centered around our `expected_centroids`
let observations = generate_blobs(100, &expected_centroids, &mut rng);

// Let's configure and run our AppxDbscan algorithm
// We use the builder pattern to specify the hyperparameters
// `min_points` is the only mandatory parameter.
// If you don't specify the others (e.g. `tolerance`, `slack`)
// default values will be used.
let min_points = 3;
let params = AppxDbscan::params(min_points).tolerance(1e-2).slack(1e-3).build();
// Let's run the algorithm!
let labels = params.transform(&observations);
// Points are `None` if noise `Some(id)` if belonging to a cluster.

Implementations

`impl AppxDbscan`[src]

`pub fn params<F: Float>(min_points: usize) -> AppxDbscanHyperParamsBuilder<F>`[src]

Trait Implementations

`impl Clone for AppxDbscan`[src]

`pub fn clone(&self) -> AppxDbscan`[src]

`pub fn clone_from(&mut self, source: &Self)`1.0.0[src]

`impl Debug for AppxDbscan`[src]

`pub fn fmt(&self, f: &mut Formatter<'_>) -> Result`[src]

`impl PartialEq<AppxDbscan> for AppxDbscan`[src]

`pub fn eq(&self, other: &AppxDbscan) -> bool`[src]

`#[must_use]pub fn ne(&self, other: &Rhs) -> bool`1.0.0[src]

`impl StructuralPartialEq for AppxDbscan`[src]

Auto Trait Implementations

`impl RefUnwindSafe for AppxDbscan`[src]

`impl Send for AppxDbscan`[src]

`impl Sync for AppxDbscan`[src]

`impl Unpin for AppxDbscan`[src]

`impl UnwindSafe for AppxDbscan`[src]

Blanket Implementations

`impl<T> Any for T where T: 'static + ?Sized,` [src]

`pub fn type_id(&self) -> TypeId`[src]

`impl<T> Borrow<T> for T where T: ?Sized,` [src]

`pub fn borrow(&self) -> &T`[src]

`impl<T> BorrowMut<T> for T where T: ?Sized,` [src]

`pub fn borrow_mut(&mut self) -> &mut T`[src]

`impl<T> From<T> for T`[src]

`pub fn from(t: T) -> T`[src]

`impl<T, U> Into for T where U: From<T>,` [src]

`pub fn into(self) -> U`[src]

`impl<T> Pointable for T`

`pub const ALIGN: usize`

`type Init = T`

The type for initializers.

`pub unsafe fn init(init: <T as Pointable>::Init) -> usize`

`pub unsafe fn deref<'a>(ptr: usize) -> &'a T`

`pub unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T`

`pub unsafe fn drop(ptr: usize)`

`impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>,`

`pub fn to_subset(&self) -> Option<SS>`

`pub fn is_in_subset(&self) -> bool`

`pub unsafe fn to_subset_unchecked(&self) -> SS`

`pub fn from_subset(element: &SS) -> SP`

`impl<T> ToOwned for T where T: Clone,` [src]

`type Owned = T`

The resulting type after obtaining ownership.

`pub fn to_owned(&self) -> T`[src]

`pub fn clone_into(&self, target: &mut T)`[src]

`impl<T, U> TryFrom for T where U: Into<T>,` [src]

`type Error = Infallible`

The type returned in the event of a conversion error.

`pub fn try_from(value: U) -> Result<T, <T as TryFrom>::Error>`[src]

`impl<T, U> TryInto for T where U: TryFrom<T>,` [src]

`type Error = >::Error`

The type returned in the event of a conversion error.

`pub fn try_into(self) -> Result<U, >::Error>`[src]

`impl<V, T> VZip<V> for T where V: MultiLane<T>,`

`pub fn vzip(self) -> V`

[−][src]Struct linfa_clustering::AppxDbscan

The algorithm

Tutorial

Implementations

impl AppxDbscan[src]

pub fn params<F: Float>(min_points: usize) -> AppxDbscanHyperParamsBuilder<F>[src]

Trait Implementations

impl Clone for AppxDbscan[src]

pub fn clone(&self) -> AppxDbscan[src]

pub fn clone_from(&mut self, source: &Self)1.0.0[src]

impl Debug for AppxDbscan[src]

pub fn fmt(&self, f: &mut Formatter<'_>) -> Result[src]

impl PartialEq<AppxDbscan> for AppxDbscan[src]

pub fn eq(&self, other: &AppxDbscan) -> bool[src]

#[must_use]pub fn ne(&self, other: &Rhs) -> bool1.0.0[src]

impl StructuralPartialEq for AppxDbscan[src]

Auto Trait Implementations

impl RefUnwindSafe for AppxDbscan[src]

impl Send for AppxDbscan[src]

impl Sync for AppxDbscan[src]

impl Unpin for AppxDbscan[src]

impl UnwindSafe for AppxDbscan[src]

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized, [src]

pub fn type_id(&self) -> TypeId[src]

impl<T> Borrow<T> for T where T: ?Sized, [src]

pub fn borrow(&self) -> &T[src]

impl<T> BorrowMut<T> for T where T: ?Sized, [src]

pub fn borrow_mut(&mut self) -> &mut T[src]

impl<T> From<T> for T[src]

pub fn from(t: T) -> T[src]

impl<T, U> Into<U> for T where U: From<T>, [src]

pub fn into(self) -> U[src]

impl<T> Pointable for T

pub const ALIGN: usize

type Init = T

pub unsafe fn init(init: <T as Pointable>::Init) -> usize

pub unsafe fn deref<'a>(ptr: usize) -> &'a T

pub unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

pub unsafe fn drop(ptr: usize)

impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>,

pub fn to_subset(&self) -> Option<SS>

pub fn is_in_subset(&self) -> bool

pub unsafe fn to_subset_unchecked(&self) -> SS

pub fn from_subset(element: &SS) -> SP

impl<T> ToOwned for T where T: Clone, [src]

type Owned = T

pub fn to_owned(&self) -> T[src]

pub fn clone_into(&self, target: &mut T)[src]

impl<T, U> TryFrom<U> for T where U: Into<T>, [src]

type Error = Infallible

pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>[src]

impl<T, U> TryInto<U> for T where U: TryFrom<T>, [src]

type Error = <U as TryFrom<T>>::Error

pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>[src]

impl<V, T> VZip<V> for T where V: MultiLane<T>,

pub fn vzip(self) -> V