pub struct Optics<P: Points>{ /* private fields */ }Expand description
Clustering via the OPTICS algorithm[1].
[OPTICS] is an algorithm for finding density-based clusters in spatial data. […] Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN’s major weaknesses: the problem of detecting meaningful clusters in data of varying density.wikipedia
An instance of Optics represents the dendrogram that OPTICS
computes for a data set. Once computed, this dendrogram can then
be queried for clustering structure, for example, a clustering
similar to the one that would be computed by DBSCAN can be
retrieved with dbscan_clustering.
This uses the P::Point yielded by the iterators provided by
ListPoints and RegionQuery as a unique identifier for each
point. The algorithm will behave strangely if the identifier is
not unique or not stable within a given execution of OPTICS. The
identifier is cloned several times in the course of execution, so
it should be cheap to duplicate (e.g. a usize index, or a &T
reference).
[1]: Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander (1999). OPTICS: Ordering Points To Identify the Clustering Structure. ACM SIGMOD international conference on Management of data. ACM Press. pp. 49–60.
§Examples
A basic example:
use cogset::{Optics, BruteScan, Euclid};
let points = [Euclid([0.1]), Euclid([0.2]), Euclid([1.0])];
let scanner = BruteScan::new(&points);
let optics = Optics::new(scanner, 0.2, 2);
// use the same epsilon that OPTICS used
let mut clustering = optics.dbscan_clustering(0.2);
// get the clusters themselves
let mut clusters = clustering.by_ref().collect::<Vec<_>>();
// the first two points are the only cluster
assert_eq!(clusters, &[&[0, 1]]);
// now the noise, which is just the last point
assert_eq!(clustering.noise_points(), &[2]);
// cluster again, with a much smaller epsilon
let mut clustering = optics.dbscan_clustering(0.05);
// get the clusters themselves
let mut clusters = clustering.by_ref().collect::<Vec<_>>();
// no clusters (its less than the smallest distance between points)
assert!(clusters.is_empty());
// everything is noise
assert_eq!(clustering.noise_points(), &[0, 1, 2]);Implementations§
Source§impl<P: RegionQuery + ListPoints> Optics<P>
impl<P: RegionQuery + ListPoints> Optics<P>
Sourcepub fn new(points: P, eps: f64, min_pts: usize) -> Optics<P>
pub fn new(points: P, eps: f64, min_pts: usize) -> Optics<P>
Run the OPTICS algorithm on the index points, with the eps
and min_pts parameters.
The return value can be queried for the actual clustering
structure using, for example, dbscan_clustering. The
parameter eps is used as a performance enhancement, and
should be made as small as possible for the use-case.
NB. this computes the clustering dendrogram immediately,
unlike Dbscan’s laziness.
Sourcepub fn dbscan_clustering<'a>(
&'a self,
eps: f64,
) -> OpticsDbscanClustering<'a, P> ⓘ
pub fn dbscan_clustering<'a>( &'a self, eps: f64, ) -> OpticsDbscanClustering<'a, P> ⓘ
Extract a clustering like one that DBSCAN would give.
The returned type is similar to the Dbscan type: an iterator
over the clusters (as vectors of points), along with the
noise_points method to retrieve unclustered points.
§Panics
eps must be less than the eps passed to new.