pub struct EmbedderParams {
pub asked_dim: usize,
pub dmap_init: bool,
pub beta: f64,
pub b: f64,
pub scale_rho: f64,
pub grad_step: f64,
pub nb_sampling_by_edge: usize,
pub nb_grad_batch: usize,
pub grad_factor: usize,
pub hierarchy_layer: usize,
pub hubness_weighting: bool,
}
It is necessary to describe briefly the model used in the embedding:
§Definition of the weight of an edge of the graph to embed
First we define the local scale $\rho$ around a point.
It is defined as the mean of the distances of points to their nearest neighbour.
The points taken into account to define $\rho$ are the node under consideration and
all its knbn neighbours, so the mean of distances to nearest neighbours is computed
over the knbn + 1 points around the current point.
Let $(d_{i})$, for i=0..k, be the distances from a node n to its neighbours, sorted in increasing order. The weight of the i-th edge is $$w_{i} = \exp\left(- \left(\frac{d_{i} - d_{0}}{S * \rho}\right)^{\beta} \right)$$
S is a scale factor modulating $\rho$. The weights are then normalized to a probability distribution.
Before normalization $w_{0}$ is thus always equal to 1. Increasing β to 2 makes the weight $w_{i}$ decrease faster, so after normalization the range of weights from $w_{0}$ to $w_{k}$ is larger. Reducing S has a similar effect, but adjusting both $\beta$ and the scale must not violate the range constraint on weights: the least weight of an edge must not go under $10^{-5}$, which limits the range of weights and avoids numerical difficulties in the Svd. The code stops with an error if this constraint is violated.
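The weight definition above can be sketched in a few lines of standalone Rust. This is only an illustration under stated assumptions, not the crate's code: the function names `local_scale`, `edge_weights` and `normalize_checked` are hypothetical, and applying the $10^{-5}$ check to the weights before normalization is an assumption, since the text does not say at which stage the check happens.

```rust
/// Local scale rho: mean of the nearest-neighbour distances of the node
/// and of its neighbours (one distance per point, knbn + 1 values in total).
/// Hypothetical helper, not part of the annembed API.
fn local_scale(nearest_dists: &[f64]) -> f64 {
    nearest_dists.iter().sum::<f64>() / nearest_dists.len() as f64
}

/// Unnormalized weights w_i = exp(-((d_i - d_0) / (S * rho))^beta).
/// `sorted_dists` must be non-empty and sorted in increasing order.
fn edge_weights(sorted_dists: &[f64], rho: f64, scale: f64, beta: f64) -> Vec<f64> {
    let d0 = sorted_dists[0];
    sorted_dists
        .iter()
        .map(|&d| (-((d - d0) / (scale * rho)).powf(beta)).exp())
        .collect()
}

/// Normalizes the weights to a probability distribution.
/// Assumption: the lower bound is checked on the unnormalized weights.
fn normalize_checked(weights: &[f64], min_weight: f64) -> Result<Vec<f64>, String> {
    let least = weights.iter().copied().fold(f64::INFINITY, f64::min);
    if least < min_weight {
        return Err(format!("least weight {least:e} is below {min_weight:e}"));
    }
    let sum: f64 = weights.iter().sum();
    Ok(weights.iter().map(|w| w / sum).collect())
}
```

With sorted distances `[1.0, 2.0, 3.0]`, $\rho = 2$, $S = 0.5$ and $\beta = 1$, the unnormalized weights are $1$, $e^{-1}$ and $e^{-2}$. A very distant neighbour drives the least weight under the bound and `normalize_checked` returns an error.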
Note that setting the scale as described above and renormalizing to get a probability distribution
gives a perplexity nearly equal to the number of neighbours.
This can be verified with the logging (implemented using the crates env_logger and log) by setting
RUST_LOG=annembed=INFO in your environment.
Quantile summaries are then given for the distributions of edge distances, edge weights, and perplexity
of nodes. This helps in adjusting the parameters β and S and shows their impact on these quantiles.
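For reference, the perplexity of a node can be computed from its normalized edge weights as the exponential of the Shannon entropy. The sketch below assumes this standard definition; the function name `perplexity` is hypothetical and the crate's own computation may differ in detail.

```rust
/// Perplexity of a discrete probability distribution: exp(H(p)),
/// with H(p) = -sum_i p_i * ln(p_i) (terms with p_i = 0 contribute 0).
fn perplexity(probs: &[f64]) -> f64 {
    let entropy: f64 = probs
        .iter()
        .map(|&p| if p > 0.0 { -p * p.ln() } else { 0.0 })
        .sum();
    entropy.exp()
}
```

A uniform distribution over k neighbours has perplexity exactly k, which is the upper bound; the more the weights decay with distance, the further the perplexity drops below k.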
Default values:
$\beta = 1$, so that we have exponential weights similar to Umap.
$S = 0.5$
It is also possible to set β to 2 to get more Gaussian weights, or to reduce it to 0.5, adjusting S to respect the constraints on edge weights.
§Definition of the weight of an edge of the embedded graph
The embedded edge has the usual expression: $$ w(x,y) = \frac{1}{1 + \left\| \frac{x - y}{a_{x}} \right\|^{2b}} $$
By default b = 1. The coefficient $a_{x}$ is deduced from the scale coefficient in the original space, with some restrictions to avoid too large fluctuations.
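The expression of the embedded weight translates directly into code. The following is a minimal sketch; the function name `embedded_weight` is hypothetical and $a_{x}$ is passed in as a plain number, since how the crate derives it is not detailed here.

```rust
/// Embedded edge weight w(x, y) = 1 / (1 + ||(x - y) / a_x||^(2b)).
/// `x` and `y` are embedded coordinates of equal dimension.
fn embedded_weight(x: &[f64], y: &[f64], a_x: f64, b: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "points must have the same dimension");
    // Squared Euclidean norm of the rescaled difference.
    let sq_norm: f64 = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| ((xi - yi) / a_x).powi(2))
        .sum();
    // ||v||^(2b) is (||v||^2)^b.
    1.0 / (1.0 + sq_norm.powf(b))
}
```

The weight is 1 for coincident points and decreases towards 0 as the points move apart. A larger $a_{x}$ slows this decay, a larger b sharpens it.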
- Initial step of the gradient and number of batches
For the Mnist digits data, a number of batches around 10-20 seems sufficient.
The initial gradient step $\gamma_{0}$ can be chosen around 1 (in the range 1/5 … 5).
Reasonably, it should satisfy nb_batch $* \gamma_{0} < 1$
- asked_dimension: default is set to 2.
§The optimization of the embedding
The embedding is optimized by minimizing the cross entropy (currently the Shannon one) between the distributions of original and embedded edge weights. This minimization is done by a standard (multithreaded) stochastic gradient with negative sampling for the unobserved edges (see Mnih-Teh or Mikolov).
The number of negative edges sampled is set to a fixed value of 5.
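As a reminder of the quantity involved, the Shannon cross entropy between two discrete distributions p and q is $-\sum_i p_i \ln q_i$. The sketch below only illustrates this generic definition; it is not the crate's loss function, which also involves negative sampling, and the name `cross_entropy` is hypothetical.

```rust
/// Shannon cross entropy H(p, q) = -sum_i p_i * ln(q_i).
/// Terms with p_i = 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
fn cross_entropy(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len(), "distributions must have the same length");
    let mut h = 0.0;
    for (&pi, &qi) in p.iter().zip(q.iter()) {
        if pi > 0.0 {
            h -= pi * qi.ln();
        }
    }
    h
}
```

For a fixed p, the cross entropy is minimal when q equals p, where it reduces to the entropy of p. This is why decreasing it pulls the embedded weights towards the original ones.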
- expression of the gradient
Here are the main parameters driving the embedding.
Fields§
asked_dim: usize
embedding dimension: default to 2
dmap_init: bool
defines if embedder is initialized by a diffusion map step. default to true
beta: f64
exponent used in defining edge weight in original graph. 0.5 or 1.
b: f64
exponent used in embedded space, default 1.
scale_rho: f64
embedded scale factor. default to 1.
grad_step: f64
initial gradient step, default to 2.
nb_sampling_by_edge: usize
nb sampling by edge in gradient step. default = 10
nb_grad_batch: usize
number of gradient batch. default to 15
grad_factor: usize
the number of gradient batch in hierarchical case is nb_grad_batch multiplied by grad_factor. As the first iterations run on few points we can do more iterations. Default is 4.
hierarchy_layer: usize
if layer > 0 means we have hierarchical initialization
hubness_weighting: bool
To do negative sampling of nodes using hubness weights as node distribution, set it to true.
Default is false.
It slightly improves the quality as estimated by the quality estimator.
Implementations§
impl EmbedderParams
pub fn default() -> Self
pub fn log(&self)
pub fn set_dmap_init(&mut self, val: bool)
set to false if random initialization is preferred
pub fn set_nb_gradient_batch(&mut self, nb_batch: usize)
set the number of gradient batch. At each batch each edge is sampled nb_sampling_by_edge times. default to 20
pub fn set_nb_edge_sampling(&mut self, nb_sample_by_edge: usize)
sets the number of times each edge should be sampled in a gradient batch. Default to 10
pub fn get_dimension(&self) -> usize
get asked embedding dimension
pub fn set_hierarchy_layer(&mut self, layer: usize)
pub fn get_hierarchy_layer(&self) -> usize
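The following sketch shows how these methods are typically combined. To stay self-contained it uses a local stand-in struct that mirrors the fields and setters listed on this page rather than the real annembed type. The default values are the ones given in the field descriptions, `hierarchy_layer = 0` is an assumption, and `hierarchical_batches` is a hypothetical helper illustrating the role of grad_factor.

```rust
/// Local stand-in mirroring the documented fields; NOT the annembed type itself.
#[derive(Clone, Copy, Debug)]
pub struct EmbedderParams {
    pub asked_dim: usize,
    pub dmap_init: bool,
    pub beta: f64,
    pub b: f64,
    pub scale_rho: f64,
    pub grad_step: f64,
    pub nb_sampling_by_edge: usize,
    pub nb_grad_batch: usize,
    pub grad_factor: usize,
    pub hierarchy_layer: usize,
    pub hubness_weighting: bool,
}

impl Default for EmbedderParams {
    /// Defaults as stated in the field descriptions (hierarchy_layer = 0 is assumed).
    fn default() -> Self {
        EmbedderParams {
            asked_dim: 2,
            dmap_init: true,
            beta: 1.0,
            b: 1.0,
            scale_rho: 1.0,
            grad_step: 2.0,
            nb_sampling_by_edge: 10,
            nb_grad_batch: 15,
            grad_factor: 4,
            hierarchy_layer: 0,
            hubness_weighting: false,
        }
    }
}

impl EmbedderParams {
    pub fn set_dmap_init(&mut self, val: bool) {
        self.dmap_init = val;
    }
    pub fn set_nb_gradient_batch(&mut self, nb_batch: usize) {
        self.nb_grad_batch = nb_batch;
    }
    pub fn set_nb_edge_sampling(&mut self, nb_sample_by_edge: usize) {
        self.nb_sampling_by_edge = nb_sample_by_edge;
    }
    pub fn get_dimension(&self) -> usize {
        self.asked_dim
    }
    pub fn set_hierarchy_layer(&mut self, layer: usize) {
        self.hierarchy_layer = layer;
    }
    pub fn get_hierarchy_layer(&self) -> usize {
        self.hierarchy_layer
    }
    /// Hypothetical helper: number of gradient batches in the hierarchical case.
    pub fn hierarchical_batches(&self) -> usize {
        self.nb_grad_batch * self.grad_factor
    }
}
```

A typical sequence starts from `EmbedderParams::default()`, then calls for instance `set_nb_gradient_batch(20)` and `set_hierarchy_layer(1)` before passing the parameters to the embedder. With 20 batches and the default grad_factor of 4, the hierarchical case would run 80 batches.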
Trait Implementations§
impl Clone for EmbedderParams
fn clone(&self) -> EmbedderParams
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Copy for EmbedderParams
Auto Trait Implementations§
impl Freeze for EmbedderParams
impl RefUnwindSafe for EmbedderParams
impl Send for EmbedderParams
impl Sync for EmbedderParams
impl Unpin for EmbedderParams
impl UnsafeUnpin for EmbedderParams
impl UnwindSafe for EmbedderParams
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> FmtForward for T
fn fmt_binary(self) -> FmtBinary<Self> where Self: Binary
Causes self to use its Binary implementation when Debug-formatted.
fn fmt_display(self) -> FmtDisplay<Self> where Self: Display
Causes self to use its Display implementation when Debug-formatted.
fn fmt_lower_exp(self) -> FmtLowerExp<Self> where Self: LowerExp
Causes self to use its LowerExp implementation when Debug-formatted.
fn fmt_lower_hex(self) -> FmtLowerHex<Self> where Self: LowerHex
Causes self to use its LowerHex implementation when Debug-formatted.
fn fmt_octal(self) -> FmtOctal<Self> where Self: Octal
Causes self to use its Octal implementation when Debug-formatted.
fn fmt_pointer(self) -> FmtPointer<Self> where Self: Pointer
Causes self to use its Pointer implementation when Debug-formatted.
fn fmt_upper_exp(self) -> FmtUpperExp<Self> where Self: UpperExp
Causes self to use its UpperExp implementation when Debug-formatted.
fn fmt_upper_hex(self) -> FmtUpperHex<Self> where Self: UpperHex
Causes self to use its UpperHex implementation when Debug-formatted.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pipe for T where T: ?Sized
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R where Self: Sized
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R where R: 'a
Borrows self and passes that borrow into the pipe function.
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R where R: 'a
Mutably borrows self and passes that borrow into the pipe function.
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow_mut<'a, B, R>(&'a mut self, func: impl FnOnce(&'a mut B) -> R) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
Borrows self, then passes self.as_ref() into the pipe function.
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
Mutably borrows self, then passes self.as_mut() into the pipe function.
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
Borrows self, then passes self.deref() into the pipe function.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
unsafe fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.
impl<T> Tap for T
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Immutable access to the Borrow<B> of a value.
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
Mutable access to the BorrowMut<B> of a value.
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
Immutable access to the AsRef<R> view of a value.
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
Mutable access to the AsMut<R> view of a value.
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Immutable access to the Deref::Target of a value.
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Mutable access to the Deref::Target of a value.
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
Calls .tap() only in debug builds, and is erased in release builds.
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
Calls .tap_mut() only in debug builds, and is erased in release builds.
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
Calls .tap_borrow() only in debug builds, and is erased in release builds.
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
Calls .tap_ref() only in debug builds, and is erased in release builds.
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
Calls .tap_deref() only in debug builds, and is erased in release builds.