Module anndensity

The purpose of this module is to evaluate how well an embedding conserves structural properties.
We analyze how distances inside blocks of the stable decomposition behave after embedding.

We construct an approximate nearest neighbour (Ann) graph on the embedded data and analyze the distances between neighbours. In particular, for each node we consider its nbmax nearest neighbours, where nbmax is the minimum of its degree in the original graph and the number of neighbours returned by the Ann.
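As a minimal sketch of the nbmax rule described above (not the crate's actual code), the truncation could look as follows. The function names `nbmax` and `truncate_neighbours`, the `(usize, f32)` pair layout, and the assumption that Ann neighbours arrive sorted by increasing distance are all hypothetical choices made for illustration.

```rust
/// Hypothetical helper: number of neighbours to inspect for one node.
/// It is the smaller of the node degree in the original graph and the
/// number of neighbours returned by the Ann search.
fn nbmax(degree_in_graph: usize, nb_ann_neighbours: usize) -> usize {
    degree_in_graph.min(nb_ann_neighbours)
}

/// Hypothetical helper: keeps the `nbmax` closest Ann neighbours of a node.
/// `ann_neighbours` holds (neighbour rank, distance) pairs and is assumed
/// to be sorted by increasing distance.
fn truncate_neighbours(
    degree_in_graph: usize,
    ann_neighbours: &[(usize, f32)],
) -> &[(usize, f32)] {
    let n = nbmax(degree_in_graph, ann_neighbours.len());
    &ann_neighbours[..n]
}
```

With this rule, a node of degree 2 for which the Ann returns 3 neighbours contributes only its 2 closest neighbours to the statistics, so a node is never compared on more neighbours than it had edges in the original graph.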

We then compute the mean distance between a node and its neighbours, separately for neighbours inside the same block and for neighbours outside the block. We also compute the distribution of the blocks of the neighbours and compare it, using the Kullback-Leibler divergence, with the corresponding distribution in the original graph.
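The two quantities above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: the names `mean_distances` and `kl_divergence`, the edge representation as `(node, neighbour, distance)` triples, and the `block_of` lookup slice are hypothetical. The divergence uses the standard definition KL(p || q) = sum over i of p_i ln(p_i / q_i), skipping terms with p_i = 0 and assuming q_i > 0 wherever p_i > 0.

```rust
/// Hypothetical helper: mean edge length inside a block and across blocks.
/// `edges` holds (node, neighbour, distance) triples and `block_of[i]` is
/// the block of node `i`. A mean is `None` when no edge of that kind exists.
fn mean_distances(
    edges: &[(usize, usize, f64)],
    block_of: &[usize],
) -> (Option<f64>, Option<f64>) {
    let mut sum_in = 0.0_f64;
    let mut sum_out = 0.0_f64;
    let mut n_in = 0usize;
    let mut n_out = 0usize;
    for &(i, j, d) in edges {
        if block_of[i] == block_of[j] {
            sum_in += d;
            n_in += 1;
        } else {
            sum_out += d;
            n_out += 1;
        }
    }
    let mean = |s: f64, n: usize| if n > 0 { Some(s / n as f64) } else { None };
    (mean(sum_in, n_in), mean(sum_out, n_out))
}

/// Kullback-Leibler divergence KL(p || q) between two discrete distributions
/// given as probability vectors of equal length.
/// Terms with p_i == 0 contribute nothing; q_i is assumed > 0 when p_i > 0.
fn kl_divergence(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len());
    let mut kl = 0.0_f64;
    for (&pi, &qi) in p.iter().zip(q.iter()) {
        if pi > 0.0 {
            kl += pi * (pi / qi).ln();
        }
    }
    kl
}
```

Here `p` would be the block distribution of the neighbours found after embedding and `q` the one observed in the original graph; a divergence close to zero suggests the embedding preserved the block membership of neighbours.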

The entry point for this validation is the function density_analysis, which takes as arguments the original graph and the embedded data. Optional arguments give finer control over the Ann and the density decomposition.

Structs§

BlockCheck
This structure collects BlockStat statistics for all blocks.
BlockStat
Gathers statistics for each block obtained from the Ann flathnsw representation, for comparison with those obtained from the original graph representation.
We collect the mean distance inside a block and the mean distance for edges crossing a block boundary. We also collect transition probabilities between blocks.
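One plausible reading of the transition probabilities mentioned above is a row-normalised count of edges going from one block to another. The sketch below makes that assumption explicit; the function name `transition_probabilities` and its arguments are hypothetical and do not reflect the actual fields of BlockStat.

```rust
/// Hypothetical helper: empirical transition probabilities between blocks.
/// `edges` holds (source, target) node pairs, `block_of[i]` is the block of
/// node `i`, and `nb_blocks` is the total number of blocks.
/// Entry [a][b] estimates the probability that an edge leaving block `a`
/// ends in block `b`. Rows of blocks without outgoing edges stay at zero.
fn transition_probabilities(
    edges: &[(usize, usize)],
    block_of: &[usize],
    nb_blocks: usize,
) -> Vec<Vec<f64>> {
    let mut counts = vec![vec![0.0_f64; nb_blocks]; nb_blocks];
    // Count edges by (source block, target block).
    for &(i, j) in edges {
        counts[block_of[i]][block_of[j]] += 1.0;
    }
    // Normalise each row so that it sums to 1 when it is not empty.
    for row in counts.iter_mut() {
        let total: f64 = row.iter().sum();
        if total > 0.0 {
            for c in row.iter_mut() {
                *c /= total;
            }
        }
    }
    counts
}
```

Computing this matrix once from the Ann neighbours and once from the original graph edges would give two sets of rows that can be compared block by block.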

Functions§

density_analysis
A tentative assessment of an embedding, based on a density comparison of edge lengths after embedding.
embeddedtohnsw
Builds the Hnsw structure from the embedded data. In the Hnsw structure, the original nodes of the graph are identified by their NodeIndex or rank in the embedded structure.
(The N type of the graph structure is no longer used at this step.)