pub struct NodeSelection<ExclusionTags: Hash + Eq, Metadata> { /* private fields */ }Expand description
Provides a way to consistently select some weighted bucket (or node) given some value.
The exact implementation is based on the algorithm described in Schindelhauer and Schomaker “Weighted Distributed Hash Tables”. As a result, modifications of weights (or additions and removals of nodes) can be performed with perfect stability and perfect precision of node weighing; subsequently, only the minimum number of keys are remapped to new nodes.
This node does not perform any sort of caching. As a result, it’s best to use this with some form of cache, whose entries are invalidated when a node is added, removed, or has its weight modified. This is because lookups for a specific node are at best, linear, and at worst, quadratic. To be specific, there is always a linear lookup cost for a given key, determined by the number of nodes present. If nodes utilize tag exclusion, then the cost becomes quadratic to determine if a node should be excluded given a set of tags.
This implementation attempts to be performant and concurrent as possible,
using a DashMap and DashSets to provide concurrent access and
rayon to provide parallel lookup, if the feature is enabled. In single
threaded contexts, care should be used to not deadlock with multiple
references.
§Excluding tags
This is slightly different from a standard weighted hash table as it also
implements the ability to allow nodes to exclude themselves from being
selected. This is provided via the ExclusionTag type, and should be an
enum.
Generally speaking, ExclusionTags must at least Hash and Eq. This
is a requirement of storing the ExclusionTag in a Set implementation.
However, if using rayon to allow for parallel lookup, the trait bounds on
ExclusionTags is restricted to types that are also Send and Sync.
This feature is asymptotically expensive, as this requires checking all nodes to see if they exclude the given tags. As a result, this should be used with care, and at best the number of exclusions should be relatively small per node.
Of course, this feature can be disabled if ExclusionTag is the unit type,
which reduces lookup costs to be linear.
Excluded nodes do not contribute to the total sum of weights. For example, if three nodes exist with equal weights, and one node is excluded, then the probability of selecting a node is the same as if only two nodes were present.
In a more complex case, if nodes with weights 1, 2, and 3 are present, and the node with weight 2 is excluded, then the probability of selecting a node becomes 25% and 75% respectively.
Note that if BitAnd can be performed on ExclusionTags in an efficient
way (such as for bitflags), it is highly recommended to use
BitNodeSelection instead.
§Examples
Nodes can be added and removed at any time and subsequent lookups will be immediately reflect the new state.
use std::num::NonZeroUsize;
let selector = NodeSelection::new();
// Add a node with a weight of 1
let node_1 = Node::<(), _>::new(NonZeroUsize::new(1)?, ());
let node_1_id = selector.add(node_1.clone());
// Lookups will always return the first node, since it's the only one.
let looked_up_node = selector.get(b"hello world")?;
assert_eq!(looked_up_node.value(), &node_1);
// Drop the reference ASAP, see `get` docs for more info.
std::mem::drop(looked_up_node);
// Add a node with a weight of 2
let node_2 = Node::<(), _>::new(NonZeroUsize::new(2)?, ());
selector.add(node_2.clone());
// Any caches should be invalidated now, since we changed the state.
// Now there's a 33% chance of selecting the first node, and a 67% chance of
// selecting the second node.
// Do something with the node...
// Now remove the first node
selector.remove(node_1_id);
// Any caches should be invalidated now, since we changed the state.
// Lookups will now always return the second node, since it's the only one.
let looked_up_node = selector.get(b"hello world")?;
assert_eq!(looked_up_node.value(), &node_2);Modifying a weight is pretty ergonomic, but remember that caches should be invalidated after modifying the weights.
use std::num::NonZeroUsize;
let selector = NodeSelection::new();
// Add a node with a weight of 1
let node_1 = Node::<(), _>::new(NonZeroUsize::new(1)?, ());
let node_1_id = selector.add(node_1.clone());
// Add a node with a weight of 2
let node_2 = Node::<(), _>::new(NonZeroUsize::new(2)?, ());
selector.add(node_2.clone());
// Now there's a 33% chance of selecting the first node, and a 67% chance of
// selecting the second node. Lets modify the weight so there's an equal
// chance of selecting either node.
selector.get_mut(node_1_id)?.set_weight(NonZeroUsize::new(1)?);
// Any caches should be invalidated now.If exclusion tags are used, then excluded nodes do not contribute to the sum weights of nodes.
use std::num::NonZeroUsize;
use std::iter::FromIterator;
use dashmap::DashSet;
let selector = NodeSelection::new();
#[derive(Clone, PartialEq, Eq, Hash)]
enum ExclusionTag {
Foo,
}
// Add a 3 nodes, one with an exclusion tag.
let exclusions = DashSet::from_iter([ExclusionTag::Foo]);
let node_1 = Node::new(NonZeroUsize::new(1)?, ());
selector.add(node_1.clone());
let node_2 = Node::with_exclusions(NonZeroUsize::new(1)?, (), exclusions.clone());
selector.add(node_2.clone());
let node_3 = Node::new(NonZeroUsize::new(1)?, ());
selector.add(node_3.clone());
// There's a 33% chance of selecting any node, if no tags are provided.
selector.get(b"hello world")?;
// There's a equal chance of getting node 1 or node 3 since node 2 is
// excluded.
selector.get_with_exclusions(b"hello world", &exclusions)?;Implementations§
Source§impl<ExclusionTags, Metadata> NodeSelection<ExclusionTags, Metadata>
impl<ExclusionTags, Metadata> NodeSelection<ExclusionTags, Metadata>
Sourcepub fn get(
&self,
item: &[u8],
) -> Option<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
pub fn get( &self, item: &[u8], ) -> Option<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
Finds a node that is responsible for the provided item with no
exclusions. This lookup is performed in serial. Consider
par_get that uses rayon to parallelize the
lookup.
§Safety
This method may deadlock if called when holding a mutable reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Sourcepub fn get_with_exclusions(
&self,
item: &[u8],
tags: &DashSet<ExclusionTags>,
) -> Option<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
pub fn get_with_exclusions( &self, item: &[u8], tags: &DashSet<ExclusionTags>, ) -> Option<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
Like get, but allows the caller to provide a list of
tags, which nodes that exclude those tags will be skipped.
§Safety
This method may deadlock if called when holding a mutable reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Sourcepub fn get_n(
&self,
item: &[u8],
n: usize,
) -> Vec<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
pub fn get_n( &self, item: &[u8], n: usize, ) -> Vec<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
Returns the top n nodes that are responsible for the provided item
with no exclusions. This lookup is performed in serial. Consider
par_get_n that uses rayon to parallelize the lookup. If
n is greater than the number of nodes, then all nodes are returned, in
order of score.
§Safety
This method may deadlock if called when holding a mutable reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Sourcepub fn get_n_with_exclusions(
&self,
item: &[u8],
n: usize,
tags: &DashSet<ExclusionTags>,
) -> Vec<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
pub fn get_n_with_exclusions( &self, item: &[u8], n: usize, tags: &DashSet<ExclusionTags>, ) -> Vec<RefMulti<'_, NodeId, Node<ExclusionTags, Metadata>>>
Like get_n, but allows the caller to provide a list of
tags, which nodes that exclude those tags will be skipped.
§Safety
This method may deadlock if called when holding a mutable reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Source§impl<ExclusionTags, Metadata> NodeSelection<ExclusionTags, Metadata>
impl<ExclusionTags, Metadata> NodeSelection<ExclusionTags, Metadata>
Sourcepub fn with_capacity(capacity: usize) -> Self
pub fn with_capacity(capacity: usize) -> Self
Constructs a new $struct_name, ensuring that at least
capacity nodes can be added without re-allocating.
Sourcepub fn add(&self, node: Node<ExclusionTags, Metadata>) -> NodeId
pub fn add(&self, node: Node<ExclusionTags, Metadata>) -> NodeId
Adds a new node to the selection with an opaque ID. If lookups are cached, then callers must invalidate the cache after adding a node. Otherwise, it is possible to have incorrectly distributed selections.
§Safety
This method may deadlock if called when holding any sort of reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
§Panics
This method will panic if the provided node has an id that is
already present in the selection. For the non-panicking version
of this method, see try_add_node.
Sourcepub fn add_with_id(&self, id: NodeId, node: Node<ExclusionTags, Metadata>)
pub fn add_with_id(&self, id: NodeId, node: Node<ExclusionTags, Metadata>)
Adds a new node to the selection with the provided ID. If lookups are cached, then callers must invalidate the cache after adding a node. Otherwise, it is possible to have incorrectly distributed selections.
§Safety
This method may deadlock if called when holding any sort of reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
§Panics
This method will panic if the provided node has an id that is
already present in the selection. For the non-panicking version
of this method, see try_add_node.
Sourcepub fn try_add(
&self,
node: Node<ExclusionTags, Metadata>,
) -> Result<NodeId, DuplicateIdError>
pub fn try_add( &self, node: Node<ExclusionTags, Metadata>, ) -> Result<NodeId, DuplicateIdError>
Tries to add the node to the selection with an opaque ID. If lookups are cached, then callers must invalidate the cache after adding a node. Otherwise, it is possible to have incorrectly distributed selections.
§Safety
This method may deadlock if called when holding any sort of reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
§Errors
This method will fail if the node has an id that is already present in the selection.
Sourcepub fn try_add_with_id(
&self,
id: NodeId,
node: Node<ExclusionTags, Metadata>,
) -> Result<(), DuplicateIdError>
pub fn try_add_with_id( &self, id: NodeId, node: Node<ExclusionTags, Metadata>, ) -> Result<(), DuplicateIdError>
Tries to add the node to the selection with the provided ID. If lookups are cached, then callers must invalidate the cache after adding a node. Otherwise, it is possible to have incorrectly distributed selections.
§Safety
This method may deadlock if called when holding any sort of reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
§Errors
This method will fail if the node has an id that is already present in the selection.
Sourcepub fn get_mut(
&self,
id: NodeId,
) -> Option<RefMut<'_, NodeId, Node<ExclusionTags, Metadata>>>
pub fn get_mut( &self, id: NodeId, ) -> Option<RefMut<'_, NodeId, Node<ExclusionTags, Metadata>>>
Returns a mutable reference to the node given some ID. This is useful for modifying the weight of a node. If lookups are cached, then callers must invalidate the cache after adding a node. Otherwise, it is possible to have incorrectly distributed selections.
§Safety
This method may deadlock if called when holding any sort of reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Sourcepub fn remove(&self, id: NodeId) -> Option<Node<ExclusionTags, Metadata>>
pub fn remove(&self, id: NodeId) -> Option<Node<ExclusionTags, Metadata>>
Removes a node, returning the node if it existed. If lookups are cached, then callers must invalidate the cache after successfully removing a node. Otherwise, it is possible to have incorrectly distributed selections.
§Safety
This method may deadlock if called when holding any sort of reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Sourcepub fn contains(&self, id: NodeId) -> bool
pub fn contains(&self, id: NodeId) -> bool
Returns if the node is selectable from this selector.
§Safety
This method may deadlock if called when holding a mutable reference into the map. Callers must ensure that references are dropped as soon as possible, especially in single threaded contexts.
Trait Implementations§
Source§impl<ExclusionTags: Clone + Hash + Eq, Metadata: Clone> Clone for NodeSelection<ExclusionTags, Metadata>
impl<ExclusionTags: Clone + Hash + Eq, Metadata: Clone> Clone for NodeSelection<ExclusionTags, Metadata>
Source§fn clone(&self) -> NodeSelection<ExclusionTags, Metadata>
fn clone(&self) -> NodeSelection<ExclusionTags, Metadata>
1.0.0 · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read moreSource§impl<ExclusionTags: Debug + Hash + Eq, Metadata: Debug> Debug for NodeSelection<ExclusionTags, Metadata>
impl<ExclusionTags: Debug + Hash + Eq, Metadata: Debug> Debug for NodeSelection<ExclusionTags, Metadata>
Source§impl<ExclusionTags, Metadata> Default for NodeSelection<ExclusionTags, Metadata>
impl<ExclusionTags, Metadata> Default for NodeSelection<ExclusionTags, Metadata>
Source§impl<ExclusionTags, Metadata> Extend<(NodeId, Node<ExclusionTags, Metadata>)> for NodeSelection<ExclusionTags, Metadata>
impl<ExclusionTags, Metadata> Extend<(NodeId, Node<ExclusionTags, Metadata>)> for NodeSelection<ExclusionTags, Metadata>
Source§fn extend<T: IntoIterator<Item = (NodeId, Node<ExclusionTags, Metadata>)>>(
&mut self,
iter: T,
)
fn extend<T: IntoIterator<Item = (NodeId, Node<ExclusionTags, Metadata>)>>( &mut self, iter: T, )
Source§fn extend_one(&mut self, item: A)
fn extend_one(&mut self, item: A)
extend_one)Source§fn extend_reserve(&mut self, additional: usize)
fn extend_reserve(&mut self, additional: usize)
extend_one)