pub struct ArrowSpaceBuilder {
    pub prebuilt_spectral: bool,
    pub sampling: Option<SamplerType>,
    /* private fields */
}
Fields§
prebuilt_spectral: bool
sampling: Option<SamplerType>
Implementations§
impl ArrowSpaceBuilder
pub fn new() -> Self
pub fn with_lambda_graph(
    self,
    eps: f64,
    k: usize,
    topk: usize,
    p: f64,
    sigma_override: Option<f64>,
) -> Self
Sets the λτ-graph parameters; if this method is not called, the defaults are used. Configures the base λτ-graph built from the provided data matrix:
- eps: threshold for |Δλ| on items
- k: optional cap on neighbors per item
- p: weight kernel exponent
- sigma_override: optional scale σ for the kernel (default = eps)
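The parameters above can be illustrated with a minimal, standalone sketch. It is not the crate's implementation: the function name lambda_edges, the adjacency-list return type, and in particular the kernel w = 1 / (1 + (|Δλ| / σ)^p) are assumptions made for illustration. Only the roles of eps, k, p, and sigma_override (defaulting to eps) come from the list above; topk is left out because its meaning is not described here.

```rust
/// Hypothetical sketch: connect items whose lambdas differ by at most `eps`,
/// keep at most `k` closest neighbours per item, and weight each edge with an
/// ASSUMED kernel w = 1 / (1 + (|dλ| / sigma)^p).
pub fn lambda_edges(
    lambdas: &[f64],
    eps: f64,
    k: usize,
    p: f64,
    sigma_override: Option<f64>,
) -> Vec<Vec<(usize, f64)>> {
    // Per the parameter list, sigma defaults to eps when no override is given.
    let sigma = sigma_override.unwrap_or(eps);
    let mut adjacency: Vec<Vec<(usize, f64)>> = Vec::with_capacity(lambdas.len());
    for (i, &li) in lambdas.iter().enumerate() {
        // Collect every other item within the |dλ| threshold.
        let mut candidates: Vec<(usize, f64)> = lambdas
            .iter()
            .enumerate()
            .filter(|&(j, _)| j != i)
            .map(|(j, &lj)| (j, (li - lj).abs()))
            .filter(|&(_, d)| d <= eps)
            .collect();
        // Closest lambdas first, then cap the neighbour count at k.
        candidates.sort_by(|a, b| a.1.total_cmp(&b.1));
        candidates.truncate(k);
        // Turn lambda gaps into weights with the assumed kernel.
        let row: Vec<(usize, f64)> = candidates
            .into_iter()
            .map(|(j, d)| (j, 1.0 / (1.0 + (d / sigma).powf(p))))
            .collect();
        adjacency.push(row);
    }
    adjacency
}
```

The sketch does not symmetrize the neighbour lists, so an edge may appear in one direction only; a real graph builder would likely need that extra step before forming a Laplacian.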
pub fn with_synthesis(self, tau_mode: TauMode) -> Self
Optional: overrides the default tau policy, or the tau value, used for the synthetic index.
pub fn with_normalisation(self, normalise: bool) -> Self
pub fn with_spectral(self, compute_spectral: bool) -> Self
Optional: sets whether the spectral matrix is built at build time. This is expensive because it requires computing the Laplacian twice; use it only on small datasets for analysis, exploration, and data QA.
pub fn with_sparsity_check(self, sparsity_check: bool) -> Self
pub fn with_inline_sampling(self, sampling: Option<SamplerType>) -> Self
pub fn with_dims_reduction(self, enable: bool, eps: Option<f64>) -> Self
pub fn with_seed(self, seed: u64) -> Self
Sets a custom seed and enables sequential (deterministic) clustering. This ensures reproducible results at the cost of parallelization.
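The with_* methods above all take self by value and return Self, which is what allows calls to be chained. The following is a minimal sketch of that consuming-builder pattern; MiniBuilder and its fields are hypothetical stand-ins, not the crate's ArrowSpaceBuilder or its private fields.

```rust
/// Hypothetical miniature of a consuming builder; NOT the crate's struct.
#[derive(Debug, Clone, PartialEq)]
pub struct MiniBuilder {
    pub compute_spectral: bool,
    pub seed: Option<u64>,
}

impl MiniBuilder {
    /// Start from defaults: no spectral matrix, no fixed seed.
    pub fn new() -> Self {
        Self { compute_spectral: false, seed: None }
    }

    /// Takes `self` by value and hands it back, so calls can be chained.
    pub fn with_spectral(mut self, compute_spectral: bool) -> Self {
        self.compute_spectral = compute_spectral;
        self
    }

    /// Records the seed; a real builder would also switch to sequential work.
    pub fn with_seed(mut self, seed: u64) -> Self {
        self.seed = Some(seed);
        self
    }
}
```

With this shape, MiniBuilder::new().with_seed(42).with_spectral(true) yields a value whose seed is Some(42) and whose compute_spectral flag is true; the order of the setters does not matter.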
pub fn build(self, rows: Vec<Vec<f64>>) -> (ArrowSpace, GraphLaplacian)
Build the ArrowSpace and the selected Laplacian (if any).
Priority order for graph selection:
1. prebuilt Laplacian (if provided)
2. hypergraph clique/normalized (if provided)
3. fallback: λτ-graph-from-data (with_lambda_graph config or defaults)
Behavior:
- If fallback (#3) is selected, synthetic lambdas are always computed using TauMode::Median unless with_synthesis was called, in which case the provided tau_mode and alpha are used.
- If prebuilt or hypergraph graph is selected, standard Rayleigh lambdas are computed unless with_synthesis was called, in which case synthetic lambdas are computed on that graph.
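The priority order and lambda behavior described above can be summarized as a small decision function. This is a sketch under stated assumptions, not the crate's code: the enums Tau, GraphSource, and Lambdas and the function select are hypothetical names, and the boolean inputs stand in for whatever the builder actually stores.

```rust
/// Hypothetical stand-in for the tau policy passed to with_synthesis.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tau {
    Mean,
    Median,
    Max,
}

/// Hypothetical labels for the three graph sources listed above.
#[derive(Debug, PartialEq)]
pub enum GraphSource {
    Prebuilt,
    Hypergraph,
    LambdaTauFromData,
}

/// Which kind of lambdas end up being computed.
#[derive(Debug, PartialEq)]
pub enum Lambdas {
    Rayleigh,
    Synthetic(Tau),
}

pub fn select(
    has_prebuilt: bool,
    has_hypergraph: bool,
    synthesis: Option<Tau>,
) -> (GraphSource, Lambdas) {
    // Priority: prebuilt first, then hypergraph, then the data-driven fallback.
    let source = if has_prebuilt {
        GraphSource::Prebuilt
    } else if has_hypergraph {
        GraphSource::Hypergraph
    } else {
        GraphSource::LambdaTauFromData
    };
    let lambdas = match (&source, synthesis) {
        // Fallback graph: always synthetic, Median unless overridden.
        (GraphSource::LambdaTauFromData, tau) => Lambdas::Synthetic(tau.unwrap_or(Tau::Median)),
        // Prebuilt or hypergraph: synthetic only when explicitly requested.
        (_, Some(tau)) => Lambdas::Synthetic(tau),
        (_, None) => Lambdas::Rayleigh,
    };
    (source, lambdas)
}
```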
impl ArrowSpaceBuilder
pub fn builder_config_typed(&self) -> HashMap<String, ConfigValue>
Trait Implementations§
impl ClusteringHeuristic for ArrowSpaceBuilder
fn compute_optimal_k(
    &self,
    rows: &[Vec<f64>],
    n: usize,
    f: usize,
    seed_override: Option<u64>,
) -> (usize, f64, usize)
where
    Self: Sync,
fn step1_bounds(&self, rows: &[Vec<f64>], n: usize, f: usize, base_seed: u64) -> (usize, usize, usize)
fn estimate_intrinsic_dimension(
    &self,
    rows: &[Vec<f64>],
    n: usize,
    f: usize,
    base_seed: u64,
) -> usize
fn step2_calinski_harabasz(
    &self,
    rows: &[Vec<f64>],
    k_min: usize,
    k_max: usize,
    base_seed: u64,
) -> usize
where
    Self: Sync,
fn calinski_harabasz_score(&self, rows: &[Vec<f64>], assignments: &[usize], k: usize) -> f64
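The page gives no description for calinski_harabasz_score, so the following is only a sketch of the textbook Calinski-Harabasz index, CH = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster and W the within-cluster sum of squares. The free function calinski_harabasz and the helper sq_dist are hypothetical; the handling of degenerate inputs (returning 0.0 or infinity) is an assumption of this sketch, and the trait method may differ.

```rust
/// Squared Euclidean distance between two equally long slices.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Textbook Calinski-Harabasz index; assumes every assignment is < k
/// and every row has the same number of features.
pub fn calinski_harabasz(rows: &[Vec<f64>], assignments: &[usize], k: usize) -> f64 {
    let n = rows.len();
    if n == 0 || k < 2 || n <= k {
        return 0.0; // index undefined here; 0.0 is this sketch's choice
    }
    let dims = rows[0].len();
    let mut global = vec![0.0; dims];
    let mut centroids = vec![vec![0.0; dims]; k];
    let mut counts = vec![0usize; k];
    // Accumulate sums for the global mean and for each cluster centroid.
    for (row, &c) in rows.iter().zip(assignments) {
        counts[c] += 1;
        for d in 0..dims {
            global[d] += row[d];
            centroids[c][d] += row[d];
        }
    }
    for g in global.iter_mut() {
        *g /= n as f64;
    }
    for (c, centroid) in centroids.iter_mut().enumerate() {
        if counts[c] > 0 {
            for v in centroid.iter_mut() {
                *v /= counts[c] as f64;
            }
        }
    }
    // B: size-weighted squared distances of centroids to the global mean.
    let between: f64 = (0..k)
        .map(|c| counts[c] as f64 * sq_dist(&centroids[c], &global))
        .sum();
    // W: squared distances of points to their own centroid.
    let within: f64 = rows
        .iter()
        .zip(assignments)
        .map(|(row, &c)| sq_dist(row, &centroids[c]))
        .sum();
    if within == 0.0 {
        return f64::INFINITY; // perfectly tight clusters
    }
    (between / (k as f64 - 1.0)) / (within / (n - k) as f64)
}
```

A higher score indicates tighter, better-separated clusters, which is why such an index is commonly used to compare candidate values of k.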
fn compute_threshold_from_pilot(&self, rows: &[Vec<f64>], k: usize, base_seed: u64) -> f64
impl Default for ArrowSpaceBuilder
impl Display for ArrowSpaceBuilder
fn fmt(&self, f: &mut Formatter<'_>) -> Result
Format ArrowSpaceBuilder as comma-separated key=value pairs (cookie-style).
Output format: “key1=value1, key2=value2, …”. This format can be parsed into a HashMap<String, String> using cookie parsers or simple string splitting.
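The "simple string splitting" route can be sketched with standard-library methods only. The helper parse_pairs below is hypothetical and is not the crate's parse_builder_config; it assumes that neither keys nor values contain commas, and it silently skips fragments without an '=' sign.

```rust
use std::collections::HashMap;

/// Hypothetical parser for "key1=value1, key2=value2" strings.
/// Assumes no commas inside keys or values.
pub fn parse_pairs(s: &str) -> HashMap<String, String> {
    s.split(',')
        .filter_map(|pair| {
            // Split at the first '='; fragments without one are dropped.
            let (key, value) = pair.split_once('=')?;
            Some((key.trim().to_string(), value.trim().to_string()))
        })
        .collect()
}
```

For the input "eps=0.5, k=10, tau=Median" this produces three entries, with "k" mapped to "10". All values stay strings; converting them to numbers is left to the caller.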
§Example
let builder = ArrowSpaceBuilder::new()
.with_synthesis(TauMode::Median);
let config_string = builder.to_string();
println!("{}", config_string);
// Parse back to HashMap
let config_map: HashMap<String, String> = parse_builder_config(&config_string);
impl EnergyMapsBuilder for ArrowSpaceBuilder
fn build_energy(
    &mut self,
    rows: Vec<Vec<f64>>,
    energy_params: EnergyParams,
) -> (ArrowSpace, GraphLaplacian)
Build an ArrowSpace index using the energy-only pipeline (no cosine similarity).
This method constructs a graph-based index where edges are weighted purely by energy features: node lambda (Rayleigh quotient), dispersion (edge concentration), and Dirichlet smoothness. The pipeline completely removes cosine similarity dependence from both construction and search.
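The node lambda mentioned above is a Rayleigh quotient, λ = xᵀLx / xᵀx, for a feature vector x and a graph Laplacian L. The sketch below computes it for a dense Laplacian stored as rows of f64; that storage layout, the function name rayleigh_quotient, and the choice to return 0.0 for a zero vector are assumptions for illustration, since the crate's GraphLaplacian type is not shown on this page.

```rust
/// Rayleigh quotient x^T L x / x^T x for a dense Laplacian.
/// Returns 0.0 for the zero vector (a choice made for this sketch).
pub fn rayleigh_quotient(laplacian: &[Vec<f64>], x: &[f64]) -> f64 {
    let denom: f64 = x.iter().map(|v| v * v).sum();
    if denom == 0.0 {
        return 0.0;
    }
    let numer: f64 = laplacian
        .iter()
        .zip(x)
        .map(|(row, &xi)| {
            // (L x)_i, then multiply by x_i to accumulate x^T L x.
            let lx_i: f64 = row.iter().zip(x).map(|(l, xj)| l * xj).sum();
            xi * lx_i
        })
        .sum();
    numer / denom
}
```

On a three-node path graph, a constant vector gives 0 (perfectly smooth over the graph), while the vector (1, 0, -1) gives 1, reflecting more variation across edges.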
§Pipeline Stages
- Clustering & Projection: Runs incremental clustering with optional JL dimensionality reduction to produce a compact centroid representation.
- Optical Compression (optional): If energy_params.optical_tokens is set, applies 2D spatial binning with low-activation pooling inspired by DeepSeek-OCR to further compress centroids while preserving structural information.
- Bootstrap Laplacian L₀: Builds an initial Euclidean kNN Laplacian over centroids in the (possibly projected) feature space using neutral distance metrics.
- Diffusion & Sub-Centroid Generation: Applies heat-flow diffusion over L₀ to smooth the centroid manifold, then splits high-dispersion nodes along local gradients to generate sub-centroids that better capture local geometry.
- Energy Laplacian Construction: Builds the final graph where edge weights are computed from energy distances: d = w_λ·|Δλ| + w_G·|ΔG| + w_D·Dirichlet(Δfeatures), using parallel candidate pruning and symmetric kNN with DashMap for efficiency.
- Taumode Lambda Computation: Computes per-item Rayleigh quotients (lambdas) over the energy graph using the selected synthesis mode (Mean/Median/Max), enabling energy-aware ranking during search.
- …
- …
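The energy distance from the Energy Laplacian Construction stage, d = w_λ·|Δλ| + w_G·|ΔG| + w_D·Dirichlet(Δfeatures), can be sketched as a plain function. The struct EnergyWeights, its field names, and the function energy_distance are hypothetical. The page does not define the Dirichlet term, so this sketch assumes it is the sum of squared feature differences; the crate may define it differently.

```rust
/// Hypothetical container for the three weights in the distance formula.
pub struct EnergyWeights {
    pub w_lambda: f64,
    pub w_disp: f64,
    pub w_dirichlet: f64,
}

/// d = w_lambda * |dλ| + w_disp * |dG| + w_dirichlet * Dirichlet(d features).
/// ASSUMPTION: the Dirichlet term is the sum of squared feature differences.
pub fn energy_distance(
    w: &EnergyWeights,
    lambda_a: f64,
    lambda_b: f64,
    disp_a: f64,
    disp_b: f64,
    feat_a: &[f64],
    feat_b: &[f64],
) -> f64 {
    let dirichlet: f64 = feat_a
        .iter()
        .zip(feat_b)
        .map(|(a, b)| (a - b) * (a - b))
        .sum();
    w.w_lambda * (lambda_a - lambda_b).abs()
        + w.w_disp * (disp_a - disp_b).abs()
        + w.w_dirichlet * dirichlet
}
```

Because every term is non-negative when the weights are, two nodes with identical lambda, dispersion, and features get distance zero regardless of the weights chosen.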
This pipeline is roughly 2x to 3x slower than build(...).
fn build_energy_laplacian(
    &self,
    sub_centroids: &DenseMatrix<f64>,
    energy_params: &EnergyParams,
) -> (GraphLaplacian, Vec<f64>, Vec<f64>)
Auto Trait Implementations§
impl Freeze for ArrowSpaceBuilder
impl RefUnwindSafe for ArrowSpaceBuilder
impl Send for ArrowSpaceBuilder
impl Sync for ArrowSpaceBuilder
impl Unpin for ArrowSpaceBuilder
impl UnwindSafe for ArrowSpaceBuilder
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
unsafe fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.