pub struct SphereQLPipeline { /* private fields */ }
The main SphereQL pipeline: fitted projection + spatial index + category enrichment layer + optional tunable config.
Build one with Self::new for defaults,
Self::new_with_config for an explicit PipelineConfig, or
Self::new_from_metamodel / Self::new_from_metamodel_tuned
to consult a meta-model trained on past tuner runs.
Implementations§
impl SphereQLPipeline
pub fn new(input: PipelineInput) -> Result<Self, PipelineError>
Build a pipeline from raw inputs with PipelineConfig::default.
- input.categories[i] is the category for sentence i
- input.embeddings[i] is the embedding vector for sentence i
- All embedding vectors must have the same dimensionality (>= 3).
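The input invariants above can be checked before construction. The following is a minimal sketch, not the crate's own validation: `InputError` and `validate_input` are hypothetical names, embeddings are assumed to be plain `Vec<f64>` rows, and treating an empty corpus as an error is an assumption.

```rust
/// Hypothetical error type for this sketch; the crate's real
/// `PipelineError` variants are not listed on this page.
#[derive(Debug, PartialEq)]
pub enum InputError {
    Empty,
    LengthMismatch { categories: usize, embeddings: usize },
    DimensionTooSmall { dim: usize },
    RaggedEmbedding { index: usize, expected: usize, found: usize },
}

/// Check that categories and embeddings are parallel, that every
/// embedding has the same length, and that the length is at least 3.
/// Returns the shared dimensionality on success.
pub fn validate_input(
    categories: &[String],
    embeddings: &[Vec<f64>],
) -> Result<usize, InputError> {
    if categories.len() != embeddings.len() {
        return Err(InputError::LengthMismatch {
            categories: categories.len(),
            embeddings: embeddings.len(),
        });
    }
    // Assumption: an empty corpus is rejected rather than accepted.
    let first = embeddings.first().ok_or(InputError::Empty)?;
    let dim = first.len();
    if dim < 3 {
        return Err(InputError::DimensionTooSmall { dim });
    }
    for (index, e) in embeddings.iter().enumerate() {
        if e.len() != dim {
            return Err(InputError::RaggedEmbedding {
                index,
                expected: dim,
                found: e.len(),
            });
        }
    }
    Ok(dim)
}
```

Running a check like this up front gives callers a specific reason for rejection before any projection is fitted.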
pub fn new_with_config(
    input: PipelineInput,
    config: PipelineConfig,
) -> Result<Self, PipelineError>
Build a pipeline with an explicit configuration. Fits the projection
internally using PipelineConfig::projection_kind and any relevant
sub-config (e.g. LaplacianConfig).
pub fn new_from_metamodel<M: MetaModel>(
    input: PipelineInput,
    model: &M,
) -> Result<(Self, CorpusFeatures, PipelineConfig), PipelineError>
Build a pipeline using a config predicted by a MetaModel.
Extracts CorpusFeatures from the input, asks the model for a
predicted PipelineConfig, then builds the pipeline with it.
Returns the pipeline alongside the extracted features and the
predicted config so the caller can log, audit, or save them as a
new MetaTrainingRecord.
This is the “tune-or-recall” entry point: once you’ve accumulated
a handful of training records, call this instead of
crate::tuner::auto_tune when you want to skip
search entirely. For a warm-start hybrid that does some tuning
on top of the prediction, use Self::new_from_metamodel_tuned.
pub fn new_from_metamodel_tuned<M, Q>(
    input: PipelineInput,
    model: &M,
    space: &SearchSpace,
    metric: &Q,
    strategy: SearchStrategy,
) -> Result<(Self, CorpusFeatures, TuneReport), PipelineError>
where
    M: MetaModel,
    Q: QualityMetric,
Warm-started hybrid: predict a config with model, then run a
small-budget tuner pass using that prediction as base_config.
The prediction supplies values only for knobs NOT enumerated by
space — any knob the space lists is searched cold across its
axes, and the predicted value for it is ignored. Under
SearchStrategy::Random and SearchStrategy::Bayesian the
predicted config itself is additionally evaluated as trial 0
(counted against the budget), so it competes directly with the
searched candidates; SearchStrategy::Grid skips that seed
trial to keep its trial set the exact Cartesian enumeration.
Returns the winning pipeline, the extracted corpus features, and
the full TuneReport. Callers can feed the report back into
MetaTrainingRecord::from_tune_result
to accumulate more training data for the next recall.
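The interaction between the predicted config and the search space can be illustrated with a toy model. This is a sketch under loud assumptions: a config is reduced to a map of `f64` knobs, and `Config`, `grid_trials`, and `seeded_trials` are hypothetical names that do not reflect the crate's real `PipelineConfig`, `SearchSpace`, or tuner internals.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a pipeline config: knob name -> value (assumption).
pub type Config = BTreeMap<String, f64>;

/// Grid-style trial set: the exact Cartesian product of the axes in
/// `space`, layered on top of `base`. Knobs listed in `space` overwrite
/// whatever `base` predicted; knobs not listed keep the predicted value.
pub fn grid_trials(base: &Config, space: &BTreeMap<String, Vec<f64>>) -> Vec<Config> {
    let mut trials = vec![base.clone()];
    for (knob, values) in space {
        let mut next = Vec::with_capacity(trials.len() * values.len());
        for partial in &trials {
            for value in values {
                let mut candidate = partial.clone();
                candidate.insert(knob.clone(), *value);
                next.push(candidate);
            }
        }
        trials = next;
    }
    trials
}

/// Random/Bayesian-style trial list: the predicted config is trial 0
/// and counts against the budget, followed by the sampled candidates.
pub fn seeded_trials(predicted: &Config, sampled: Vec<Config>, budget: usize) -> Vec<Config> {
    let mut trials = Vec::with_capacity(sampled.len() + 1);
    trials.push(predicted.clone());
    trials.extend(sampled);
    trials.truncate(budget);
    trials
}
```

In this model a grid over two axes of sizes 2 and 3 always yields exactly 6 trials, none of which is the unmodified prediction, while the seeded list spends one unit of budget on the prediction itself.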
pub fn with_projection(
    categories: Vec<String>,
    embeddings: Vec<Embedding>,
    pca: PcaProjection,
) -> Result<Self, PipelineError>
Build a pipeline from pre-computed embeddings and an existing PCA
projection, with PipelineConfig::default.
This is the legacy entry point — use
Self::with_configured_projection_and_config directly when you
have a non-PCA ConfiguredProjection.
pub fn with_projection_and_config(
    categories: Vec<String>,
    embeddings: Vec<Embedding>,
    pca: PcaProjection,
    config: PipelineConfig,
) -> Result<Self, PipelineError>
Legacy configurable PCA entry point. Prefer
Self::with_configured_projection_and_config for new code.
pub fn with_configured_projection_and_config(
    categories: Vec<String>,
    embeddings: Vec<Embedding>,
    projection: ConfiguredProjection,
    config: PipelineConfig,
) -> Result<Self, PipelineError>
Core pipeline constructor: accepts any ConfiguredProjection and
a PipelineConfig.
pub fn has_category(&self, name: &str) -> bool
True if name is a known category in this pipeline. Pair with
Self::query to disambiguate “unknown category” from
“category exists but is disconnected on the graph” without
pattern-matching on PipelineError::UnknownCategory.
pub fn ids(&self) -> &[String]
All indexed item ids, in the order they were inserted (i.e. parallel
to the input embeddings/categories). Currently auto-generated as
s-{i:04} strings; callers that need stable mapping back to their
own ids should keep their own parallel array.
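A minimal sketch of the id scheme and of the parallel-array mapping suggested above. It assumes the ids are produced with Rust's zero-padded width specifier exactly as the `s-{i:04}` pattern reads; `auto_id` and `id_lookup` are hypothetical helpers, not part of the crate.

```rust
use std::collections::HashMap;

/// Format the auto-generated id for the item at position `i`,
/// following the `s-{i:04}` pattern (zero-padded to at least 4 digits).
pub fn auto_id(i: usize) -> String {
    format!("s-{i:04}")
}

/// Build a lookup from auto-generated id to the caller's own id.
/// `own_ids` is the caller's parallel array, in the same order as the
/// embeddings that were passed to the pipeline.
pub fn id_lookup(own_ids: &[&str]) -> HashMap<String, String> {
    own_ids
        .iter()
        .enumerate()
        .map(|(i, own)| (auto_id(i), own.to_string()))
        .collect()
}
```

Under that assumption the width is a minimum rather than a fixed size, so index 12345 would format as `s-12345`. Because the documentation describes the scheme as current behavior only, a lookup table like this is safer than parsing the index back out of the string.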
pub fn query(
    &self,
    q: SphereQLQuery<'_>,
    query_embedding: &PipelineQuery,
) -> Result<SphereQLOutput, PipelineError>
Execute a typed query against the pipeline.
Returns PipelineError::UnknownCategory when a category
query references a name not in the pipeline, and
PipelineError::UnknownId when a concept-path query
references an id not in the index. Previously those paths
collapsed into empty results / None, which callers couldn’t
distinguish from legitimate “found nothing” outcomes.
pub fn categories(&self) -> &[String]
Slice of per-item category labels (index-aligned with insertion order).
pub fn projected_points(&self) -> Vec<(&str, &str, [f64; 3])>
Export (id, category, cartesian [x, y, z]) triples for every indexed item.
pub fn projection(&self) -> &ConfiguredProjection
Borrow the fitted projection regardless of kind.
Returns a &ConfiguredProjection, which implements the
crate::projection::Projection trait — so most
callers never need to pattern-match on the enum. The old
.pca() accessor was removed because it panicked under any
non-PCA config and every caller already worked through this
method or its trait impl.
pub fn projection_kind(&self) -> ProjectionKind
Active outer-sphere projection kind.
pub fn exported_points(&self) -> Vec<ExportedPoint>
Export all projected points with their Cartesian and spherical coordinates.
Returns one ExportedPoint per indexed item, in insertion order.
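For readers who need to relate the two coordinate forms, here is a sketch of a Cartesian-to-spherical conversion. The convention is an assumption (azimuth `theta` from `atan2(y, x)`, polar angle `phi` measured from the +z axis); this page does not state which convention `ExportedPoint` uses, and `cartesian_to_spherical` is a hypothetical helper.

```rust
/// Convert Cartesian [x, y, z] to (r, theta, phi).
/// Assumed convention: theta is the azimuth in (-pi, pi],
/// phi is the polar angle from the +z axis in [0, pi].
pub fn cartesian_to_spherical(p: [f64; 3]) -> (f64, f64, f64) {
    let [x, y, z] = p;
    let r = (x * x + y * y + z * z).sqrt();
    if r == 0.0 {
        // The angles are undefined at the origin; return zeros by choice.
        return (0.0, 0.0, 0.0);
    }
    let theta = y.atan2(x);
    // Clamp guards against rounding pushing the ratio just outside [-1, 1].
    let phi = (z / r).clamp(-1.0, 1.0).acos();
    (r, theta, phi)
}
```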
pub fn explained_variance_ratio(&self) -> f64
The active projection’s explained-variance-ratio-equivalent
quality score, in [0, 1]. PCA returns the classical EVR;
kernel PCA returns its kernel-space EVR; Laplacian eigenmap
returns a compatible connectivity ratio (see
LaplacianEigenmapProjection::connectivity_ratio);
UMAP returns its kNN-recall — the fraction of each point’s
high-dimensional neighbors preserved on the sphere. All four feed
the EVR-adaptive thresholds downstream.
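Two of the scores named above have simple textbook definitions, sketched here for orientation. These are illustrative helpers, not the crate's implementation: `classical_evr` assumes the covariance eigenvalues are already available, and `knn_recall` assumes neighbor lists are given as index vectors per point.

```rust
/// Classical EVR: share of total variance captured by the `kept`
/// largest eigenvalues of the covariance matrix.
pub fn classical_evr(eigenvalues: &[f64], kept: usize) -> f64 {
    let mut sorted = eigenvalues.to_vec();
    sorted.sort_by(|a, b| b.total_cmp(a)); // descending
    let total: f64 = sorted.iter().sum();
    if total <= 0.0 {
        return 0.0;
    }
    let top: f64 = sorted.iter().take(kept).sum();
    (top / total).clamp(0.0, 1.0)
}

/// kNN recall: average fraction of each point's high-dimensional
/// neighbors that also appear among its neighbors after projection.
/// `high[i]` and `low[i]` are the neighbor index lists of point `i`.
pub fn knn_recall(high: &[Vec<usize>], low: &[Vec<usize>]) -> f64 {
    let mut sum = 0.0;
    let mut counted = 0usize;
    for (h, l) in high.iter().zip(low) {
        if h.is_empty() {
            continue; // a point with no neighbors contributes nothing
        }
        let preserved = h.iter().filter(|&&i| l.contains(&i)).count();
        sum += preserved as f64 / h.len() as f64;
        counted += 1;
    }
    if counted == 0 { 0.0 } else { sum / counted as f64 }
}
```

Both functions return a value in [0, 1], which is what lets differently defined scores feed the same thresholds.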
pub fn num_categories(&self) -> usize
Number of unique categories in the corpus.
pub fn unique_categories(&self) -> Vec<String>
Unique category names in insertion order.
pub fn category_layer(&self) -> &CategoryLayer
Access the category enrichment layer directly.
pub fn category_path(&self, source: &str, target: &str) -> Option<CategoryPath>
Shortcut: find the shortest path between two categories.
pub fn bridge_items(
    &self,
    source: &str,
    target: &str,
    max: usize,
) -> Vec<&BridgeItem>
Shortcut: get bridge items between two categories.
pub fn has_inner_sphere(&self, category: &str) -> bool
Shortcut: check if a category has an inner sphere.
pub fn num_inner_spheres(&self) -> usize
Shortcut: number of categories with inner spheres.
pub fn inner_sphere_stats(&self) -> Vec<InnerSphereReport>
Shortcut: inner sphere statistics for all categories.
pub fn projection_warnings(&self) -> &[ProjectionWarning]
Projection quality warnings. Empty if EVR is above threshold.
pub fn raw_embeddings(&self) -> Option<&[Vec<f64>]>
Returns the original high-dimensional embeddings if the retain-embeddings
feature was active at construction time. The returned slice is aligned with
ids(), categories(), and projected_points().
Returns None if the feature was not active or embeddings were not retained.
pub fn embedding_dim(&self) -> usize
Embedding dimensionality (length of each embedding vector), or 0 if embeddings are not retained.
pub fn pairwise_similarities(&self) -> Option<Result<Vec<f64>, SphereQlError>>
Compute the pairwise cosine similarity matrix from the retained raw
embeddings. Returns the upper triangle as a flat vector aligned with
ids() ordering.
Returns None if embeddings were not retained (feature retain-embeddings
not active at construction).
Returns Err if the stored embeddings have mismatched dimensions
(should not happen if the pipeline was constructed correctly).
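A sketch of how a flat upper-triangle layout can be built and indexed. The exact layout is an assumption: this page does not say whether the diagonal is included or how rows are ordered, so the code below uses the strict upper triangle (i < j) in row-major order. `upper_triangle` and `flat_index` are hypothetical helpers.

```rust
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    // Choice for this sketch: a zero vector has similarity 0 to everything.
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Strict upper triangle (i < j), row-major, as one flat vector of
/// length n * (n - 1) / 2. Errors if the embeddings are ragged.
pub fn upper_triangle(embeddings: &[Vec<f64>]) -> Result<Vec<f64>, String> {
    let n = embeddings.len();
    let dim = embeddings.first().map_or(0, |e| e.len());
    if let Some(bad) = embeddings.iter().position(|e| e.len() != dim) {
        return Err(format!(
            "embedding {bad} has length {}, expected {dim}",
            embeddings[bad].len()
        ));
    }
    let mut out = Vec::with_capacity(n * n.saturating_sub(1) / 2);
    for i in 0..n {
        for j in (i + 1)..n {
            out.push(cosine(&embeddings[i], &embeddings[j]));
        }
    }
    Ok(out)
}

/// Position of pair (i, j), with i < j < n, in the flat vector above.
pub fn flat_index(n: usize, i: usize, j: usize) -> usize {
    debug_assert!(i < j && j < n);
    i * (2 * n - i - 1) / 2 + (j - i - 1)
}
```

Storing only one triangle roughly halves the memory of the full symmetric matrix, at the cost of an index calculation on lookup.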
pub fn nearest_by_embedding(
    &self,
    query_embedding: &[f64],
    k: usize,
) -> Option<Result<Vec<(usize, f64)>, SphereQlError>>
Find the k concepts most similar to query_embedding by cosine
similarity in the original embedding space (not the projected space).
Returns (index, similarity) pairs sorted by descending similarity,
where index aligns with ids(), categories(), and projected_points().
Returns None if embeddings were not retained.
Returns Err(DimensionMismatch) if query_embedding.len() differs from
the stored embedding dimensionality.
Cost: scans every retained embedding — O(N·D) similarity
computations — with a size-k heap keeping selection at
O(N log k) time and O(k) extra allocation.
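The cost profile described above matches the standard bounded-heap top-k pattern. The following is a sketch of that pattern using only the standard library; it is not the crate's code, and `Scored` and `top_k_by_cosine` are hypothetical names with a `String` error standing in for `SphereQlError`.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

/// Heap entry ordered by similarity, then index. `f64` is not `Ord`,
/// so the ordering goes through `f64::total_cmp`.
struct Scored {
    sim: f64,
    index: usize,
}

impl Ord for Scored {
    fn cmp(&self, other: &Self) -> Ordering {
        self.sim
            .total_cmp(&other.sim)
            .then_with(|| self.index.cmp(&other.index))
    }
}
impl PartialOrd for Scored {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Scored {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Scored {}

fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Scan every embedding once, keeping the k best in a min-heap.
/// Returns (index, similarity) pairs sorted by descending similarity.
pub fn top_k_by_cosine(
    embeddings: &[Vec<f64>],
    query: &[f64],
    k: usize,
) -> Result<Vec<(usize, f64)>, String> {
    let mut heap: BinaryHeap<Reverse<Scored>> =
        BinaryHeap::with_capacity(k.min(embeddings.len()) + 1);
    for (index, e) in embeddings.iter().enumerate() {
        if e.len() != query.len() {
            return Err(format!(
                "dimension mismatch: stored {}, query {}",
                e.len(),
                query.len()
            ));
        }
        if k == 0 {
            continue;
        }
        heap.push(Reverse(Scored { sim: cosine(e, query), index }));
        if heap.len() > k {
            heap.pop(); // `Reverse` puts the least similar entry on top
        }
    }
    let mut out: Vec<(usize, f64)> =
        heap.into_iter().map(|Reverse(s)| (s.index, s.sim)).collect();
    out.sort_by(|a, b| b.1.total_cmp(&a.1));
    Ok(out)
}
```

Wrapping entries in `Reverse` turns the standard max-heap into a min-heap, so the weakest of the current k candidates is always the one evicted.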
pub fn domain_groups(&self) -> &[DomainGroup]
Coarse-grained domain groups detected from Voronoi adjacency + cap overlap.
Single source of truth: the same vector used by default_nearest’s
inner-sphere routing and hierarchical_nearest’s drill-down.
pub fn route_to_group(&self, embedding: &Embedding) -> Option<&DomainGroup>
Coarse routing: find the domain group whose centroid is angularly nearest to the query’s projected position.
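Angular-nearest routing can be sketched with plain 3-vectors. This is an illustration only: it assumes the query has already been projected to Cartesian coordinates and that group centroids are available as `[f64; 3]`; `angular_distance` and `nearest_centroid` are hypothetical helpers rather than crate functions.

```rust
fn norm(v: [f64; 3]) -> f64 {
    (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt()
}

/// Angle in radians between two vectors, in [0, pi].
pub fn angular_distance(a: [f64; 3], b: [f64; 3]) -> f64 {
    let denom = norm(a) * norm(b);
    if denom == 0.0 {
        // Choice for this sketch: a zero vector is maximally far away.
        return std::f64::consts::PI;
    }
    let dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    (dot / denom).clamp(-1.0, 1.0).acos()
}

/// Index of the centroid with the smallest angular distance to `query`,
/// or `None` when there are no centroids.
pub fn nearest_centroid(query: [f64; 3], centroids: &[[f64; 3]]) -> Option<usize> {
    centroids
        .iter()
        .enumerate()
        .map(|(i, c)| (i, angular_distance(query, *c)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

Returning `Option` mirrors the signature above: with no groups detected there is nothing to route to.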
pub fn hierarchical_nearest(
    &self,
    embedding: &Embedding,
    k: usize,
) -> Vec<NearestResult>
Hierarchical nearest-neighbor search: group → category → items.
When EVR is at or above
RoutingConfig::low_evr_threshold,
this is a plain outer-sphere k-NN (identical to SphereQLQuery::Nearest).
Below that threshold the outer sphere is unreliable, so we:
- Route the query to its nearest domain group.
- Drill down into each member category using its inner sphere (or the outer sphere if none exists).
- Merge the per-category results, sort by distance, truncate to k.
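The final merge step in the list above can be sketched as follows. `Hit` is a hypothetical stand-in for `NearestResult`, whose fields are not shown on this page, and `merge_top_k` is an illustrative helper.

```rust
/// Hypothetical stand-in for `NearestResult` (assumed fields).
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: String,
    pub distance: f64,
}

/// Merge per-category result lists, sort by ascending distance,
/// and keep the k closest.
pub fn merge_top_k(per_category: Vec<Vec<Hit>>, k: usize) -> Vec<Hit> {
    let mut all: Vec<Hit> = per_category.into_iter().flatten().collect();
    all.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    all.truncate(k);
    all
}
```

Each category only needs to contribute its own k best candidates, since no item outside a category's top k can appear in the merged top k.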
pub fn default_nearest(
    &self,
    embedding: &Embedding,
    k: usize,
) -> Vec<NearestResult>
Default nearest path (v2 routing).
Routes the query to its closest domain group when the outer
projection’s EVR is below HIGH_EVR_ROUTING_BYPASS, the choice
is unambiguous (d_nearest / d_second_nearest < group_routing_alpha),
and the group has an inner sphere; otherwise falls back to plain
outer-sphere k-NN. At EVR ≥ HIGH_EVR_ROUTING_BYPASS routing is
bypassed entirely, since the outer angular distances are already more
accurate than any inner-sphere re-projection.
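The three routing conditions can be written as one decision function. This is a sketch of the rule as described, not the crate's code: `RoutingInputs`, `Route`, and `choose_route` are hypothetical, the threshold values are passed in because this page does not state them, and treating a non-positive second-nearest distance as ambiguous is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    InnerSphere,
    OuterKnn,
}

/// Inputs to the routing decision (hypothetical struct for this sketch).
#[derive(Debug, Clone, Copy)]
pub struct RoutingInputs {
    pub evr: f64,
    pub high_evr_bypass: f64,
    pub group_routing_alpha: f64,
    pub d_nearest: f64,
    pub d_second_nearest: f64,
    pub group_has_inner_sphere: bool,
}

pub fn choose_route(r: &RoutingInputs) -> Route {
    // High EVR: bypass routing entirely.
    if r.evr >= r.high_evr_bypass {
        return Route::OuterKnn;
    }
    // Unambiguous means the nearest group is clearly closer than the runner-up.
    // Assumption: a non-positive runner-up distance counts as ambiguous.
    let unambiguous = r.d_second_nearest > 0.0
        && r.d_nearest / r.d_second_nearest < r.group_routing_alpha;
    if unambiguous && r.group_has_inner_sphere {
        Route::InnerSphere
    } else {
        Route::OuterKnn
    }
}
```

All three conditions must hold for the inner-sphere path; failing any one of them falls back to the plain outer-sphere search.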
pub fn quality_config(&self) -> &QualityConfig
Current quality configuration.
pub fn set_quality_config(&mut self, config: QualityConfig)
Update the quality configuration (e.g., to enable filtering).
pub fn annotate_relations(&mut self, labels: &[String])
Annotate every bridge in the category layer with an inferred
RelationType.
labels[i] must describe the item whose BridgeItem::item_index is i,
i.e. labels must be index-aligned with the pipeline’s item list.
pub fn config(&self) -> &PipelineConfig
Full tunable configuration this pipeline was built with.
Auto Trait Implementations§
impl !Freeze for SphereQLPipeline
impl !RefUnwindSafe for SphereQLPipeline
impl !UnwindSafe for SphereQLPipeline
impl Send for SphereQLPipeline
impl Sync for SphereQLPipeline
impl Unpin for SphereQLPipeline
impl UnsafeUnpin for SphereQLPipeline
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.