Crate ruvector_data_framework

§RuVector Data Discovery Framework

Core traits and types for building dataset integrations with RuVector’s vector memory, graph structures, and dynamic minimum cut algorithms.

§Architecture

The framework provides three core abstractions:

  1. DataIngester: Streaming data ingestion with batched graph/vector updates
  2. CoherenceEngine: Real-time coherence signal computation using min-cut
  3. DiscoveryEngine: Pattern detection for emerging structures and anomalies
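To make the min-cut idea behind the coherence signal concrete, here is a minimal sketch of a global minimum cut computed with the static Stoer-Wagner algorithm. It is an illustration only and is not taken from this crate: the function name `global_min_cut` and the adjacency-matrix representation are assumptions, and the crate itself describes dynamic min-cut tracking rather than a recompute-from-scratch approach. The intuition is that a small cut value means the graph is held together by weak links, which can be read as low coherence.

```rust
/// Global minimum cut weight of an undirected weighted graph (Stoer-Wagner).
/// Hypothetical helper for illustration; not part of ruvector_data_framework.
/// `w` is a symmetric adjacency matrix where `w[i][j]` is the edge weight
/// (0.0 means no edge). Returns `None` for graphs with fewer than two vertices.
fn global_min_cut(mut w: Vec<Vec<f64>>) -> Option<f64> {
    let n = w.len();
    if n < 2 {
        return None;
    }
    // Vertices that have not yet been merged into another vertex.
    let mut active: Vec<usize> = (0..n).collect();
    let mut best = f64::INFINITY;

    while active.len() > 1 {
        let m = active.len();
        let mut added = vec![false; m];
        // conn[i] = total weight from active[i] to the vertices added so far.
        let mut conn = vec![0.0f64; m];
        // Order in which vertices join the growing set during this phase.
        let mut order: Vec<usize> = Vec::with_capacity(m);

        for _ in 0..m {
            // Pick the not-yet-added vertex most tightly connected to the set.
            let mut sel = usize::MAX;
            for i in 0..m {
                if !added[i] && (sel == usize::MAX || conn[i] > conn[sel]) {
                    sel = i;
                }
            }
            added[sel] = true;
            order.push(sel);
            if order.len() == m {
                // The last vertex added defines the "cut of the phase".
                best = best.min(conn[sel]);
            }
            for i in 0..m {
                if !added[i] {
                    conn[i] += w[active[sel]][active[i]];
                }
            }
        }

        // Merge the last-added vertex into the second-to-last one.
        let (prev, last) = (order[m - 2], order[m - 1]);
        let (s, t) = (active[prev], active[last]);
        for i in 0..n {
            w[s][i] += w[t][i];
            w[i][s] += w[i][t];
        }
        w[s][s] = 0.0;
        active.remove(last);
    }
    Some(best)
}
```

For two tightly linked pairs of vertices joined by a single weak edge, the function returns the weight of that weak edge, which is the kind of boundary a coherence signal is meant to surface.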

§Quick Start

use ruvector_data_framework::{
    DataIngester, CoherenceEngine, DiscoveryEngine,
    IngestionConfig, CoherenceConfig, DiscoveryConfig,
};

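// Configure the discovery pipeline.
// `ingestion_config`, `coherence_config`, `discovery_config`, and `source` (used below)
// are assumed to have been constructed beforehand; the `.await?` calls assume an
// async context that returns a `Result`.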
let ingester = DataIngester::new(ingestion_config);
let coherence = CoherenceEngine::new(coherence_config);
let discovery = DiscoveryEngine::new(discovery_config);

// Stream data and detect patterns
let stream = ingester.stream_from_source(source).await?;
let signals = coherence.compute_signals(stream).await?;
let patterns = discovery.detect_patterns(signals).await?;

Re-exports§

pub use academic_clients::CoreClient;
pub use academic_clients::EricClient;
pub use academic_clients::UnpaywallClient;
pub use api_clients::EdgarClient;
pub use api_clients::Embedder;
pub use api_clients::NoaaClient;
pub use api_clients::OpenAlexClient;
pub use api_clients::SimpleEmbedder;
pub use arxiv_client::ArxivClient;
pub use biorxiv_client::BiorxivClient;
pub use biorxiv_client::MedrxivClient;
pub use crossref_client::CrossRefClient;
pub use economic_clients::AlphaVantageClient;
pub use economic_clients::FredClient;
pub use economic_clients::WorldBankClient;
pub use finance_clients::BlsClient;
pub use finance_clients::CoinGeckoClient;
pub use finance_clients::EcbClient;
pub use finance_clients::FinnhubClient;
pub use finance_clients::TwelveDataClient;
pub use genomics_clients::EnsemblClient;
pub use genomics_clients::GwasClient;
pub use genomics_clients::NcbiClient;
pub use genomics_clients::UniProtClient;
pub use geospatial_clients::GeonamesClient;
pub use geospatial_clients::NominatimClient;
pub use geospatial_clients::OpenElevationClient;
pub use geospatial_clients::OverpassClient;
pub use government_clients::CensusClient;
pub use government_clients::DataGovClient;
pub use government_clients::EuOpenDataClient;
pub use government_clients::UkGovClient;
pub use government_clients::UNDataClient;
pub use government_clients::WorldBankClient as WorldBankGovClient;
pub use medical_clients::ClinicalTrialsClient;
pub use medical_clients::FdaClient;
pub use medical_clients::PubMedClient;
pub use ml_clients::HuggingFaceClient;
pub use ml_clients::HuggingFaceDataset;
pub use ml_clients::HuggingFaceModel;
pub use ml_clients::OllamaClient;
pub use ml_clients::OllamaModel;
pub use ml_clients::PapersWithCodeClient;
pub use ml_clients::PaperWithCodeDataset;
pub use ml_clients::PaperWithCodePaper;
pub use ml_clients::ReplicateClient;
pub use ml_clients::ReplicateModel;
pub use ml_clients::TogetherAiClient;
pub use ml_clients::TogetherModel;
pub use news_clients::GuardianClient;
pub use news_clients::HackerNewsClient;
pub use news_clients::NewsDataClient;
pub use news_clients::RedditClient;
pub use patent_clients::EpoClient;
pub use patent_clients::UsptoPatentClient;
pub use physics_clients::ArgoClient;
pub use physics_clients::CernOpenDataClient;
pub use physics_clients::GeoUtils;
pub use physics_clients::MaterialsProjectClient;
pub use physics_clients::UsgsEarthquakeClient;
pub use semantic_scholar::SemanticScholarClient;
pub use space_clients::AstronomyClient;
pub use space_clients::ExoplanetClient;
pub use space_clients::NasaClient;
pub use space_clients::SpaceXClient;
pub use transportation_clients::GtfsClient;
pub use transportation_clients::MobilityDatabaseClient;
pub use transportation_clients::OpenChargeMapClient;
pub use transportation_clients::OpenRouteServiceClient;
pub use wiki_clients::WikidataClient;
pub use wiki_clients::WikidataEntity;
pub use wiki_clients::WikipediaClient;
pub use coherence::CoherenceBoundary;
pub use coherence::CoherenceConfig;
pub use coherence::CoherenceEngine;
pub use coherence::CoherenceEvent;
pub use coherence::CoherenceSignal;
pub use cut_aware_hnsw::CutAwareHNSW;
pub use cut_aware_hnsw::CutAwareConfig;
pub use cut_aware_hnsw::CutAwareMetrics;
pub use cut_aware_hnsw::CoherenceZone;
pub use cut_aware_hnsw::SearchResult as CutAwareSearchResult;
pub use cut_aware_hnsw::EdgeUpdate as CutAwareEdgeUpdate;
pub use cut_aware_hnsw::UpdateKind;
pub use cut_aware_hnsw::LayerCutStats;
pub use discovery::DiscoveryConfig;
pub use discovery::DiscoveryEngine;
pub use discovery::DiscoveryPattern;
pub use discovery::PatternCategory;
pub use discovery::PatternStrength;
pub use dynamic_mincut::CutGatedSearch;
pub use dynamic_mincut::CutWatcherConfig;
pub use dynamic_mincut::DynamicCutWatcher;
pub use dynamic_mincut::DynamicMinCutError;
pub use dynamic_mincut::EdgeUpdate as DynamicEdgeUpdate;
pub use dynamic_mincut::EdgeUpdateType;
pub use dynamic_mincut::EulerTourTree;
pub use dynamic_mincut::HNSWGraph;
pub use dynamic_mincut::LocalCut;
pub use dynamic_mincut::LocalMinCutProcedure;
pub use dynamic_mincut::WatcherStats;
pub use export::export_all;
pub use export::export_coherence_csv;
pub use export::export_dot;
pub use export::export_graphml;
pub use export::export_patterns_csv;
pub use export::export_patterns_with_evidence_csv;
pub use export::ExportFilter;
pub use forecasting::CoherenceForecaster;
pub use forecasting::CrossDomainForecaster;
pub use forecasting::Forecast;
pub use forecasting::Trend;
pub use ingester::DataIngester;
pub use ingester::IngestionConfig;
pub use ingester::IngestionStats;
pub use ingester::SourceConfig;
pub use realtime::FeedItem;
pub use realtime::FeedSource;
pub use realtime::NewsAggregator;
pub use realtime::NewsSource;
pub use realtime::RealTimeEngine;
pub use ruvector_native::CoherenceHistoryEntry;
pub use ruvector_native::CoherenceSnapshot;
pub use ruvector_native::Domain;
pub use ruvector_native::DiscoveredPattern;
pub use ruvector_native::GraphExport;
pub use ruvector_native::NativeDiscoveryEngine;
pub use ruvector_native::NativeEngineConfig;
pub use ruvector_native::SemanticVector;
pub use streaming::StreamingConfig;
pub use streaming::StreamingEngine;
pub use streaming::StreamingEngineBuilder;
pub use streaming::StreamingMetrics;

Modules§

academic_clients
Academic & Research API clients for scholarly data discovery
api_clients
Real API client integrations for OpenAlex, NOAA, and SEC EDGAR
arxiv_client
ArXiv Preprint API Integration
biorxiv_client
bioRxiv and medRxiv Preprint API Integration
coherence
Coherence signal computation using dynamic minimum cut algorithms
crossref_client
CrossRef API Integration
cut_aware_hnsw
Cut-Aware HNSW: Dynamic Min-Cut Integration with Vector Search
discovery
Discovery engine for detecting novel patterns from coherence signals
dynamic_mincut
Dynamic Min-Cut Tracking for RuVector
economic_clients
Economic data API integrations for FRED, World Bank, and Alpha Vantage
export
Export module for RuVector Discovery Framework
finance_clients
Finance & Economics API integrations for market data and economic indicators
forecasting
genomics_clients
Genomics and DNA data API integrations for NCBI, UniProt, Ensembl, and GWAS Catalog
geospatial_clients
Geospatial & Mapping API integrations
government_clients
Government and International Organization API Integrations
hnsw
HNSW (Hierarchical Navigable Small World) Index
ingester
Data ingestion pipeline for streaming data into RuVector
mcp_server
MCP (Model Context Protocol) Server for RuVector Data Discovery
medical_clients
Medical data API integrations for PubMed, ClinicalTrials.gov, and FDA
ml_clients
AI/ML API Client Integrations
news_clients
News & Social Media API client integrations
optimized
Optimized Discovery Engine
patent_clients
Patent database API integrations for USPTO PatentsView and EPO
persistence
Persistence Layer for RuVector Discovery Framework
physics_clients
Physics, seismic, and ocean data API integrations
realtime
Real-Time Data Feed Integration
ruvector_native
RuVector-Native Discovery Engine
semantic_scholar
Semantic Scholar API Integration
space_clients
NASA and space data API integrations
streaming
Real-time Streaming Data Ingestion
transportation_clients
Transportation and Mobility API Integrations
utils
Shared utility functions for the RuVector Data Framework
visualization
ASCII Art Visualization for Discovery Framework
wiki_clients
Wikipedia and Wikidata API clients for knowledge graph building

Structs§

DataRecord
A timestamped data record from any source
DiscoveryPipeline
Main discovery pipeline orchestrator
DiscoveryStats
Statistics for a discovery session
PipelineConfig
Configuration for the entire discovery pipeline
Relationship
A relationship between two records
TemporalWindow
Temporal window for time-series analysis

Enums§

FrameworkError
Framework error types

Traits§

DataSource
Trait for data sources that can be ingested
EmbeddingProvider
Trait for computing embeddings from records
GraphBuilder
Trait for graph building from records
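The three traits above suggest a source -> embedding -> graph flow. The sketch below shows one way such a flow could fit together. Every name in it (`SketchRecord`, `SketchSource`, `SketchEmbedder`, `VecSource`, `LetterCountEmbedder`, `build_edges`) is hypothetical and chosen to avoid implying the real signatures: the actual `DataRecord`, `DataSource`, `EmbeddingProvider`, and `GraphBuilder` in this crate may have different fields, methods, and async behavior.

```rust
// Hypothetical stand-ins for illustration only; not the crate's real types.
#[derive(Debug, Clone)]
struct SketchRecord {
    id: String,
    timestamp: u64, // assumed: seconds since the Unix epoch
    text: String,
}

/// Assumed shape of a data source: hands out records in batches.
trait SketchSource {
    fn next_batch(&mut self, max: usize) -> Vec<SketchRecord>;
}

/// Assumed shape of an embedding provider: one vector per record.
trait SketchEmbedder {
    fn embed(&self, record: &SketchRecord) -> Vec<f32>;
}

/// In-memory source that drains a vector of records.
struct VecSource {
    records: Vec<SketchRecord>,
}

impl SketchSource for VecSource {
    fn next_batch(&mut self, max: usize) -> Vec<SketchRecord> {
        let n = max.min(self.records.len());
        self.records.drain(..n).collect()
    }
}

/// Toy embedder: a 26-dimensional count of ASCII letters in the text.
struct LetterCountEmbedder;

impl SketchEmbedder for LetterCountEmbedder {
    fn embed(&self, record: &SketchRecord) -> Vec<f32> {
        let mut v = vec![0.0f32; 26];
        for b in record.text.bytes() {
            if b.is_ascii_alphabetic() {
                v[(b.to_ascii_lowercase() - b'a') as usize] += 1.0;
            }
        }
        v
    }
}

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Graph-building step: drain the source in batches, embed every record, and
/// emit an edge (id_a, id_b, similarity) for each pair at or above `threshold`.
fn build_edges(
    src: &mut dyn SketchSource,
    emb: &dyn SketchEmbedder,
    batch: usize,
    threshold: f32,
) -> Vec<(String, String, f32)> {
    let mut recs = Vec::new();
    loop {
        let b = src.next_batch(batch);
        if b.is_empty() {
            break;
        }
        recs.extend(b);
    }
    let vecs: Vec<Vec<f32>> = recs.iter().map(|r| emb.embed(r)).collect();
    let mut edges = Vec::new();
    for i in 0..recs.len() {
        for j in (i + 1)..recs.len() {
            let s = cosine(&vecs[i], &vecs[j]);
            if s >= threshold {
                edges.push((recs[i].id.clone(), recs[j].id.clone(), s));
            }
        }
    }
    edges
}
```

Keeping the source and the embedder behind separate traits means either can be swapped (for example, a real API client or a model-backed embedder) without touching the edge-building logic.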

Type Aliases§

Result
Result type for framework operations