§RuVector Data Discovery Framework
Core traits and types for building dataset integrations with RuVector’s vector memory, graph structures, and dynamic minimum cut algorithms.
§Architecture
The framework provides three core abstractions:
- DataIngester: Streaming data ingestion with batched graph/vector updates
- CoherenceEngine: Real-time coherence signal computation using min-cut
- DiscoveryEngine: Pattern detection for emerging structures and anomalies
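To make the three stages concrete, here is a minimal, self-contained sketch of the same flow on a toy graph. It does not use the crate's API: `Edge`, `min_cut`, `coherence_signals`, and `detect_shifts` are hypothetical names introduced only for illustration, and the min-cut is computed by brute force rather than by the dynamic algorithms the framework refers to.

```rust
// Toy sketch only: every name here is hypothetical and does NOT mirror the crate's API.

/// An undirected weighted edge between two node indices.
#[derive(Clone, Copy, Debug)]
struct Edge {
    a: usize,
    b: usize,
    weight: f64,
}

/// Coherence stage: global minimum cut by brute force over all bipartitions.
/// Exponential in `n`, so it is only suitable for tiny illustrative graphs.
fn min_cut(n: usize, edges: &[Edge]) -> f64 {
    assert!((2..=16).contains(&n), "brute force is only meant for tiny graphs");
    let mut best = f64::INFINITY;
    // Every mask except 0 and "all nodes" describes a proper, non-empty side S.
    for mask in 1u32..((1u32 << n) - 1) {
        let in_s = |v: usize| ((mask >> v) & 1) == 1;
        let cut: f64 = edges
            .iter()
            .filter(|e| in_s(e.a) != in_s(e.b)) // edge crosses the partition
            .map(|e| e.weight)
            .sum();
        best = best.min(cut);
    }
    best
}

/// Ingestion + coherence: apply the edge stream in batches and record the
/// min-cut value of the accumulated graph after each batch.
/// `batch_size` must be non-zero.
fn coherence_signals(n: usize, stream: &[Edge], batch_size: usize) -> Vec<f64> {
    let mut graph: Vec<Edge> = Vec::new();
    let mut signals = Vec::new();
    for batch in stream.chunks(batch_size) {
        graph.extend_from_slice(batch);
        signals.push(min_cut(n, &graph));
    }
    signals
}

/// Discovery stage: indices of batches whose signal moved by more than
/// `threshold` relative to the previous batch.
fn detect_shifts(signals: &[f64], threshold: f64) -> Vec<usize> {
    signals
        .windows(2)
        .enumerate()
        .filter(|(_, w)| (w[1] - w[0]).abs() > threshold)
        .map(|(i, _)| i + 1)
        .collect()
}

fn main() {
    let stream = [
        Edge { a: 0, b: 1, weight: 1.0 },
        Edge { a: 2, b: 3, weight: 1.0 },
        Edge { a: 1, b: 2, weight: 0.5 },
        Edge { a: 0, b: 3, weight: 0.5 },
        Edge { a: 0, b: 2, weight: 2.0 },
        Edge { a: 1, b: 3, weight: 2.0 },
    ];
    let signals = coherence_signals(4, &stream, 2);
    let shifts = detect_shifts(&signals, 1.5);
    println!("signals = {:?}, shifts at batches {:?}", signals, shifts);
}
```

The brute-force cut keeps the example short and easy to check by hand. A real integration would presumably rely on the crate's `CoherenceEngine` and `DiscoveryEngine` instead.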
§Quick Start
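use ruvector_data_framework::{
    DataIngester, CoherenceEngine, DiscoveryEngine,
    IngestionConfig, CoherenceConfig, DiscoveryConfig,
};
// Configure the discovery pipeline.
// `ingestion_config`, `coherence_config`, `discovery_config`, and `source` are
// assumed to be constructed by the caller. The `.await?` calls below must run
// inside an async function that returns a `Result`.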
let ingester = DataIngester::new(ingestion_config);
let coherence = CoherenceEngine::new(coherence_config);
let discovery = DiscoveryEngine::new(discovery_config);
// Stream data and detect patterns
let stream = ingester.stream_from_source(source).await?;
let signals = coherence.compute_signals(stream).await?;
let patterns = discovery.detect_patterns(signals).await?;
Re-exports§
pub use academic_clients::CoreClient;
pub use academic_clients::EricClient;
pub use academic_clients::UnpaywallClient;
pub use api_clients::EdgarClient;
pub use api_clients::Embedder;
pub use api_clients::NoaaClient;
pub use api_clients::OpenAlexClient;
pub use api_clients::SimpleEmbedder;
pub use arxiv_client::ArxivClient;
pub use biorxiv_client::BiorxivClient;
pub use biorxiv_client::MedrxivClient;
pub use crossref_client::CrossRefClient;
pub use economic_clients::AlphaVantageClient;
pub use economic_clients::FredClient;
pub use economic_clients::WorldBankClient;
pub use finance_clients::BlsClient;
pub use finance_clients::CoinGeckoClient;
pub use finance_clients::EcbClient;
pub use finance_clients::FinnhubClient;
pub use finance_clients::TwelveDataClient;
pub use genomics_clients::EnsemblClient;
pub use genomics_clients::GwasClient;
pub use genomics_clients::NcbiClient;
pub use genomics_clients::UniProtClient;
pub use geospatial_clients::GeonamesClient;
pub use geospatial_clients::NominatimClient;
pub use geospatial_clients::OpenElevationClient;
pub use geospatial_clients::OverpassClient;
pub use government_clients::CensusClient;
pub use government_clients::DataGovClient;
pub use government_clients::EuOpenDataClient;
pub use government_clients::UkGovClient;
pub use government_clients::UNDataClient;
pub use government_clients::WorldBankClient as WorldBankGovClient;
pub use medical_clients::ClinicalTrialsClient;
pub use medical_clients::FdaClient;
pub use medical_clients::PubMedClient;
pub use ml_clients::HuggingFaceClient;
pub use ml_clients::HuggingFaceDataset;
pub use ml_clients::HuggingFaceModel;
pub use ml_clients::OllamaClient;
pub use ml_clients::OllamaModel;
pub use ml_clients::PapersWithCodeClient;
pub use ml_clients::PaperWithCodeDataset;
pub use ml_clients::PaperWithCodePaper;
pub use ml_clients::ReplicateClient;
pub use ml_clients::ReplicateModel;
pub use ml_clients::TogetherAiClient;
pub use ml_clients::TogetherModel;
pub use news_clients::GuardianClient;
pub use news_clients::HackerNewsClient;
pub use news_clients::NewsDataClient;
pub use news_clients::RedditClient;
pub use patent_clients::EpoClient;
pub use patent_clients::UsptoPatentClient;
pub use physics_clients::ArgoClient;
pub use physics_clients::CernOpenDataClient;
pub use physics_clients::GeoUtils;
pub use physics_clients::MaterialsProjectClient;
pub use physics_clients::UsgsEarthquakeClient;
pub use semantic_scholar::SemanticScholarClient;
pub use space_clients::AstronomyClient;
pub use space_clients::ExoplanetClient;
pub use space_clients::NasaClient;
pub use space_clients::SpaceXClient;
pub use transportation_clients::GtfsClient;
pub use transportation_clients::MobilityDatabaseClient;
pub use transportation_clients::OpenChargeMapClient;
pub use transportation_clients::OpenRouteServiceClient;
pub use wiki_clients::WikidataClient;
pub use wiki_clients::WikidataEntity;
pub use wiki_clients::WikipediaClient;
pub use coherence::CoherenceBoundary;
pub use coherence::CoherenceConfig;
pub use coherence::CoherenceEngine;
pub use coherence::CoherenceEvent;
pub use coherence::CoherenceSignal;
pub use cut_aware_hnsw::CutAwareHNSW;
pub use cut_aware_hnsw::CutAwareConfig;
pub use cut_aware_hnsw::CutAwareMetrics;
pub use cut_aware_hnsw::CoherenceZone;
pub use cut_aware_hnsw::SearchResult as CutAwareSearchResult;
pub use cut_aware_hnsw::EdgeUpdate as CutAwareEdgeUpdate;
pub use cut_aware_hnsw::UpdateKind;
pub use cut_aware_hnsw::LayerCutStats;
pub use discovery::DiscoveryConfig;
pub use discovery::DiscoveryEngine;
pub use discovery::DiscoveryPattern;
pub use discovery::PatternCategory;
pub use discovery::PatternStrength;
pub use dynamic_mincut::CutGatedSearch;
pub use dynamic_mincut::CutWatcherConfig;
pub use dynamic_mincut::DynamicCutWatcher;
pub use dynamic_mincut::DynamicMinCutError;
pub use dynamic_mincut::EdgeUpdate as DynamicEdgeUpdate;
pub use dynamic_mincut::EdgeUpdateType;
pub use dynamic_mincut::EulerTourTree;
pub use dynamic_mincut::HNSWGraph;
pub use dynamic_mincut::LocalCut;
pub use dynamic_mincut::LocalMinCutProcedure;
pub use dynamic_mincut::WatcherStats;
pub use export::export_all;
pub use export::export_coherence_csv;
pub use export::export_dot;
pub use export::export_graphml;
pub use export::export_patterns_csv;
pub use export::export_patterns_with_evidence_csv;
pub use export::ExportFilter;
pub use forecasting::CoherenceForecaster;
pub use forecasting::CrossDomainForecaster;
pub use forecasting::Forecast;
pub use forecasting::Trend;
pub use ingester::DataIngester;
pub use ingester::IngestionConfig;
pub use ingester::IngestionStats;
pub use ingester::SourceConfig;
pub use realtime::FeedItem;
pub use realtime::FeedSource;
pub use realtime::NewsAggregator;
pub use realtime::NewsSource;
pub use realtime::RealTimeEngine;
pub use ruvector_native::CoherenceHistoryEntry;
pub use ruvector_native::CoherenceSnapshot;
pub use ruvector_native::Domain;
pub use ruvector_native::DiscoveredPattern;
pub use ruvector_native::GraphExport;
pub use ruvector_native::NativeDiscoveryEngine;
pub use ruvector_native::NativeEngineConfig;
pub use ruvector_native::SemanticVector;
pub use streaming::StreamingConfig;
pub use streaming::StreamingEngine;
pub use streaming::StreamingEngineBuilder;
pub use streaming::StreamingMetrics;
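Several re-exports above use `as` to rename an item, for example the two `EdgeUpdate` types and the second `WorldBankClient`. The sketch below shows why that is needed when two modules export the same name. The module bodies and struct fields are hypothetical stand-ins, not the crate's actual definitions.

```rust
// Stand-in modules: the names match the listing above, but the struct
// fields are invented for illustration only.
mod cut_aware_hnsw {
    #[derive(Debug, PartialEq)]
    pub struct EdgeUpdate {
        pub from: u32,
        pub to: u32,
    }
}

mod dynamic_mincut {
    #[derive(Debug, PartialEq)]
    pub struct EdgeUpdate {
        pub u: u32,
        pub v: u32,
        pub weight: f64,
    }
}

// Re-exporting both as plain `EdgeUpdate` would be a name collision at the
// crate root, so each one gets a distinct alias.
pub use cut_aware_hnsw::EdgeUpdate as CutAwareEdgeUpdate;
pub use dynamic_mincut::EdgeUpdate as DynamicEdgeUpdate;

fn main() {
    // Callers can now name both types without spelling out the module path.
    let a = CutAwareEdgeUpdate { from: 1, to: 2 };
    let b = DynamicEdgeUpdate { u: 1, v: 2, weight: 0.5 };
    println!("{:?} / {:?}", a, b);
}
```

The alias only changes the name visible at the re-export site. The underlying type is unchanged, so `CutAwareEdgeUpdate` and `cut_aware_hnsw::EdgeUpdate` refer to the same struct.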
Modules§
- academic_clients - Academic & Research API clients for scholarly data discovery
- api_clients - Real API client integrations for OpenAlex, NOAA, and SEC EDGAR
- arxiv_client - ArXiv Preprint API Integration
- biorxiv_client - bioRxiv and medRxiv Preprint API Integration
- coherence - Coherence signal computation using dynamic minimum cut algorithms
- crossref_client - CrossRef API Integration
- cut_aware_hnsw - Cut-Aware HNSW: Dynamic Min-Cut Integration with Vector Search
- discovery - Discovery engine for detecting novel patterns from coherence signals
- dynamic_mincut - Dynamic Min-Cut Tracking for RuVector
- economic_clients - Economic data API integrations for FRED, World Bank, and Alpha Vantage
- export - Export module for RuVector Discovery Framework
- finance_clients - Finance & Economics API integrations for market data and economic indicators
- forecasting
- genomics_clients - Genomics and DNA data API integrations for NCBI, UniProt, Ensembl, and GWAS Catalog
- geospatial_clients - Geospatial & Mapping API integrations
- government_clients - Government and International Organization API Integrations
- hnsw - HNSW (Hierarchical Navigable Small World) Index
- ingester - Data ingestion pipeline for streaming data into RuVector
- mcp_server - MCP (Model Context Protocol) Server for RuVector Data Discovery
- medical_clients - Medical data API integrations for PubMed, ClinicalTrials.gov, and FDA
- ml_clients - AI/ML API Client Integrations
- news_clients - News & Social Media API client integrations
- optimized - Optimized Discovery Engine
- patent_clients - Patent database API integrations for USPTO PatentsView and EPO
- persistence - Persistence Layer for RuVector Discovery Framework
- physics_clients - Physics, seismic, and ocean data API integrations
- realtime - Real-Time Data Feed Integration
- ruvector_native - RuVector-Native Discovery Engine
- semantic_scholar - Semantic Scholar API Integration
- space_clients - NASA and space data API integrations
- streaming - Real-time Streaming Data Ingestion
- transportation_clients - Transportation and Mobility API Integrations
- utils - Shared utility functions for the RuVector Data Framework
- visualization - ASCII Art Visualization for Discovery Framework
- wiki_clients - Wikipedia and Wikidata API clients for knowledge graph building
Structs§
- DataRecord - A timestamped data record from any source
- DiscoveryPipeline - Main discovery pipeline orchestrator
- DiscoveryStats - Statistics for a discovery session
- PipelineConfig - Configuration for the entire discovery pipeline
- Relationship - A relationship between two records
- TemporalWindow - Temporal window for time-series analysis
Enums§
- FrameworkError - Framework error types
Traits§
- DataSource - Trait for data sources that can be ingested
- EmbeddingProvider - Trait for computing embeddings from records
- GraphBuilder - Trait for graph building from records
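As a rough illustration of how a source, an embedding step, and a graph-building step could fit together, the sketch below wires up three toy traits. The real trait signatures are not shown on this page, so the traits here carry a `Sketch` suffix and every method, struct, and parameter is an assumption made for the example.

```rust
// Hypothetical stand-ins: these are NOT the crate's `DataSource`,
// `EmbeddingProvider`, or `GraphBuilder` definitions.
struct Record {
    text: String,
}

trait DataSourceSketch {
    fn records(&self) -> Vec<Record>;
}

trait EmbeddingProviderSketch {
    fn embed(&self, record: &Record) -> Vec<f32>;
}

trait GraphBuilderSketch {
    /// Returns `(i, j, weight)` edges between record indices.
    fn build(&self, embeddings: &[Vec<f32>]) -> Vec<(usize, usize, f32)>;
}

/// A fixed in-memory source.
struct StaticSource(Vec<&'static str>);

impl DataSourceSketch for StaticSource {
    fn records(&self) -> Vec<Record> {
        self.0.iter().map(|t| Record { text: t.to_string() }).collect()
    }
}

/// Counts ASCII letters into 26 buckets and L2-normalises the result.
struct LetterCounts;

impl EmbeddingProviderSketch for LetterCounts {
    fn embed(&self, record: &Record) -> Vec<f32> {
        let mut v = vec![0.0f32; 26];
        for c in record.text.chars().filter(|c| c.is_ascii_alphabetic()) {
            v[(c.to_ascii_lowercase() as u8 - b'a') as usize] += 1.0;
        }
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            v.iter_mut().for_each(|x| *x /= norm);
        }
        v
    }
}

/// Links every pair whose cosine similarity reaches `threshold`.
/// Inputs are assumed to be unit-length, so the dot product is the cosine.
struct SimilarityGraph {
    threshold: f32,
}

impl GraphBuilderSketch for SimilarityGraph {
    fn build(&self, embeddings: &[Vec<f32>]) -> Vec<(usize, usize, f32)> {
        let mut edges = Vec::new();
        for i in 0..embeddings.len() {
            for j in (i + 1)..embeddings.len() {
                let sim: f32 = embeddings[i]
                    .iter()
                    .zip(&embeddings[j])
                    .map(|(a, b)| a * b)
                    .sum();
                if sim >= self.threshold {
                    edges.push((i, j, sim));
                }
            }
        }
        edges
    }
}

/// Runs source -> embeddings -> graph using trait objects.
fn build_graph(
    source: &dyn DataSourceSketch,
    embedder: &dyn EmbeddingProviderSketch,
    builder: &dyn GraphBuilderSketch,
) -> Vec<(usize, usize, f32)> {
    let embeddings: Vec<Vec<f32>> =
        source.records().iter().map(|r| embedder.embed(r)).collect();
    builder.build(&embeddings)
}

fn main() {
    let source = StaticSource(vec!["abc", "cab", "xyz"]);
    let edges = build_graph(&source, &LetterCounts, &SimilarityGraph { threshold: 0.9 });
    println!("{:?}", edges);
}
```

Splitting the stages behind traits lets each one be swapped independently, which is presumably the motivation for the crate exposing them as separate traits.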
Type Aliases§
- Result - Result type for framework operations
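A crate-level `Result` alias is commonly defined by fixing the error parameter of `std::result::Result` to the crate's error enum. The sketch below shows that pattern under assumptions: the `FrameworkError` variants, `parse_batch_size`, and `count_batches` are hypothetical, and the crate's actual alias definition is not shown on this page.

```rust
use std::fmt;

// Hypothetical variants: the real `FrameworkError` variants are not listed here.
#[derive(Debug)]
enum FrameworkError {
    Config(String),
    Ingestion(String),
}

impl fmt::Display for FrameworkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FrameworkError::Config(msg) => write!(f, "configuration error: {}", msg),
            FrameworkError::Ingestion(msg) => write!(f, "ingestion error: {}", msg),
        }
    }
}

impl std::error::Error for FrameworkError {}

/// Alias that fixes the error type, so signatures only name the success type.
type Result<T> = std::result::Result<T, FrameworkError>;

/// Parses a positive batch size from text.
fn parse_batch_size(raw: &str) -> Result<usize> {
    let n: usize = raw
        .trim()
        .parse()
        .map_err(|_| FrameworkError::Config(format!("invalid batch size: {:?}", raw)))?;
    if n == 0 {
        return Err(FrameworkError::Config("batch size must be positive".into()));
    }
    Ok(n)
}

/// Number of batches needed for `record_count` records (rounded up).
fn count_batches(record_count: usize, raw_batch_size: &str) -> Result<usize> {
    if record_count == 0 {
        return Err(FrameworkError::Ingestion("no records to ingest".into()));
    }
    // `?` propagates the error unchanged because both functions share the alias.
    let size = parse_batch_size(raw_batch_size)?;
    Ok((record_count + size - 1) / size)
}

fn main() -> Result<()> {
    println!("batches: {}", count_batches(10, "4")?);
    if let Err(e) = count_batches(10, "zero") {
        println!("{}", e);
    }
    Ok(())
}
```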