ref_solver/matching/mod.rs

//! Reference genome matching engine and scoring algorithms.
//!
//! This module provides the core matching functionality:
//!
//! - [`engine::MatchingEngine`]: Main entry point for finding reference matches
//! - [`scoring::MatchScore`]: Detailed similarity scores between a query and reference
//! - [`diagnosis::MatchDiagnosis`]: Detailed analysis of differences and suggestions
//!
//! ## Matching Algorithm
//!
//! The matching process uses multiple strategies:
//!
//! 1. **Signature matching**: Exact match via sorted MD5 hash signature
//! 2. **MD5-based scoring**: Jaccard similarity of MD5 checksum sets
//! 3. **Name+length fallback**: When MD5s are missing, uses contig names and lengths
//! 4. **Order analysis**: Detects if contigs are reordered vs. reference
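//!
//! The first two strategies can be sketched as follows. This is a minimal
//! illustration, not the code in [`engine`] or [`scoring`]: the helper names
//! `signature` and `jaccard` are hypothetical, and returning `0.0` for two
//! empty sets is an assumed convention.
//!
//! ```rust
//! use std::collections::HashSet;
//! use std::hash::Hash;
//!
//! /// Hypothetical helper: sorts the MD5s so that contig order does not
//! /// affect the signature, then joins them into one comparable string.
//! fn signature(md5s: &[&str]) -> String {
//!     let mut sorted: Vec<&str> = md5s.to_vec();
//!     sorted.sort_unstable();
//!     sorted.join(",")
//! }
//!
//! /// Jaccard similarity |A ∩ B| / |A ∪ B|.
//! /// Assumed convention: two empty sets score 0.0 rather than 1.0.
//! fn jaccard<T: Eq + Hash>(a: &HashSet<T>, b: &HashSet<T>) -> f64 {
//!     let union = a.union(b).count();
//!     if union == 0 {
//!         return 0.0;
//!     }
//!     a.intersection(b).count() as f64 / union as f64
//! }
//! ```
//!
//! Equal signatures imply identical MD5 sets, so the Jaccard score only needs
//! to be computed when the signatures differ.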
//!
//! ## Scoring
//!
//! The composite score combines multiple factors:
//!
//! - **MD5 Jaccard**: Set similarity of sequence checksums
//! - **Name+Length Jaccard**: Set similarity of (name, length) pairs
//! - **Query coverage**: Fraction of query contigs matched
//! - **Order score**: Fraction of contigs in correct relative order
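//!
//! One way to read the order score and the composite is sketched below. Both
//! are assumptions made for illustration: the order score is computed here as
//! a longest-increasing-subsequence ratio, and the composite as a weighted
//! average whose weights are supplied by the caller. The actual formulas and
//! weights are defined in [`scoring`] and may differ.
//!
//! ```rust
//! /// Length of the longest strictly increasing subsequence, O(n log n).
//! fn lis_len(positions: &[usize]) -> usize {
//!     let mut tails: Vec<usize> = Vec::new();
//!     for &p in positions {
//!         match tails.binary_search(&p) {
//!             Ok(_) => {} // duplicate value: cannot extend a strict run
//!             Err(i) if i == tails.len() => tails.push(p),
//!             Err(i) => tails[i] = p,
//!         }
//!     }
//!     tails.len()
//! }
//!
//! /// Assumed order score: fraction of matched contigs that keep the
//! /// reference's relative order. `ref_positions[i]` is the reference index
//! /// of the i-th matched query contig. Returns 0.0 when nothing matched.
//! fn order_score(ref_positions: &[usize]) -> f64 {
//!     if ref_positions.is_empty() {
//!         return 0.0;
//!     }
//!     lis_len(ref_positions) as f64 / ref_positions.len() as f64
//! }
//!
//! /// Hypothetical composite: weighted average of the four factors, in the
//! /// order listed above. The weights are caller-supplied placeholders.
//! fn composite(factors: [f64; 4], weights: [f64; 4]) -> f64 {
//!     let total: f64 = weights.iter().sum();
//!     if total == 0.0 {
//!         return 0.0;
//!     }
//!     factors.iter().zip(weights.iter()).map(|(f, w)| f * w).sum::<f64>() / total
//! }
//! ```
//!
//! With this reading, a query whose first two contigs are swapped relative to
//! the reference (`[1, 0, 2, 3]`) keeps three of four contigs in order.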
//!
//! ## Example
//!
//! ```rust,no_run
//! use ref_solver::{ReferenceCatalog, MatchingEngine, MatchingConfig, QueryHeader};
//! use ref_solver::parsing::sam::parse_header_text;
//!
//! let catalog = ReferenceCatalog::load_embedded().unwrap();
//! let query = parse_header_text("@SQ\tSN:chr1\tLN:248956422\n").unwrap();
//!
//! let engine = MatchingEngine::new(&catalog, MatchingConfig::default());
//! let matches = engine.find_matches(&query, 5);
//!
//! for m in &matches {
//!     println!("{}: {:?} ({:.1}%)",
//!         m.reference.display_name,
//!         m.diagnosis.match_type,
//!         m.score.composite * 100.0
//!     );
//! }
//! ```

pub mod diagnosis;
pub mod engine;
pub mod hierarchical_engine;
pub mod scoring;

pub use diagnosis::Suggestion;