amt-phonetic 1.0.0

Articulatory Moment Transform — language-agnostic phonetic name matching
//! # AMT — Articulatory Moment Transform
//!
//! Language-agnostic phonetic name matching via spectral fingerprinting of
//! universal sonority class sequences.
//!
//! ## Quick start
//!
//! ```
//! use amt::{encode_token, matches, similarity};
//!
//! // Encode a single name
//! let code = encode_token("Khaled");
//!
//! // Test match across transliterations and scripts
//! assert!(matches("Khaled", "Khalid"));
//! assert!(matches("Khaled", "خالد"));
//! assert!(matches("Gamal", "Jamal"));
//! assert!(!matches("Khaled", "Robert"));
//!
//! // Graded similarity in [0, 1]
//! let s = similarity("Khaled Sameer", "khaled samir");
//! assert!(s > 0.9);
//! ```
//!
//! ## Indexed fuzzy search
//!
//! ```
//! use amt::{encode_token, BKTree};
//!
//! let mut tree: BKTree<String> = BKTree::new();
//! for name in ["Khaled", "Khalid", "Ahmed", "Robert"] {
//!     let code = encode_token(name);
//!     // A code can carry several spectral keys; index the name under each.
//!     for &sp in &code.spectrals {
//!         tree.add(sp, name.to_string());
//!     }
//! }
//!
//! // Probe the tree with the query's first spectral key and a search radius of 4.
//! let query = encode_token("Khaleed");
//! let hits = tree.query(query.spectrals[0], 4);
//! ```
//!
//! ## Algorithm
//!
//! Each name is mapped to a sequence of 8 sonority classes, projected onto
//! the first 4 Chebyshev polynomials, Gray-quantized, and packed into a
//! 32-bit spectral key. A parallel 64-bit Bloom signature over skip-bigrams
//! of the same sequence captures edit-tolerant co-occurrence patterns.
//! Two names match if they share any spectral key.
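//!
//! The pipeline above can be illustrated with a minimal, self-contained
//! sketch. It is not the crate's implementation: the helper names
//! (`chebyshev`, `gray`, `spectral_key`, `bloom_signature`), the 8 bits per
//! coefficient, the position scaling, and the direct pair-to-bit mapping
//! (standing in for real Bloom hashing) are all assumptions made for
//! illustration. Class values are assumed to be small integers in `0..8`.
//!
//! ```rust
//! // Chebyshev polynomial of the first kind T_k(x), via the recurrence
//! // T_{k+1}(x) = 2x * T_k(x) - T_{k-1}(x).
//! fn chebyshev(k: usize, x: f64) -> f64 {
//!     if k == 0 {
//!         return 1.0;
//!     }
//!     let (mut prev, mut cur) = (1.0, x);
//!     for _ in 1..k {
//!         let next = 2.0 * x * cur - prev;
//!         prev = cur;
//!         cur = next;
//!     }
//!     cur
//! }
//!
//! // Binary-reflected Gray code: adjacent integers differ in exactly one bit.
//! fn gray(n: u8) -> u8 {
//!     n ^ (n >> 1)
//! }
//!
//! // Hypothetical packing: four moments, 8 Gray-coded bits each (assumed split).
//! fn spectral_key(classes: &[u8]) -> u32 {
//!     let len = classes.len();
//!     let mut key = 0u32;
//!     for k in 0..4 {
//!         let mut sum = 0.0;
//!         for (i, &c) in classes.iter().enumerate() {
//!             // Spread positions evenly over [-1, 1] (assumed scaling).
//!             let x = if len > 1 { 2.0 * i as f64 / (len - 1) as f64 - 1.0 } else { 0.0 };
//!             sum += f64::from(c & 7) * chebyshev(k, x);
//!         }
//!         let moment = sum / len.max(1) as f64;
//!         // Classes lie in 0..8 and |T_k| <= 1, so the moment lies in [-7, 7].
//!         let q = ((moment + 7.0) / 14.0 * 255.0).round().clamp(0.0, 255.0) as u8;
//!         key = (key << 8) | u32::from(gray(q));
//!     }
//!     key
//! }
//!
//! // Hypothetical 64-bit signature over skip-bigrams: every ordered pair
//! // (classes[i], classes[j]) with at most `max_skip` elements between them
//! // sets one bit. With 8 classes there are 64 ordered pairs, so each pair
//! // gets its own bit here instead of being hashed.
//! fn bloom_signature(classes: &[u8], max_skip: usize) -> u64 {
//!     let mut sig = 0u64;
//!     for i in 0..classes.len() {
//!         for j in (i + 1)..classes.len().min(i + 2 + max_skip) {
//!             let bit = u32::from(classes[i] & 7) * 8 + u32::from(classes[j] & 7);
//!             sig |= 1u64 << bit;
//!         }
//!     }
//!     sig
//! }
//! ```
//!
//! Gray coding is a natural fit for the quantization step because a moment
//! that drifts across a bucket boundary changes only one bit of the key,
//! which keeps bitwise distances between near-identical names small.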
//!
//! See the whitepaper in the repository for full details, benchmarks against
//! Soundex / Metaphone / Double Metaphone / NYSIIS / Beider-Morse, and
//! theoretical justifications.

#![warn(missing_docs)]
#![warn(rust_2018_idioms)]
#![warn(missing_debug_implementations)]
#![warn(unreachable_pub)]
#![forbid(unsafe_code)]
// Enable `#[doc(cfg(...))]` annotations on docs.rs only (requires nightly).
#![cfg_attr(docsrs, feature(doc_cfg))]

mod chebyshev;
pub mod core;
pub mod indexing;
pub mod similarity;
pub mod sonority;

// Flattened re-exports — the common path. `self::` disambiguates from `::core`.
pub use self::core::{encode, encode_batch, encode_token, preprocess, Code};
pub use self::indexing::BKTree;
pub use self::similarity::{matches, similarity, token_distance};
pub use self::sonority::{class_of, Class};