metaphone3
A pure Rust implementation of the Metaphone 3 phonetic encoding algorithm.
Metaphone 3 is a more accurate alternative to the original Soundex algorithm, designed so that similar-sounding words in American English share the same keys. This makes it useful for fuzzy matching, name search, and phonetic word comparison.
Features
- Pure Rust: No FFI; the only external dependency is smartstring, used for efficient string handling
- Primary and Secondary Encodings: Generates both primary and alternate phonetic keys for better matching
- Vowel Encoding: Optional encoding of vowel sounds for finer phonetic distinction
- Exact Mode: Optional stricter encoding that differentiates similar sounds (e.g., hard "G" vs hard "K")
- Reusable Encoder: Designed to minimize allocations when encoding multiple words
- Builder Pattern: Fluent API for configuration
Installation
Add this to your Cargo.toml:
[dependencies]
metaphone3 = "0.1"
Usage
Basic Usage
use metaphone3::Metaphone3;
With Options
use metaphone3::Metaphone3;
Reusing the Encoder
The encoder is designed to be reused across multiple encode calls to reduce memory allocations:
use metaphone3::Metaphone3;
Output:
Smith: SM0 / XMT
Smyth: SM0 / XMT
Smithe: SM0 / XMT
Smythe: SM0 / XMT
Schmidt: XMT /
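The reuse pattern can be sketched without depending on the crate itself. In the sketch below, `encode_all` and `format_line` are hypothetical helpers, and the `encode` parameter is a stand-in for a call such as `|w| encoder.encode(w)` on one long-lived encoder. The line format mirrors the output shown above.

```rust
/// Runs every word through the same `encode` callable and keeps the results.
/// `encode` is an assumption standing in for one reused encoder instance.
fn encode_all<F>(words: &[&str], mut encode: F) -> Vec<(String, String, String)>
where
    F: FnMut(&str) -> (String, String),
{
    words
        .iter()
        .map(|&word| {
            let (primary, secondary) = encode(word);
            (word.to_string(), primary, secondary)
        })
        .collect()
}

/// Formats one result as "word: primary / secondary", dropping the
/// trailing space when the secondary key is empty.
fn format_line(word: &str, primary: &str, secondary: &str) -> String {
    format!("{}: {} / {}", word, primary, secondary)
        .trim_end()
        .to_string()
}
```

Because the callable is `FnMut`, a closure that mutably borrows a single encoder can be passed in, so every word goes through the same instance rather than a freshly constructed one.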
API Reference
Metaphone3
The main encoder struct.
Methods
| Method | Description |
|---|---|
| new() -> Self | Creates a new encoder with default settings |
| with_encode_vowels(self, bool) -> Self | Enables/disables vowel encoding |
| with_encode_exact(self, bool) -> Self | Enables/disables exact encoding mode |
| encode(&mut self, &str) -> (String, String) | Encodes a word, returning (primary, secondary) keys |
Configuration Options
| Option | Default | Description |
|---|---|---|
| encode_vowels | false | When true, includes non-initial vowel sounds in the output |
| encode_exact | false | When true, produces stricter encodings that differentiate similar sounds |
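To illustrate how consuming `with_*` methods compose, here is a minimal sketch of the builder shape described above. `EncoderOptions` is a hypothetical stand-in, not the crate's actual type; only the method names, the `self -> Self` signatures, and the `false` defaults are taken from the tables.

```rust
/// Hypothetical stand-in that mirrors the two documented options.
/// Both fields default to `false`, matching the defaults listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct EncoderOptions {
    encode_vowels: bool,
    encode_exact: bool,
}

impl EncoderOptions {
    /// Starts from the default settings.
    fn new() -> Self {
        Self::default()
    }

    /// Takes `self` by value and returns it, so calls can be chained.
    fn with_encode_vowels(mut self, on: bool) -> Self {
        self.encode_vowels = on;
        self
    }

    /// Same consuming shape for the exact-encoding switch.
    fn with_encode_exact(mut self, on: bool) -> Self {
        self.encode_exact = on;
        self
    }
}
```

Under this shape, a chain such as `EncoderOptions::new().with_encode_vowels(true).with_encode_exact(true)` produces a fully configured value in one expression, and any option that is not mentioned keeps its default.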
Output
The encode() method returns a tuple of two strings:
- Primary: The main phonetic encoding (always present for non-empty input)
- Secondary: An alternate encoding when the word has ambiguous pronunciation (empty string if none)
Each encoding is at most 8 characters long.
Matching Strategy
For best results when searching for phonetic matches:
- Encode your search term and target words
- Match where either primary or secondary keys match
use metaphone3::Metaphone3;
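The comparison step can be sketched on its own, operating on key pairs that have already been produced by an encoder. `phonetic_match` is a hypothetical helper. It assumes that keys may match crosswise (one word's secondary against another word's primary) and that an empty secondary key must never count as a match.

```rust
/// Returns true when any non-empty key of `a` equals any key of `b`.
/// Each argument is a (primary, secondary) pair; an empty string means
/// "no secondary key" and is skipped.
fn phonetic_match(a: (&str, &str), b: (&str, &str)) -> bool {
    let a_keys = [a.0, a.1];
    let b_keys = [b.0, b.1];
    a_keys
        .iter()
        .filter(|key| !key.is_empty())
        .any(|key| b_keys.contains(key))
}
```

With the keys listed in this document, `phonetic_match(("SM0", "XMT"), ("XMT", ""))` links Smith to Schmidt through Smith's secondary key, while two words that merely share an empty secondary are not treated as matching.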
Examples
| Word | Primary | Secondary |
|---|---|---|
| Smith | SM0 | XMT |
| phonetics | FNTKS | |
| Xavier | SFR | |
| edge | AJ | |
| gnome | NM | |
| Thompson | TMPSN | |
| Aachen | AKN | AXN |
| Wroclaw | RKL | |
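Rows like the ones above can seed a small lookup index for name search. The sketch below is an illustration only: `build_index` is a hypothetical helper, the rows are passed in as precomputed `(word, primary, secondary)` triples rather than encoded on the fly, and filing each word under both of its keys is an assumption about how the keys are meant to be used.

```rust
use std::collections::HashMap;

/// Maps every non-empty phonetic key to the words that produced it.
/// Each row is (word, primary, secondary); an empty secondary is skipped,
/// and a secondary equal to the primary is not indexed twice.
fn build_index<'a>(rows: &[(&'a str, &'a str, &'a str)]) -> HashMap<&'a str, Vec<&'a str>> {
    let mut index: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(word, primary, secondary) in rows {
        index.entry(primary).or_default().push(word);
        if !secondary.is_empty() && secondary != primary {
            index.entry(secondary).or_default().push(word);
        }
    }
    index
}
```

A search would then encode the query once and look up its primary key and, if present, its secondary key in the map, taking the union of the word lists found.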
Algorithm Background
Metaphone 3 was developed by Lawrence Philips as an improvement over the original Metaphone and Double Metaphone algorithms. It provides:
- More accurate encoding of English words and names
- Better handling of non-English origin names common in America
- Support for both primary and alternate pronunciations
- Improved consonant and vowel sound mappings
For more information about the Metaphone algorithm family, see the Wikipedia article.
Thread Safety
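The Metaphone3 encoder is not thread-safe: encode() takes &mut self, so a single instance cannot encode from several threads at once without external synchronization. Each thread should use its own encoder instance. The encoder is designed to be cheap to construct, so creating one per thread is recommended.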
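The one-encoder-per-thread recommendation can be sketched with scoped threads from the standard library. `encode_chunks` and its `make_encoder` parameter are hypothetical: `make_encoder` stands in for constructing a fresh encoder, and the value it returns stands in for that encoder's `encode` method.

```rust
use std::thread;

/// Encodes each chunk of words on its own thread. Every thread calls
/// `make_encoder` to build a private encoder, so no encoder is shared.
fn encode_chunks<M, E>(chunks: &[Vec<&str>], make_encoder: M) -> Vec<Vec<(String, String)>>
where
    M: Fn() -> E + Sync,
    E: FnMut(&str) -> (String, String),
{
    let make_encoder = &make_encoder;
    thread::scope(|scope| {
        // Spawn one worker per chunk; each worker owns its encoder.
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                scope.spawn(move || {
                    let mut encode = make_encoder();
                    chunk.iter().map(|&word| encode(word)).collect::<Vec<_>>()
                })
            })
            .collect();

        // Join in spawn order so results line up with the input chunks.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<_>>()
    })
}
```

Only the factory needs to be shareable across threads (`Sync`); the encoder itself is created and dropped inside its worker, so it never crosses a thread boundary.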
References
- Original Metaphone 3 implementation: OpenRefine Metaphone3.java
- Go implementation this port is based on: dlclark/metaphone3
License
MIT
Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.