This library contains a set of phonetic algorithms from Apache commons-codec, reimplemented in Rust.
It currently implements:
- Caverphone1: see Wikipedia
- Caverphone2: see Wikipedia
- Cologne: see Wikipedia
- DaitchMokotoffSoundex: see Wikipedia
- DoubleMetaphone: see Wikipedia
- MatchRatingApproach: see Wikipedia
- Metaphone: see Wikipedia
- Nysiis: see Wikipedia
- RefinedSoundex: see Wikipedia
- Soundex: see Wikipedia
- BeiderMorse: see Wikipedia
- Phonex: see paper
Please note that most of these algorithms are designed for ASCII input, and each is usually designed for a specific use case (e.g. English names).
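To make the ASCII caveat concrete, here is a minimal, self-contained sketch of the classic American Soundex rules. It does not use this library: the function names `soundex_digit` and `soundex_sketch` are hypothetical, and the behaviour shown (silently dropping every non-ASCII character) is an assumption of the sketch, not a statement about how the encoders in this crate treat such input.

```rust
/// Maps an uppercase ASCII letter to its classic Soundex digit, if it has one.
/// Vowels, 'H', 'W' and 'Y' have no digit.
fn soundex_digit(c: char) -> Option<char> {
    match c {
        'B' | 'F' | 'P' | 'V' => Some('1'),
        'C' | 'G' | 'J' | 'K' | 'Q' | 'S' | 'X' | 'Z' => Some('2'),
        'D' | 'T' => Some('3'),
        'L' => Some('4'),
        'M' | 'N' => Some('5'),
        'R' => Some('6'),
        _ => None,
    }
}

/// Hypothetical helper (not part of this library): classic four-character Soundex.
/// Assumption: anything that is not an ASCII letter is dropped before encoding.
fn soundex_sketch(input: &str) -> String {
    let mut letters = input
        .chars()
        .filter(|c| c.is_ascii_alphabetic())
        .map(|c| c.to_ascii_uppercase());

    // The first letter is kept as-is; an input without letters yields an empty code.
    let first = match letters.next() {
        Some(c) => c,
        None => return String::new(),
    };

    let mut code = String::from(first);
    // Digit of the previous significant letter, used to collapse repeats.
    let mut last = soundex_digit(first);

    for c in letters {
        if code.len() == 4 {
            break;
        }
        // 'H' and 'W' are skipped and do not separate letters with the same digit.
        if c == 'H' || c == 'W' {
            continue;
        }
        let digit = soundex_digit(c);
        if let Some(d) = digit {
            if digit != last {
                code.push(d);
            }
        }
        // A vowel resets `last` to None, so the same digit may appear again after it.
        last = digit;
    }

    // Pad short codes with zeros.
    while code.len() < 4 {
        code.push('0');
    }
    code
}

fn main() {
    assert_eq!(soundex_sketch("Robert"), soundex_sketch("Rupert"));
    assert_eq!(soundex_sketch("Ashcraft"), "A261");
    // The 'ü' is dropped by the ASCII filter, so "Müller" is encoded as "Mller".
    assert_eq!(soundex_sketch("Müller"), "M460");
    println!("{}", soundex_sketch("Robert"));
}
```

Because the non-ASCII letter is discarded rather than transliterated, names that differ only in such letters can collapse to the same code or lose information, which is why the intended input alphabet matters when choosing an algorithm.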
§Feature flags
There are two features that provide default rules and a Default implementation for some structs. They are not enabled by default because the rule files are embedded into the code, which may increase binary size. It’s best to provide your own rules.
- embedded: shorthand for embedded_bm and embedded_dm.
- embedded_bm: Beider-Morse rules. It includes only the "any" language and the other files that are required. All files can be found in the commons-codec repository.
- embedded_dm: Daitch-Mokotoff rules. They can also be found in the commons-codec repository.
Structs§
- BeiderMorse
- This is the Beider-Morse encoder. It needs rules to work; you can get them from commons-codec. If the embedded_bm feature is enabled, the default rules are included in the binary; they contain only the "any" and "common" rules from commons-codec.
- BeiderMorseBuilder
- This is a builder to construct a BeiderMorse encoder. By default, it uses the generic name type and approximate rules, and it does not concatenate multiple phonetic encodings.
- Caverphone1
- This is a Caverphone 1 encoder.
- Caverphone2
- This is a Caverphone 2 encoder.
- CharSequence
- This struct is a wrapper around an &str that allows slicing by char.
- Cologne
- This is a Cologne encoder.
- ConfigFiles
- This structure contains the language sets, rules, and language-guessing rules. It avoids parsing files multiple times and should be thread-safe.
- DaitchMokotoffSoundex
- This is the Daitch-Mokotoff Soundex implementation.
- DaitchMokotoffSoundexBuilder
- This is a builder for DaitchMokotoffSoundex.
- DoubleMetaphone
- This is the Double Metaphone implementation.
- DoubleMetaphoneResult
- This struct represents a Double Metaphone result. It contains both the primary and the alternate code.
- MatchRatingApproach
- This is the Match Rating Approach encoder.
- Metaphone
- This is the Metaphone implementation of Encoder.
- Nysiis
- This is the Nysiis algorithm.
- ParseError
- This represents a parsing error. It contains the line number, the line, and if possible the filename.
- Phonex
- Phonex is a modification of the venerable Soundex algorithm. It accounts for a few more letter combinations to improve accuracy on some data sets. It was created by A.J. Lait and Brian Randell in 1996, described in their paper “An assessment of name matching algorithms” in the Technical Report Series published by the University of Newcastle Upon Tyne Computing Science.
- RefinedSoundex
- This is the Refined Soundex implementation of Encoder.
- Soundex
- This is the Soundex implementation of Encoder.
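The CharSequence entry above describes slicing an &str by char. Rust string slices are indexed by byte, so a char-based slice has to translate char positions into byte offsets first. The sketch below shows one way to do that with the standard library only; `slice_by_char` is a hypothetical helper written for illustration and is not the CharSequence API.

```rust
/// Hypothetical helper (not this library's API): returns the substring that
/// covers chars `start..end`, or None if the range is invalid or out of bounds.
fn slice_by_char(s: &str, start: usize, end: usize) -> Option<&str> {
    if start > end {
        return None;
    }
    // Byte offset of the start of every char, followed by the total byte length,
    // so that index `n` is the byte position just before the n-th char.
    let boundaries: Vec<usize> = s
        .char_indices()
        .map(|(byte_index, _)| byte_index)
        .chain(std::iter::once(s.len()))
        .collect();

    let from = *boundaries.get(start)?;
    let to = *boundaries.get(end)?;
    // Both offsets are char boundaries, so this byte slice cannot panic.
    Some(&s[from..to])
}

fn main() {
    // 'ü' takes two bytes in UTF-8, so the byte range 1..3 would cut "ü" only,
    // while the char range 1..3 covers "ül".
    assert_eq!(slice_by_char("Müller", 1, 3), Some("ül"));
    assert_eq!(slice_by_char("Müller", 0, 7), None);
    println!("{:?}", slice_by_char("Müller", 1, 3));
}
```

This version recomputes the boundaries on every call, which keeps the sketch short; a wrapper type that is sliced repeatedly would more plausibly compute them once and store them.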
Enums§
- BMError
- Beider-Morse errors.
- LanguageSet
- This represents a set of languages.
- NameType
- Supported types of names. Unless you are matching a particular kind of family name, use the generic variant, as it should work reasonably well for non-name words. The other variants are specifically tuned for family names and may not work well for general text.
- PhoneticError
- Errors
- RuleType
- Type of rules.
Constants§
- DEFAULT_US_ENGLISH_GENEALOGY_MAPPING_SOUNDEX
- A mapping from Genealogy site.
- DEFAULT_US_ENGLISH_MAPPING_SOUNDEX
- This is the default character mapping for Soundex.
Traits§
- Encoder
- This trait represents a phonetic algorithm.
- SoundexCommons
- This trait represents a Soundex algorithm (except for Nysiis).
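The Encoder entry above describes a trait that represents a phonetic algorithm. As a rough illustration of that design, the sketch below defines a hypothetical trait, `PhoneticEncoder`, with one required method and one provided comparison method, plus a deliberately crude toy encoder. All names and signatures here are assumptions made for the example; they are not taken from this crate, and the toy encoder is not one of the algorithms listed above.

```rust
/// Hypothetical trait for illustration; not this crate's Encoder trait.
trait PhoneticEncoder {
    /// Turns a word into its phonetic code.
    fn encode(&self, value: &str) -> String;

    /// Two words "sound alike" for an encoder when their codes are equal.
    fn sounds_alike(&self, first: &str, second: &str) -> bool {
        self.encode(first) == self.encode(second)
    }
}

/// Toy encoder: keeps the first ASCII letter, then drops vowels (including 'Y')
/// and immediately repeated letters. Far cruder than any real algorithm.
struct ConsonantSkeleton;

impl PhoneticEncoder for ConsonantSkeleton {
    fn encode(&self, value: &str) -> String {
        let mut out = String::new();
        let letters = value
            .chars()
            .filter(|c| c.is_ascii_alphabetic())
            .map(|c| c.to_ascii_uppercase());
        for c in letters {
            let is_vowel = matches!(c, 'A' | 'E' | 'I' | 'O' | 'U' | 'Y');
            // The first letter is always kept, even if it is a vowel.
            if out.is_empty() || (!is_vowel && !out.ends_with(c)) {
                out.push(c);
            }
        }
        out
    }
}

fn main() {
    let encoder = ConsonantSkeleton;
    assert_eq!(encoder.encode("Smith"), encoder.encode("Smyth"));
    assert!(!encoder.sounds_alike("Smith", "Jones"));
    println!("{}", encoder.encode("Smith"));
}
```

Putting the comparison in a provided method means every encoder gets it for free, and code that only needs to compare names can be generic over the trait rather than tied to one algorithm.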