This library contains a set of phonetic algorithms from Apache commons-codec written in Rust.
It currently implements :
- Caverphone1 : see Wikipedia.
- Caverphone2 : see Wikipedia.
- Cologne : see Wikipedia.
- DaitchMokotoffSoundex : see Wikipedia
- DoubleMetaphone : see Wikipedia
- MatchRatingApproach : see Wikipedia
- Metaphone : see Wikipedia
- Nysiis : see Wikipedia
- RefinedSoundex : see Wikipedia
- Soundex : see Wikipedia
- BeiderMorse : see Wikipedia
- Phonex : see paper
Please note that most of these algorithms are designed for ASCII input, and each is usually designed for a specific use case (e.g. English names).
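Because of the ASCII caveat above, a caller may want to detect input that an encoder is unlikely to handle well. The following is a minimal sketch of such a pre-check. `ascii_letters` is a hypothetical helper written for this illustration and is not part of the library.

```rust
/// Hypothetical helper (not part of the library): keeps only ASCII letters,
/// uppercased, and reports whether any character had to be dropped.
pub fn ascii_letters(input: &str) -> (String, bool) {
    let cleaned: String = input
        .chars()
        .filter(char::is_ascii_alphabetic)
        .map(|c| c.to_ascii_uppercase())
        .collect();
    // `cleaned` is pure ASCII, so its byte length equals its char count.
    let lossy = cleaned.len() != input.chars().count();
    (cleaned, lossy)
}
```

When the second value is `true`, the input contained something other than ASCII letters, and the caller can decide whether to transliterate it, reject it, or encode the cleaned form anyway.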
§Feature flags
There are two features that provide default rules and a Default implementation for some structs. They are not enabled by default because the rule files are embedded into the code, which may increase binary size. It is best to provide your own rules.
- embedded : shorthand for embedded_bm and embedded_dm.
- embedded_bm : Beider-Morse rules. It includes only the any language and the other files that are required. All files can be found in the commons-codec repository.
- embedded_dm : Daitch-Mokotoff rules. They can also be found in the commons-codec repository.
Structs§
- This is the Beider-Morse encoder. It needs rules to work; you can get them from commons-codec. If the embedded_bm feature is enabled, the default rules are included in the binary. They contain only the any and common rules from commons-codec.
- This is a builder to construct a BeiderMorse encoder. By default, it uses the generic name type and approximate rules, and it does not concatenate multiple phonetic encodings.
- This is a Caverphone 1 encoder.
- This is a Caverphone 2 encoder.
- This struct is a wrapper around a &str that allows slicing by char.
- This is a Cologne encoder.
- This structure contains the language set, the rules and the language guessing rules. It avoids parsing files multiple times and should be thread-safe.
- This is the Daitch-Mokotoff soundex implementation.
- This is a builder for DaitchMokotoffSoundex.
- This is the Double Metaphone implementation.
- This struct represents a double metaphone result. It contains both the primary and the alternate code.
- This is the match rating approach Encoder.
- This is the Nysiis algorithm.
- This represents a parsing error. It contains the line number, the line, and if possible the filename.
- Phonex is a modification of the venerable Soundex algorithm. It accounts for a few more letter combinations to improve accuracy on some data sets. It was created by A.J. Lait and Brian Randell in 1996, described in their paper “An assessment of name matching algorithms” in the Technical Report Series published by the University of Newcastle Upon Tyne Computing Science.
- This is the refined soundex implementation of Encoder.
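One of the structs above wraps a &str so it can be sliced by char rather than by byte. The sketch below shows one way such a wrapper could work, by caching the byte offset of every character. `CharSlice` and its methods are hypothetical names chosen for this illustration; the crate's own type may be implemented differently.

```rust
/// Hypothetical wrapper (not the crate's type) that indexes a `&str` by
/// character position instead of byte offset.
pub struct CharSlice<'a> {
    inner: &'a str,
    /// Byte offset of each char, followed by the total byte length.
    offsets: Vec<usize>,
}

impl<'a> CharSlice<'a> {
    pub fn new(inner: &'a str) -> Self {
        let mut offsets: Vec<usize> = inner.char_indices().map(|(i, _)| i).collect();
        // Sentinel so that `end == len()` maps to the end of the string.
        offsets.push(inner.len());
        Self { inner, offsets }
    }

    /// Number of characters (not bytes).
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }

    /// Sub-slice covering chars `start..end`, or `None` if out of range.
    pub fn slice(&self, start: usize, end: usize) -> Option<&'a str> {
        if start > end || end > self.len() {
            return None;
        }
        let s: &'a str = self.inner;
        Some(&s[self.offsets[start]..self.offsets[end]])
    }
}
```

Precomputing the offsets costs one pass over the string but makes every later slice a constant-time lookup, which suits rule matching that inspects many short windows of the same word.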
Enums§
- Beider-Morse errors.
- This represents a set of languages.
- Supported types of names. Unless you are matching a particular family name, use the generic variant, as it should work reasonably well for non-name words. The other variants are specifically tuned for family names and may not work well for general text.
- Errors
- Type of rules.
Constants§
- A mapping from Genealogy site.
- This is the default mapping character for soundex.
Traits§
- This trait represents a phonetic algorithm.
- This trait represents a soundex algorithm (except for Nysiis).
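To illustrate the idea of a trait that represents a phonetic algorithm, here is a self-contained sketch with a simplified American Soundex behind it. `PhoneticEncoder`, `sounds_like` and `SimpleSoundex` are hypothetical names used only for this example. They are not the crate's API, and the crate's Soundex may differ in details such as how it handles non-letters.

```rust
/// Hypothetical trait for this sketch only; it is not the crate's API.
pub trait PhoneticEncoder {
    /// Encode a word into its phonetic code.
    fn encode(&self, word: &str) -> String;

    /// Two words match when their codes are identical.
    fn sounds_like(&self, a: &str, b: &str) -> bool {
        self.encode(a) == self.encode(b)
    }
}

/// Simplified American Soundex: first letter followed by three digits.
pub struct SimpleSoundex;

/// Soundex digit for an uppercase ASCII letter; `None` for vowels, H, W and Y.
fn digit(c: char) -> Option<char> {
    match c {
        'B' | 'F' | 'P' | 'V' => Some('1'),
        'C' | 'G' | 'J' | 'K' | 'Q' | 'S' | 'X' | 'Z' => Some('2'),
        'D' | 'T' => Some('3'),
        'L' => Some('4'),
        'M' | 'N' => Some('5'),
        'R' => Some('6'),
        _ => None,
    }
}

impl PhoneticEncoder for SimpleSoundex {
    fn encode(&self, word: &str) -> String {
        // Assumption of this sketch: anything that is not an ASCII letter is ignored.
        let mut letters = word
            .chars()
            .filter(|c| c.is_ascii_alphabetic())
            .map(|c| c.to_ascii_uppercase());

        let first = match letters.next() {
            Some(c) => c,
            None => return String::new(),
        };

        let mut code = String::with_capacity(4);
        code.push(first);
        let mut last = digit(first);

        for c in letters {
            if code.len() == 4 {
                break;
            }
            // H and W are skipped without resetting the previous digit,
            // so equal digits on either side of them still collapse.
            if c == 'H' || c == 'W' {
                continue;
            }
            let d = digit(c);
            if let Some(ch) = d {
                if d != last {
                    code.push(ch);
                }
            }
            // Vowels set `last` to `None`, which lets a digit repeat after them.
            last = d;
        }

        // Pad short codes with zeros up to four characters.
        while code.len() < 4 {
            code.push('0');
        }
        code
    }
}
```

With this sketch, `SimpleSoundex.encode("Robert")` and `SimpleSoundex.encode("Rupert")` both yield `R163`, so `sounds_like` reports them as a match. Putting the comparison in a default method means any other encoder only has to implement `encode`.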