Module tantivy_analysis_contrib::phonetic
Available on crate feature phonetic only.
This module provides phonetic capabilities through several algorithms.
It contains the following algorithms:
- Beider-Morse
- Caverphone 1 & 2
- Cologne
- Daitch Mokotoff Soundex
- Double Metaphone
- Match Rating Approach
- Metaphone
- Nysiis
- Refined Soundex
- Soundex
- Phonex
To get a PhoneticTokenFilter, you need to use PhoneticAlgorithm:
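use tantivy_analysis_contrib::phonetic::{
    Mapping,
    PhoneticAlgorithm,
    PhoneticTokenFilter,
    SpecialHW,
};

// Passing `None` for an optional parameter selects its default value.
let algorithm = PhoneticAlgorithm::Soundex(Mapping(None), SpecialHW(None));
// The conversion is fallible, hence `try_from` and `?`.
let token_filter = PhoneticTokenFilter::try_from(algorithm)?;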
Every parameter of a PhoneticAlgorithm variant has its own type, to make its purpose clear. Most of them wrap an Option, so passing None selects the default value.
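The typed-parameter idea described above can be sketched with standard-library types only. Everything in this block (`DemoMaxLength`, `DemoStrict`, `DemoAlgorithm`, `DemoSettings`, `resolve`, and the default values 4 and `true`) is hypothetical and chosen for illustration; it is not the crate's actual definitions or defaults.

```rust
// Hypothetical newtypes: each wraps an Option so `None` means "use the default".
// Giving every parameter its own type prevents swapping two arguments by mistake.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DemoMaxLength(Option<usize>);

#[derive(Debug, Clone, Copy, PartialEq)]
struct DemoStrict(Option<bool>);

// Hypothetical algorithm selector whose variant carries typed parameters.
enum DemoAlgorithm {
    Truncating(DemoMaxLength, DemoStrict),
}

// Fully resolved settings, after defaults have been applied.
#[derive(Debug, PartialEq)]
struct DemoSettings {
    max_length: usize,
    strict: bool,
}

// Replace every `None` with an (assumed) default value.
fn resolve(algorithm: DemoAlgorithm) -> DemoSettings {
    match algorithm {
        DemoAlgorithm::Truncating(DemoMaxLength(len), DemoStrict(strict)) => DemoSettings {
            max_length: len.unwrap_or(4),
            strict: strict.unwrap_or(true),
        },
    }
}
```

With this shape, `DemoAlgorithm::Truncating(DemoStrict(None), DemoMaxLength(None))` does not compile, which is the benefit of typing each parameter separately.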
Structs
- This boolean allows generating an alternate code, in Double Metaphone, if it differs from the primary code.
- Boolean to allow (true) or disallow (false) branching for Daitch-Mokotoff.
- If a text contains multiple words, they all get encoded if true; otherwise only the first word is encoded.
- DMRule (embedded_dm): These are the Daitch-Mokotoff rules. They will be parsed. You can find commons-codec’s rules here.
- Boolean to apply folding (true) in Daitch-Mokotoff.
- This is the mapping for each Latin letter for Soundex and Refined Soundex.
- Allows setting the maximum length in PhoneticAlgorithm.
- Allows setting the maximum length in BeiderMorse.
- This is the phonetic token filter. It generates a token according to the algorithm provided.
- Indicates, for Soundex, whether H and W should be treated as silent.
- This boolean indicates whether the Nysiis algorithm should be strict or not.
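To make the H and W option for Soundex concrete, here is a minimal standard-library sketch of classic American Soundex. It is not the crate's or rphonetic's implementation; the names `digit` and `soundex_sketch` and the `special_hw` flag are hypothetical, and the sketch assumes the option follows the usual convention that a silent H or W does not separate two consonants sharing the same digit.

```rust
/// Map an uppercase ASCII letter to its Soundex digit.
/// Vowels and H, W, Y carry no digit and yield `None`.
fn digit(c: char) -> Option<char> {
    match c {
        'B' | 'F' | 'P' | 'V' => Some('1'),
        'C' | 'G' | 'J' | 'K' | 'Q' | 'S' | 'X' | 'Z' => Some('2'),
        'D' | 'T' => Some('3'),
        'L' => Some('4'),
        'M' | 'N' => Some('5'),
        'R' => Some('6'),
        _ => None,
    }
}

/// Illustrative Soundex: first letter followed by three digits.
/// When `special_hw` is true, H and W are skipped entirely, so the
/// consonants on either side still collapse if they share a digit.
fn soundex_sketch(word: &str, special_hw: bool) -> String {
    // Keep ASCII letters only and normalise to uppercase.
    let letters: Vec<char> = word
        .chars()
        .filter(|c| c.is_ascii_alphabetic())
        .map(|c| c.to_ascii_uppercase())
        .collect();
    let first = match letters.first() {
        Some(&c) => c,
        None => return String::new(),
    };

    let mut code = String::from(first);
    let mut last = digit(first);
    for &c in &letters[1..] {
        if code.len() == 4 {
            break;
        }
        if special_hw && (c == 'H' || c == 'W') {
            continue; // silent: `last` is left untouched
        }
        let current = digit(c);
        if let Some(d) = current {
            // Adjacent letters with the same digit are encoded once.
            if current != last {
                code.push(d);
            }
        }
        last = current;
    }
    // Pad short codes with zeros.
    while code.len() < 4 {
        code.push('0');
    }
    code
}
```

Under these assumptions, `soundex_sketch("Ashcraft", true)` gives `A261`, because the H between S and C is silent and the two letters collapse into one digit, whereas `soundex_sketch("Ashcraft", false)` gives `A226`. Both "Robert" and "Rupert" encode to `R163`, which is what lets a phonetic filter match similar-sounding tokens.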
Enums
- Beider-Morse errors.
- Errors from encoder.
- This represents a set of languages.
- Supported types of names. Unless you are matching a particular kind of family name, use the generic variant, as it should work reasonably well for non-name words. The other variants are specifically tuned for family names and may not work well for general text.
- These are the different algorithms from the rphonetic crate.
- Errors
- Type of rules.