Matcher
A high-performance, multi-functional word matcher implemented in Rust.
Features
- Supports multiple matching methods:
  - Simple word matching
  - Regex-based matching
  - Similarity-based matching
- Text normalization options:
  - Fanjian (Convert traditional Chinese characters to simplified ones)
  - Delete (Remove whitespace, punctuation, and other non-alphanumeric characters)
  - Normalize (Normalize special characters to identifiable characters)
  - PinYin (Convert Chinese characters to Pinyin for fuzzy matching)
  - PinYinChar (Convert Chinese characters to Pinyin)
- Combination and repeated word matching:
  - Handles combinations and repetitions of words with specified constraints.
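To make two of the listed features concrete, here is a minimal, standard-library-only sketch of Delete-style normalization and similarity-based matching. It is not matcher_rs's implementation: the function names `delete_normalize`, `levenshtein`, and `similarity` are hypothetical, and the use of Levenshtein edit distance as the similarity measure is an assumption, since this README does not say which measure the library uses.

```rust
/// Hypothetical helper: drop every character that is not alphanumeric,
/// which removes whitespace, punctuation, and other symbols.
fn delete_normalize(text: &str) -> String {
    text.chars().filter(|c| c.is_alphanumeric()).collect()
}

/// Levenshtein edit distance, counted in Unicode scalar values.
/// Assumption: the README does not name the library's similarity measure.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a`
    // and the first j characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let insertion = curr[j] + 1;
            let deletion = prev[j + 1] + 1;
            curr.push(substitution.min(insertion).min(deletion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Similarity in [0.0, 1.0]; 1.0 means the two strings are identical.
fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / longest as f64
}
```

With these definitions, `delete_normalize("h e-l.l,o!")` yields `"hello"`, and `similarity("hello", "hallo")` is 0.8 (one substitution over five characters). A similarity matcher built this way would accept a candidate when its score meets a chosen threshold.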
Usage
Adding to Your Project
To use matcher_rs in your Rust project, add the following to your Cargo.toml file:
[dependencies]
matcher_rs = "*"
Basic Example
Here’s a basic example of how to use the Matcher struct for text matching:
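use std::collections::HashMap;
use matcher_rs::{Matcher, MatchTableMap};

// Build the table map from an iterator of entries (the entries are omitted here).
let match_table_map: MatchTableMap = HashMap::from_iter(/* ... */);
// Construct the matcher from the table map.
let matcher = Matcher::new(&match_table_map);
let text = "This is an example text.";
// Match the text against every configured table.
let results = matcher.word_match(text);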
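A second example matches Chinese text against a simple word map (the SimpleMatcher type name is inferred from the variable names):
use std::collections::HashMap;
use matcher_rs::SimpleMatcher;

// Maps each simple match type to its own word map.
let mut simple_match_type_word_map = HashMap::default();
let mut simple_word_map = HashMap::default();

// The inserted keys and values are omitted here.
simple_word_map.insert(/* ... */);
simple_word_map.insert(/* ... */);
simple_match_type_word_map.insert(/* ... */);

let matcher = SimpleMatcher::new(&simple_match_type_word_map);
let text = "你好,世界!";
let results = matcher.process(text);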
For more detailed usage examples, please refer to the test.rs file.
Benchmarks
The matcher_rs library includes benchmarks to measure the performance of the matcher. You can find the benchmarks in the bench.rs file. To run the benchmarks, use the following command:
cargo bench
Contributing
Contributions to matcher_rs are welcome! If you find a bug or have a feature request, please open an issue on the GitHub repository. If you would like to contribute code, please fork the repository and submit a pull request.
License
matcher_rs is licensed under either the MIT license or the Apache-2.0 license (MIT OR Apache-2.0).
More Information
For more details, visit the GitHub repository.