ib-matcher
A multilingual, flexible and fast string, glob and regex matcher. Supports 拼音匹配 (Chinese pinyin match) and ローマ字検索 (Japanese romaji match).
Features
- Unicode support
  - Full UTF-8 support and limited support for UTF-16 and UTF-32.
  - Unicode case insensitivity (simple case folding).
- Chinese pinyin matching (拼音匹配)
  - Support characters with multiple readings (i.e. heteronyms, 多音字).
  - Support multiple pinyin notations, including Quanpin (全拼), Jianpin (简拼) and many Shuangpin (双拼) notations.
  - Support mixing multiple notations during matching.
- Japanese romaji matching (ローマ字検索)
  - Support characters with multiple readings (i.e. heteronyms, 同形異音語).
  - Support only the Hepburn romanization system at the moment.
- glob()-style pattern matching (i.e. ?, *, [] and **)
  - Support different anchor modes, treating surrounding wildcards as anchors and special anchors in file paths.
  - Support two separators (//) or a complement separator (\) as a glob star (*/**).
- Regular expression
  - Support the same syntax as the regex crate, including wildcards, repetitions, alternations, groups, etc.
  - Support custom matching callbacks, which can be used to implement ad hoc look-around, backreferences, balancing groups/recursion/subroutines, combining domain-specific parsers, etc.
- Relatively high performance
  - Generally on par with the regex crate; depending on the case, it can be faster or slower.
All of the above features are optional: you don't pay the performance or binary size cost of features you don't use.
See documentation for details.
If you only need Chinese pinyin matching, you can also use ib-pinyin, which is simpler and more stable.
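To make the pinyin features above more concrete, here is a minimal std-only sketch of how a matcher might handle heteronyms and mixed Quanpin/Jianpin input by backtracking over candidate readings. It is not ib-matcher's implementation or API: `readings`, `match_prefix` and `is_match` are hypothetical names, and the tiny reading table is illustrative only.

```rust
/// Hypothetical toy table: each character maps to its possible readings.
/// A real matcher would use a full dictionary; these entries are illustrative.
fn readings(c: char) -> &'static [&'static str] {
    match c {
        '拼' => &["pin"],
        '音' => &["yin"],
        '搜' => &["sou"],
        '索' => &["suo"],
        '乐' => &["le", "yue"], // heteronym (多音字)
        _ => &[],
    }
}

/// Returns true if `pattern` is fully consumed by a prefix of `hay`.
/// Each haystack character may be matched literally, by a full reading
/// (Quanpin) or by the first letter of a reading (Jianpin).
fn match_prefix(pattern: &str, hay: &str) -> bool {
    if pattern.is_empty() {
        return true;
    }
    let mut chars = hay.chars();
    let c = match chars.next() {
        Some(c) => c,
        None => return false,
    };
    let rest = chars.as_str();

    // Literal match of the character itself.
    if let Some(p) = pattern.strip_prefix(c) {
        if match_prefix(p, rest) {
            return true;
        }
    }
    for &reading in readings(c) {
        // Quanpin: the whole reading.
        if let Some(p) = pattern.strip_prefix(reading) {
            if match_prefix(p, rest) {
                return true;
            }
        }
        // Jianpin: only the initial letter (readings here are ASCII).
        if let Some(p) = pattern.strip_prefix(&reading[..1]) {
            if match_prefix(p, rest) {
                return true;
            }
        }
    }
    false
}

/// Unanchored search: try every character position as a starting point.
fn is_match(pattern: &str, hay: &str) -> bool {
    hay.char_indices().any(|(i, _)| match_prefix(pattern, &hay[i..]))
}
```

With this sketch, `is_match("pysousuo", "拼音搜索")` mixes Jianpin (`p`, `y`) with Quanpin (`sou`, `suo`), and `is_match("yue", "音乐")` succeeds through the second reading of 乐. Trying every alternative per character is what lets notations be mixed freely, at the cost of backtracking.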
Usage
// cargo add ib-matcher --features pinyin,romaji
use ;
let matcher = builder.build;
assert!;
let matcher = builder.build;
assert!;
assert!;
let matcher = builder
.pinyin
.build;
assert!;
let matcher = builder
.romaji
.is_pattern_partial
.build;
assert!;
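The "simple case folding" mentioned under Features maps each character to exactly one folded character, unlike full case folding, which may expand a character into several. The following std-only sketch approximates that idea; it is not how ib-matcher implements it. The helper names `single`, `fold` and `contains_ignore_case` are hypothetical, and the upper-then-lower round trip is only an approximation of Unicode's simple case folding table.

```rust
/// Returns the only item of an iterator, or None if it yields zero or several.
fn single(mut it: impl Iterator<Item = char>) -> Option<char> {
    let first = it.next()?;
    if it.next().is_none() {
        Some(first)
    } else {
        None
    }
}

/// Approximates simple (one-to-one) case folding with std only:
/// uppercase then lowercase, ignoring any mapping that expands to
/// more than one character. Assumption: good enough for illustration.
fn fold(c: char) -> char {
    let upper = single(c.to_uppercase()).unwrap_or(c);
    single(upper.to_lowercase()).unwrap_or(upper)
}

/// Case-insensitive substring search over folded characters.
fn contains_ignore_case(hay: &str, pattern: &str) -> bool {
    let hay: Vec<char> = hay.chars().map(fold).collect();
    let pat: Vec<char> = pattern.chars().map(fold).collect();
    // `windows(0)` would panic, so an empty pattern is handled first.
    pat.is_empty() || hay.windows(pat.len()).any(|w| w == pat.as_slice())
}
```

Under this sketch, `contains_ignore_case("ΣΟΦΟΣ", "σοφος")` is true because the final sigma folds to the same character as Σ, while `contains_ignore_case("STRASSE", "straße")` is false because ß is never expanded to "ss" in a one-to-one mapping.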
glob()-style pattern matching
See the glob module for more details. Here is a quick example:
// cargo add ib-matcher --features syntax-glob,regex,romaji
use ;
let re = builder
.ib
.build_from_hir
.unwrap;
assert!;
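For readers unfamiliar with glob semantics, the sketch below shows the core idea of `?` and `*` matching with std only. It is a generic textbook algorithm, not ib-matcher's glob module: `glob_match` is a hypothetical helper, the match is anchored at both ends, and `[]`, `**`, path separators and anchor modes are deliberately left out.

```rust
/// Anchored wildcard match: `?` matches any single character and
/// `*` matches any run of characters (including none).
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Last `*` seen: (pattern index after it, text index it currently covers up to).
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Let `*` match nothing for now; remember where to retry from.
            backtrack = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((bp, bt)) = backtrack {
            // Mismatch: make the last `*` swallow one more character.
            pi = bp;
            ti = bt + 1;
            backtrack = Some((bp, bt + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

For example, `glob_match("*.rs", "main.rs")` is true and `glob_match("a?c", "ac")` is false. A non-anchored mode can be emulated in this sketch by wrapping the pattern in `*`.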
Regular expression
See the regex module for more details. Here is a quick example:
// cargo add ib-matcher --features regex,pinyin,romaji
use ;
let config = builder
.pinyin
.romaji
.build;
let re = builder
.ib
.build
.unwrap;
assert_eq!;
let re = builder
.ib
.build
.unwrap;
assert_eq!;
let config = builder
.pinyin
.romaji
.mix_lang
.build;
let re = builder
.ib
.build
.unwrap;
assert_eq!;
// cargo add ib-matcher --features regex,regex-callback
use Regex;
let re = builder
.callback
.build
.unwrap;
let hay = "that4U this4me";
assert_eq!;
Test