Crate ib_matcher

A fast, multilingual string matcher that supports 拼音匹配 (Chinese pinyin match) and ローマ字検索 (Japanese romaji match).

§Usage

//! cargo add ib-matcher --features pinyin,romaji
use ib_matcher::{
    matcher::{IbMatcher, PinyinMatchConfig, RomajiMatchConfig},
    pinyin::PinyinNotation,
};

let matcher = IbMatcher::builder("pysousuoeve")
    .pinyin(PinyinMatchConfig::notations(
        PinyinNotation::Ascii | PinyinNotation::AsciiFirstLetter,
    ))
    .build();
assert!(matcher.is_match("拼音搜索Everything"));

let matcher = IbMatcher::builder("konosuba")
    .romaji(RomajiMatchConfig::default())
    .is_pattern_partial(true)
    .build();
assert!(matcher.is_match("この素晴らしい世界に祝福を"));

§Features

  • pinyin — Chinese pinyin match support.

  • romaji — Japanese romaji match support.

    The dictionary currently adds ~5.5 MiB to the binary, much more than the pinyin dictionary does.

  • perf (enabled by default) — Enables all performance-related features. This feature is enabled by default and is intended to cover all reasonable features that improve performance, even if more are added in the future.

  • perf-unicode-case-map (enabled by default) — -37% match time, +38 KiB

  • regex — Not used at the moment.

    Build size +837.5 KiB

  • inmut-data — Makes pinyin::PinyinData interior mutable, so that it can easily be used as a static variable.

  • minimal — Minimal APIs that can be used in one call. See minimal for details.

  • encoding — Support for non-UTF-8 encodings. Only UTF-16 and UTF-32 are supported at the moment.

    Non-UTF-8 Japanese romaji match is not yet supported.
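
As a rough sketch of how the features above might be selected in a Cargo manifest: the fragment below enables pinyin and romaji, mirroring the cargo add command shown under Usage, and uses Cargo's standard default-features switch to opt out of the default perf features. The version requirement "*" is a placeholder assumption, not a recommended version, and the alternative line only illustrates the mechanism; check the size and speed trade-offs listed above before relying on it.

```toml
[dependencies]
# "*" is a placeholder (assumption); pin the version you actually use.
ib-matcher = { version = "*", features = ["pinyin", "romaji"] }

# Alternative (assumption: a smaller build matters more than match speed):
# drop the default perf features and enable only pinyin.
# ib-matcher = { version = "*", default-features = false, features = ["pinyin"] }
```

Only one of the two ib-matcher entries can be active at a time, since a TOML table cannot contain the same key twice.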

Re-exports§

pub use ib_romaji as romaji; (feature: romaji)

Modules§

matcher
minimal (feature: minimal)
Minimal APIs
pinyin (feature: pinyin)
Pinyin
unicode