polyphon 1.0.5

Implementation of the Polyphon phonetic algorithm

About

This crate is an efficient Rust implementation of the Polyphon phonetic algorithm.

Original paper: «Polyphon: An Algorithm for Phonetic String Matching in Russian Language».

Authors: Viacheslav V. Paramonov, Alexey O. Shigarov, Gennady M. Ruzhnikov, Polina V. Belykh.

The paper proposes a phonetic algorithm for string matching in Russian that needs no transliteration from Cyrillic to Latin characters. It is based on the rules of sound formation in the Russian language.

Usage

Add the dependency:

[dependencies]
polyphon = "1.0"

And then use:

use polyphon::encode;

let code = encode("литие"); // -> "лата"

Note: encode works on a single word and removes any non-Russian characters (including spaces). If you want to encode multiple words, split them first and encode each separately.
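The splitting step described in the note can be sketched as a small helper. This is only an illustration: encode_words is a hypothetical name, not part of the crate, and the encoder is passed in as a closure so the snippet compiles without the dependency. It assumes the encoder takes a &str and yields a String.

```rust
/// Hypothetical helper (not part of polyphon): splits `text` on whitespace
/// and applies `encode_word` to each word separately, so the encoder never
/// sees more than one word at a time.
fn encode_words<F>(text: &str, encode_word: F) -> Vec<String>
where
    // Assumption: the encoder maps one word to one code string.
    F: Fn(&str) -> String,
{
    text.split_whitespace() // skips runs of spaces, tabs and newlines
        .map(|word| encode_word(word))
        .collect()
}
```

With the crate added as a dependency, the closure would wrap polyphon::encode, for example |w| encode(w).to_string(), assuming its return type can be converted to a String. Punctuation attached to a word needs no special handling, since encode drops non-Russian characters itself.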

Python API

There is a Python wrapper for the public API. You can find it here.

Code structure

  • src/lib.rs — public interface;
  • src/normalize.rs — normalization (incl. normalize function);
  • src/rules.rs — other algorithm steps.