Crate moine_ja

Japanese kana, romaji, override, and UniDic adapters for moine.

This crate converts Japanese surface text into romaji lattices through direct kana/ASCII handling, manual override dictionaries, or UniDic-derived reading artifacts. The language-independent edit-distance algorithms remain in moine-core.

Dictionary artifacts are external input and should be treated as untrusted. At trust boundaries, prefer the try_* lookup and expansion APIs: they report indexed-payload decode errors as UnidicArtifactPayloadError. The backward-compatible convenience APIs collapse those errors into empty lookup results, so a corrupt payload looks the same as a missing entry.
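The difference between a fallible lookup and a convenience lookup can be sketched with a toy index. Everything here (ToyIndex, DecodeError, try_readings, readings, toy_index) is hypothetical and only illustrates the pattern; it is not the moine_ja API, and the real payload encoding is certainly different.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a payload decode error (not the crate's type).
#[derive(Debug, PartialEq)]
struct DecodeError {
    surface: String,
}

/// Hypothetical index: surface form -> newline-separated readings as raw bytes.
struct ToyIndex {
    entries: HashMap<String, Vec<u8>>,
}

impl ToyIndex {
    /// Fallible lookup: a corrupt entry is reported as an error,
    /// while a missing entry is an ordinary empty result.
    fn try_readings(&self, surface: &str) -> Result<Vec<String>, DecodeError> {
        match self.entries.get(surface) {
            None => Ok(Vec::new()),
            Some(bytes) => {
                let text = std::str::from_utf8(bytes).map_err(|_| DecodeError {
                    surface: surface.to_string(),
                })?;
                Ok(text
                    .split('\n')
                    .filter(|s| !s.is_empty())
                    .map(str::to_string)
                    .collect())
            }
        }
    }

    /// Convenience lookup: a decode failure becomes an empty result,
    /// so the caller cannot tell corruption from a miss.
    fn readings(&self, surface: &str) -> Vec<String> {
        self.try_readings(surface).unwrap_or_default()
    }
}

/// Builds a toy index with one valid entry and one corrupt entry.
fn toy_index() -> ToyIndex {
    let mut entries = HashMap::new();
    entries.insert("猫".to_string(), "ねこ".as_bytes().to_vec());
    entries.insert("壊".to_string(), vec![0xff, 0xfe]); // invalid UTF-8
    ToyIndex { entries }
}
```

With this toy index, try_readings("壊") returns an error while readings("壊") returns an empty list, which is exactly the ambiguity the try_* variants avoid when the artifact comes from outside the process.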

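use moine_ja::romaji_lattice;
use moine_core::{distance, Lattice};

// Build a romaji lattice from kana input.
let left = romaji_lattice("もいにゃ").unwrap();
// Build a lattice from one explicit romaji path.
let right = Lattice::from_paths(["moinya"]);

// The kana input and the romaji spelling are zero edits apart.
assert_eq!(distance(&left, &right), 0);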
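One way to picture a distance between two sets of romaji spellings is the minimum edit distance over every pair of explicit paths. The sketch below is a naive stand-in written for illustration: levenshtein and min_path_distance are hypothetical helpers, and the algorithms in moine-core operate on lattices and may differ from this brute-force approach.

```rust
/// Levenshtein distance over Unicode scalar values, using two rolling rows.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr.push(substitute.min(delete).min(insert));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Minimum distance over every pair of paths (hypothetical helper).
/// Returns None when either side has no paths.
fn min_path_distance(left: &[&str], right: &[&str]) -> Option<usize> {
    left.iter()
        .flat_map(|l| right.iter().map(move |r| levenshtein(l, r)))
        .min()
}
```

If one side offers several spellings of the same reading, any exact match with a path on the other side gives a distance of zero. Enumerating pairs grows quickly with the number of variants, which is presumably one reason to work on a compact lattice instead of explicit paths.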

Structs

DictionaryReadingExpansion
Reading-path expansion result plus pruning statistics.
DictionaryReadingOptions
Controls dictionary reading-path expansion.
DictionaryReadingPath
One complete segmentation and joined reading for an input string.
DictionaryReadingSegment
One surface segment and its selected UniDic reading.
DictionaryReadingStats
Counters describing dictionary reading-path expansion.
JapaneseDistance
Distances computed for one Japanese comparison.
OverrideDictionary
In-memory surface-to-reading override dictionary.
RomajiVariantTable
Kana-to-romaji variant table used by the Japanese adapter.
UnidicArtifactBuild
Build settings and counts recorded in UniDic artifact metadata.
UnidicArtifactLicense
License metadata for a UniDic-derived artifact.
UnidicArtifactLicenseReference
One license or notice file referenced by artifact metadata.
UnidicArtifactMetadata
Metadata stored in a UniDic dictionary bundle.
UnidicArtifactMetadataOptions
Inputs used to generate artifact metadata for an index.
UnidicArtifactPayload
Payload file metadata for a UniDic dictionary bundle.
UnidicArtifactQueryDefaults
Default reading-path query settings stored in an artifact.
UnidicArtifactSource
Source dictionary metadata for a UniDic artifact.
UnidicBinaryArtifactPayloadHeader
Header for legacy binary UniDic payloads.
UnidicIndexOptions
Options used while building a UniDic reading index.
UnidicReadingIndex
UniDic-derived surface-to-reading index.
UnidicReadingIndexPayload
Portable YAML representation of a UniDic reading index.
UnidicReadingIndexPayloadEntry
One surface entry in a UniDic reading-index payload.

Enums

JaLatticeError
Errors returned while building Japanese romaji lattices.
OverrideLoadError
Errors returned while loading an override dictionary.
UnidicArtifactPayloadError
Errors returned while reading or validating UniDic artifact payloads.
UnidicCsvError
Errors returned while reading UniDic CSV resources.
UnidicReadingField
UniDic CSV field used as the source reading.

Constants

ARTIFACT_PAYLOAD_CHECKSUM_ALGORITHM
Current canonical checksum algorithm for normalized UniDic payload content.
ARTIFACT_PAYLOAD_FILE_DIGEST_ALGORITHM
File digest algorithm used to verify payload bytes before loading.
LEGACY_ARTIFACT_PAYLOAD_CHECKSUM_ALGORITHM
Legacy canonical checksum algorithm accepted for older UniDic artifacts.

Functions

artifact_file_digest_path
Computes the SHA-256 file digest string for a UniDic artifact payload file.
artifact_file_digest_reader
Computes the SHA-256 file digest string from a reader.
compare_with_overrides
Compares two strings using direct kana/romaji handling plus overrides.
compare_with_unidic_index
Compares two strings using direct handling and a UniDic reading index.
is_kana
Returns whether ch is hiragana, katakana, or the long-vowel mark.
normalize_kana
Normalizes katakana in input to hiragana.
normalize_kana_char
Normalizes one katakana character to hiragana when possible.
normalized_similarity_with_unidic_index
Computes the best normalized similarity across UniDic-backed readings.
romaji_lattice
Builds a compact romaji lattice from kana or ASCII romaji input.
romaji_lattice_from_reading_paths
Builds a compact romaji lattice from dictionary reading paths.
romaji_paths
Expands kana or ASCII romaji input into explicit romaji paths.
romaji_paths_from_reading_paths
Expands dictionary reading paths into explicit romaji strings.
unidic_or_direct_lattice
Builds a romaji lattice from direct input, dictionary readings, or both.
unidic_or_direct_romaji_paths
Returns romaji paths from direct input, dictionary readings, or both.
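The kana helpers listed above (is_kana, normalize_kana, normalize_kana_char) can be approximated with Unicode block arithmetic. This is a minimal sketch under stated assumptions, not the crate's implementation: the function names are hypothetical, the code point ranges are my own choice, and the real functions may treat edge cases such as iteration marks or half-width katakana differently.

```rust
/// Assumed kana test: hiragana letters (U+3041..=U+3096),
/// katakana letters (U+30A1..=U+30FA), or the long-vowel mark (U+30FC).
fn is_kana_sketch(ch: char) -> bool {
    matches!(
        ch,
        '\u{3041}'..='\u{3096}' | '\u{30A1}'..='\u{30FA}' | '\u{30FC}'
    )
}

/// Maps one katakana letter to hiragana when a counterpart exists.
/// Katakana U+30A1..=U+30F6 sit exactly 0x60 above their hiragana forms.
fn to_hiragana_char(ch: char) -> char {
    match ch {
        '\u{30A1}'..='\u{30F6}' => char::from_u32(ch as u32 - 0x60).unwrap_or(ch),
        // Everything else, including the long-vowel mark, passes through.
        _ => ch,
    }
}

/// Normalizes every katakana letter in `input` to hiragana.
fn to_hiragana(input: &str) -> String {
    input.chars().map(to_hiragana_char).collect()
}
```

Normalizing to hiragana first means later stages only need one kana-to-romaji table instead of separate hiragana and katakana tables.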