gukhanmun-unihan
A code generator that downloads the Unicode Unihan database and produces
unihan_readings.rs, the source file that gukhanmun-core compiles in for
fallback hanja phonetization.
This is a development-time tool, not a library. Normal users of Gukhanmun do not need to run it: the generated file is committed to the repository and updated only when the Unicode version changes or the extraction logic is revised.
What it generates
The tool reads the kHangul field from Unihan_Readings.txt inside
Unihan.zip and emits a sorted static array mapping Unicode scalar values
to their Korean readings. gukhanmun-core compiles this array into its
fallback phonetizer so that characters not found in any loaded dictionary
still receive a plausible reading.
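The extraction described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names and the emitted array name `UNIHAN_READINGS` are hypothetical, the tuple layout of the array is an assumption, and the sketch assumes each kHangul value has the form `reading:flags` and keeps only the first reading per character.

```rust
/// Parse one line of Unihan_Readings.txt and return (code point, reading)
/// if it is a kHangul entry. Assumes tab-separated fields of the form
/// `U+XXXX<TAB>kHangul<TAB>reading:flags [reading:flags ...]`.
fn parse_khangul_line(line: &str) -> Option<(u32, String)> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None; // skip blank lines and comments
    }
    let mut fields = line.splitn(3, '\t');
    let hex = fields.next()?.strip_prefix("U+")?;
    if fields.next()? != "kHangul" {
        return None; // some other Unihan field
    }
    // Keep only the first reading and drop the `:flags` suffix (assumption).
    let first = fields.next()?.split_whitespace().next()?;
    let reading = first.split(':').next()?;
    if reading.is_empty() {
        return None;
    }
    let cp = u32::from_str_radix(hex, 16).ok()?;
    Some((cp, reading.to_string()))
}

/// Collect all kHangul entries, sorted by code point so that the
/// consumer can use binary search.
fn build_table(text: &str) -> Vec<(u32, String)> {
    let mut table: Vec<(u32, String)> =
        text.lines().filter_map(parse_khangul_line).collect();
    table.sort_by_key(|(cp, _)| *cp);
    table.dedup_by_key(|(cp, _)| *cp);
    table
}

/// Render the table as Rust source. The array name is hypothetical.
fn emit_source(table: &[(u32, String)]) -> String {
    let mut out = String::from("pub static UNIHAN_READINGS: &[(u32, &str)] = &[\n");
    for (cp, reading) in table {
        out.push_str(&format!("    (0x{:04X}, \"{}\"),\n", cp, reading));
    }
    out.push_str("];\n");
    out
}

/// How a consumer might look up a fallback reading in the sorted table.
fn lookup(table: &[(u32, String)], ch: char) -> Option<&str> {
    table
        .binary_search_by_key(&(ch as u32), |(cp, _)| *cp)
        .ok()
        .map(|i| table[i].1.as_str())
}
```

Sorting at generation time is what lets the lookup side use a plain binary search over a static slice, with no hashing or allocation at runtime.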
The Unicode version and the expected SHA-256 of Unihan.zip are pinned as constants in the source. A checksum mismatch causes the tool to abort, so accidental use of a different Unicode release is caught immediately.
Running
The downloaded Unihan.zip is cached next to the output path, so later runs reuse it instead of downloading it again.
License
GPL-3.0-only. See LICENSE at the repository root.