Crate gukhanmun_core

Core types and algorithms for Gukhanmun.

This crate contains the format-neutral intermediate representation, the conversion engine, dictionary traits, lattice segmentation, and fallback hanja reading logic. Format adapters, command-line I/O, and language bindings live in separate crates.

Structs

Annotation: Metadata for a dictionary-backed hanja conversion.
ChainDictionary: A dictionary composition that preserves caller-supplied priority order.
DictionaryRecord: A complete dictionary entry exposed for batch policy analysis.
Engine: Stateful hanja conversion engine for chunked token streams.
EngineOptions: Engine-level options that affect hanja conversion before rendering.
FirstOccurrenceFilter: Streaming first-occurrence middleware.
HomophoneMarker: Streaming homophone marker middleware.
MapDictionary: A small in-memory dictionary backed by an ordered map.
Match: A dictionary match that starts at the queried cursor position.
MatchMark: Dictionary-provided rendering constraints for a match.
PlainScopeData: Scope data used by the plain-text adapter.
RecoverableInputError: A recoverable reader error plus the original source region.
RedundantParenCollapser: Streaming middleware that collapses an explicit parenthetical reading annotation into the converted hanja word it duplicates.
RenderOptions: Rendering options that combine a RenderMode with per-mode sub-options.
Renderer: Stateful renderer for chunked OutputToken streams.
Scope: A structural scope in the format-neutral token stream.
UnihanCharDict: Per-character Unihan fallback readings exposed as a dictionary.
UserDirectives: User rules that adjust annotation presentation policy.
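
The ChainDictionary entry above describes a composition that preserves caller-supplied priority order. The sketch below illustrates that idea only: it does not use the crate's types, and `ToyDict`, `lookup_in_chain`, and `sample_chain` are hypothetical names invented for this example. It assumes that "priority order" means the first dictionary containing a word wins, which this listing does not confirm.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a dictionary: hanja word -> hangul reading.
/// This is NOT the crate's HanjaDictionary trait or MapDictionary type.
type ToyDict = BTreeMap<&'static str, &'static str>;

/// Returns the reading from the first dictionary that contains `word`.
/// Earlier dictionaries in the slice take priority over later ones
/// (an assumption about what "priority order" means).
fn lookup_in_chain(chain: &[ToyDict], word: &str) -> Option<&'static str> {
    chain.iter().find_map(|dict| dict.get(word).copied())
}

/// Builds a two-level chain: a user dictionary ahead of a base dictionary.
fn sample_chain() -> Vec<ToyDict> {
    let user = ToyDict::from([("樂", "요")]);
    let base = ToyDict::from([("樂", "락"), ("漢字", "한자")]);
    vec![user, base]
}
```

With this chain, a lookup for a word present in both dictionaries resolves to the user entry, while a word present only in the base dictionary falls through to it.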

Enums

ContextWindow: The context boundary used by stateful annotation middlewares.
DirectiveAction: Action applied when a user directive predicate matches an annotation.
Error: Error returned by fallible core pipeline entry points.
HomophoneDetection: How homophone disambiguation decides that an annotation needs its hanja shown in RenderMode::HangulOnly.
InputToken: A token emitted by a reader before hanja conversion has run.
NumeralStrategy: Strategy for rendering hanja numerals.
OriginalGloss: Form for the gloss attached to annotations in RenderMode::Original.
OutputToken: A token emitted by the engine after hanja conversion.
Recovery: Stream-level error recovery policy.
RenderMode: The concrete rendering mode for annotated hanja words.
RenderedToken: A token emitted by a renderer after all annotations have been expanded.
RubyBase: Selects which side of a <ruby> element is the base text.
SegmentationStrategy: Strategy used to segment hanja-containing spans.

Traits

HanjaDictionary: A hanja dictionary queried by the conversion engine.
ScopeData: Adapter-owned data attached to an intermediate-representation scope.

Functions

apply_user_directives: Applies literal user directives to annotation policy flags.
apply_user_directives_iter: Lazily applies literal user directives to an output token stream.
collapse_redundant_parens: Buffered counterpart to RedundantParenCollapser for non-streaming callers, mirroring mark_homophones_with_detection and filter_first_occurrences.
convert_plain_text: Converts plain text through reader, engine, renderer, and writer stages.
convert_plain_text_with_options: Converts plain text with explicit hanja conversion engine options.
filter_first_occurrences: Clears repeat gloss requirements after the first occurrence of each hanja.
is_hanja: Returns whether ch is in a known CJK ideograph range.
mark_homophones: Sets homophone on dictionary annotations sharing a reading.
mark_homophones_with_detection: Sets homophone on dictionary annotations sharing a reading, choosing the detection strategy explicitly.
process_fallible_tokens: Processes fallible input tokens with default engine options.
process_fallible_tokens_with_options: Processes fallible input tokens with explicit engine options.
process_tokens: Processes input tokens with the default hanja conversion engine options.
process_tokens_iter: Processes input tokens through the default engine options and returns an iterator over the collected output.
process_tokens_iter_with_options: Processes input tokens through explicit engine options and returns an iterator over the collected output.
process_tokens_with_options: Processes input tokens with explicit hanja conversion engine options.
read_plain_text: Reads a plain-text string into the core input-token stream.
recover_input_token: Resolves one fallible reader item according to a Recovery policy.
recover_input_tokens: Resolves a fallible reader token stream into recovered input tokens.
render_tokens: Renders engine output tokens into annotation-free tokens.
render_tokens_iter: Renders engine output tokens into annotation-free tokens as an iterator.
write_plain_text: Writes rendered plain-text tokens back to a string.
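
As a rough illustration of the kind of check is_hanja describes (whether a character falls in a known CJK ideograph range), here is a minimal standalone sketch. The function name `is_hanja_sketch` is hypothetical, and the set of Unicode blocks it tests is an assumption; the crate's actual range table is not given in this listing and may cover more or fewer blocks.

```rust
/// Hypothetical stand-in for a hanja check. The ranges below are an
/// assumed subset of Unicode CJK ideograph blocks, not the crate's table.
fn is_hanja_sketch(ch: char) -> bool {
    matches!(
        ch as u32,
        0x3400..=0x4DBF         // CJK Unified Ideographs Extension A
        | 0x4E00..=0x9FFF       // CJK Unified Ideographs
        | 0xF900..=0xFAFF       // CJK Compatibility Ideographs
        | 0x20000..=0x2A6DF     // CJK Unified Ideographs Extension B
    )
}
```

Under these assumed ranges, an ideograph such as '漢' is accepted, while a hangul syllable such as '한' or a Latin letter is rejected.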
