//! Translation source/target language detection.
//!
//! Neither routine matches a hardcoded natural-language phrase. Each asks the
//! language-independent meaning lexicon (`data/seed/meanings-translation.lino`)
//! "which surface forms evidence a translation *source* / *target* marker?" and
//! resolves the marker's language by walking its `defined_by` edges down to one
//! of the four `language_*` meanings. Adding a spelling variant, a synonym, or a
//! whole new supported language is therefore a pure data edit: drop a
//! `word`/`description` into the relevant marker meaning and this code reasons
//! about it automatically.
//!
//! Surfaces are matched as raw substrings ([`str::contains`]), exactly as the
//! previous hardcoded disjunction did, so detection stays byte-faithful: a
//! Chinese marker like `从中文` has no inter-word spaces, and a Cyrillic marker
//! like `с английского` must match inside a longer sentence. Marker meanings are
//! walked in declaration order (English → Russian → Hindi → Chinese), which
//! preserves the original first-match priority.
use crate;
/// Detect the language a translation reads *from*, or `None`.
///
/// Walks every meaning carrying [`ROLE_TRANSLATION_SOURCE_MARKER`] and returns
/// the language of the first whose surface appears in `normalized`.
/// Detect the language a translation renders *into*, or `None`.
///
/// Walks every meaning carrying [`ROLE_TRANSLATION_TARGET_MARKER`] and returns
/// the language of the first whose surface appears in `normalized`.
/// The shared recogniser: the first marker meaning of `role` (in declaration
/// order) with any surface word that is a substring of `normalized` reports its
/// language, read off the `language_*` meaning it is `defined_by`.
/// The ISO 639-1 code of the `language_*` meaning a marker is `defined_by`.
/// Map a `language_*` meaning slug to its fixed ISO 639-1 code.
///
/// The code is the one identifier that stays in the handler: it is the key the
/// [`crate::translation::TranslationPipeline`] and the Wiktionary client are
/// addressed by, not a surface word. The surface *names* of each language live
/// in the seed; only this slug → code bridge is code.
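
// A minimal, self-contained sketch of how the routines documented above could
// fit together. It is an illustration under stated assumptions, not the
// crate's actual implementation: `Role`, `Meaning`, `sample_lexicon`,
// `marker_language`, `detect_language`, the `language_*` slug spellings, and
// every marker except `从中文` and `с английского` are hypothetical stand-ins
// for the seed lexicon and its accessors. Only the ISO 639-1 codes themselves
// (`en`, `ru`, `hi`, `zh`) are fixed facts.

```rust
// Hypothetical role tags standing in for the ROLE_TRANSLATION_*_MARKER constants.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Role {
    SourceMarker,
    TargetMarker,
}

// Hypothetical in-memory shape of one lexicon meaning.
struct Meaning {
    slug: &'static str,
    role: Option<Role>,
    words: &'static [&'static str],
    defined_by: &'static [&'static str],
}

// Slug -> ISO 639-1 bridge. The slug spellings are assumed; the codes are standard.
fn language_code_for_slug(slug: &str) -> Option<&'static str> {
    match slug {
        "language_english" => Some("en"),
        "language_russian" => Some("ru"),
        "language_hindi" => Some("hi"),
        "language_chinese" => Some("zh"),
        _ => None,
    }
}

// Walk `defined_by` edges depth-first until a `language_*` slug is reached.
fn marker_language(lexicon: &[Meaning], marker: &Meaning) -> Option<&'static str> {
    let mut stack: Vec<&str> = marker.defined_by.iter().rev().copied().collect();
    let mut seen: Vec<&str> = Vec::new();
    while let Some(slug) = stack.pop() {
        if seen.contains(&slug) {
            continue; // guard against cycles in the data
        }
        seen.push(slug);
        if let Some(code) = language_code_for_slug(slug) {
            return Some(code);
        }
        if let Some(next) = lexicon.iter().find(|m| m.slug == slug) {
            stack.extend(next.defined_by.iter().rev().copied());
        }
    }
    None
}

// First marker of `role`, in declaration order, with a surface word that is a
// raw substring of `normalized`; its language is then resolved via `defined_by`.
fn detect_language(lexicon: &[Meaning], role: Role, normalized: &str) -> Option<&'static str> {
    lexicon
        .iter()
        .filter(|m| m.role == Some(role))
        .find(|m| m.words.iter().any(|w| normalized.contains(*w)))
        .and_then(|m| marker_language(lexicon, m))
}

// Illustrative data only; the real seed file is not reproduced here.
fn sample_lexicon() -> Vec<Meaning> {
    vec![
        Meaning {
            slug: "from_english",
            role: Some(Role::SourceMarker),
            words: &["from english", "с английского"],
            defined_by: &["language_english"],
        },
        Meaning {
            slug: "from_chinese",
            role: Some(Role::SourceMarker),
            words: &["从中文"],
            defined_by: &["chinese_source_phrase"], // resolved through one hop
        },
        Meaning {
            slug: "chinese_source_phrase",
            role: None,
            words: &[],
            defined_by: &["language_chinese"],
        },
        Meaning {
            slug: "into_russian",
            role: Some(Role::TargetMarker),
            words: &["into russian", "на русский"],
            defined_by: &["language_russian"],
        },
    ]
}

fn main() {
    let lexicon = sample_lexicon();
    let text = "переведи с английского на русский";
    assert_eq!(detect_language(&lexicon, Role::SourceMarker, text), Some("en"));
    assert_eq!(detect_language(&lexicon, Role::TargetMarker, text), Some("ru"));
}
```

// In this sketch, supporting a new spelling means appending a string to a
// marker's `words`; the matching and resolution code stays untouched, which is
// the data-only extension property the module documentation describes.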