"""
Enum representing different types of match tables.

Attributes:
    Simple: Represents a simple match type.
    SimilarChar: Represents a match type where similar characters are matched.
    Acrostic: Represents a match type based on acrostics.
    SimilarTextLevenshtein: Represents a match type using the Levenshtein distance algorithm to find similar text.
    Regex: Represents a match type using regular expressions.
"""
=
=
=
=
=
"""
IntFlag representing different types of simple matches.

Attributes:
    None: No transformations applied.
    Fanjian: Match simplified and traditional Chinese characters.
    Delete: Match with deletion of certain characters.
    Normalize: Match with normalization of certain characters.
    DeleteNormalize: Match with normalization and deletion of certain characters.
    FanjianDeleteNormalize: Match both simplified and traditional Chinese characters, with normalization and deletion of certain characters.
    PinYin: Match using Pinyin, the Romanization of Chinese characters, considering character boundaries.
    PinYinChar: Match using Pinyin, the Romanization of Chinese characters, without considering character boundaries.
"""
= 0b00000001
= 0b00000010
= 0b00001100
= 0b00010000
= 0b00011100
= 0b00011110
= 0b00100000
= 0b01000000
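
The bit patterns listed above compose by bitwise OR: `0b00011100` is `0b00001100 | 0b00010000`, and `0b00011110` adds `0b00000010` on top of that. The following is a minimal sketch of that relationship, not the module's actual definition. The class name `SketchMatchFlags` and its member names are hypothetical, chosen only for illustration; the numeric literals are the ones shown above, assumed to map to the documented attributes in order.

```python
from enum import IntFlag


class SketchMatchFlags(IntFlag):
    """Hypothetical stand-in: names are illustrative, values are the listed literals."""

    NONE = 0b00000001
    FANJIAN = 0b00000010
    DELETE = 0b00001100
    NORMALIZE = 0b00010000
    DELETE_NORMALIZE = 0b00011100
    FANJIAN_DELETE_NORMALIZE = 0b00011110
    PINYIN = 0b00100000
    PINYIN_CHAR = 0b01000000


# A composite flag equals the bitwise OR of its parts.
delete_normalize = SketchMatchFlags.DELETE | SketchMatchFlags.NORMALIZE
assert delete_normalize == SketchMatchFlags.DELETE_NORMALIZE

all_three = (
    SketchMatchFlags.FANJIAN
    | SketchMatchFlags.DELETE
    | SketchMatchFlags.NORMALIZE
)
assert all_three == SketchMatchFlags.FANJIAN_DELETE_NORMALIZE

# Membership tests check that every bit of the left operand is set on the right.
assert SketchMatchFlags.DELETE in SketchMatchFlags.FANJIAN_DELETE_NORMALIZE
assert SketchMatchFlags.PINYIN not in SketchMatchFlags.FANJIAN_DELETE_NORMALIZE
```

Because the composite members are plain OR-combinations, a caller can test for an individual transformation with `in` rather than comparing against each composite value.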
"""
Data structure for representing a match table.

Attributes:
    table_id (int): Unique identifier for the match table.
    match_table_type (MatchTableType): Type of matching applied in the table.
    simple_match_type (SimpleMatchType): Specific simple match criteria used.
    word_list (List[str]): List of words that the matching operates against.
    exemption_simple_match_type (SimpleMatchType): Simple match criteria to be exempted.
    exemption_word_list (List[str]): List of words that are exempted from matching.
"""
:
:
:
:
:
:
=
:
:
=
:
:
=
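
As an illustration of the match table attributes documented above, here is a minimal sketch of such a record built with the standard-library `dataclasses` module. It is an assumption that a dataclass is a suitable container; the original base class is not visible here. `TableKind`, `TextFlags`, and `MatchTableSketch` are hypothetical names, the enum values are placeholders, and the defaults on the two exemption fields are illustrative choices rather than documented behavior.

```python
from dataclasses import dataclass, field
from enum import Enum, IntFlag, auto
from typing import List


class TableKind(Enum):
    """Hypothetical stand-in for MatchTableType; values are placeholders."""

    SIMPLE = auto()
    REGEX = auto()


class TextFlags(IntFlag):
    """Hypothetical stand-in for SimpleMatchType, using two documented bit patterns."""

    NONE = 0b00000001
    FANJIAN = 0b00000010


@dataclass
class MatchTableSketch:
    """Illustrative record carrying the six documented attributes."""

    table_id: int
    match_table_type: TableKind
    simple_match_type: TextFlags
    word_list: List[str]
    # Assumed defaults: no exemption transformation and no exempted words.
    exemption_simple_match_type: TextFlags = TextFlags.NONE
    # default_factory gives each instance its own list instead of a shared one.
    exemption_word_list: List[str] = field(default_factory=list)


table = MatchTableSketch(
    table_id=1,
    match_table_type=TableKind.SIMPLE,
    simple_match_type=TextFlags.FANJIAN,
    word_list=["hello", "world"],
)
```

Using `field(default_factory=list)` avoids the mutable-default pitfall, so appending to one table's exemption list does not affect another table.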