//! Public processing helpers built on top of the step registry.
//!
//! These are the standalone entry points for the transformation pipeline. They decompose
//! a composite [`ProcessType`] into single-bit steps, apply each in order via the global
//! [`super::registry`], and return the results as borrowed or owned [`Cow<str>`] values.
//!
//! For batch matching where multiple [`ProcessType`] configurations share prefixes, prefer
//! [`super::graph::walk_process_tree`], which deduplicates intermediate results.
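//!
//! A minimal sketch of the decomposition into single-bit steps, assuming a plain `u8`
//! bitmask in place of [`ProcessType`]. The flag constants and `single_bit_steps` are
//! hypothetical stand-ins chosen for illustration, not this crate's API or bit layout.
//!
//! ```rust
//! // Hypothetical single-bit flags (assumed values, for illustration only).
//! const FANJIAN: u8 = 0b0010;
//! const DELETE: u8 = 0b0100;
//! const NORMALIZE: u8 = 0b1000;
//!
//! // Splits a composite mask into its set bits, lowest bit position first.
//! fn single_bit_steps(mask: u8) -> Vec<u8> {
//!     (0..u8::BITS)
//!         .map(|bit| 1u8 << bit)
//!         .filter(|flag| (mask & flag) != 0)
//!         .collect()
//! }
//! ```
//!
//! In this sketch, `single_bit_steps(NORMALIZE | FANJIAN)` yields `[FANJIAN, NORMALIZE]`:
//! the order follows bit position, not the order in which the flags were combined.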
use std::borrow::Cow;
use crateProcessType;
use crateget_transform_step;
use cratereturn_string_to_pool;
/// Replaces the value held by a [`Cow`], returning any previously owned allocation to the pool.
///
/// Borrowed values are replaced without any pool interaction. Owned values are swapped out
/// and recycled so chained transformations do not leak short-lived buffers.
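///
/// A minimal sketch of this swap-and-recycle pattern, assuming a thread-local `Vec<String>`
/// serves as the pool. `POOL`, `recycle`, and `replace_and_recycle` are hypothetical names;
/// the crate's actual pool API may differ. Replacing a borrowed value leaves the pool
/// untouched, while replacing an owned value pushes the old buffer back for reuse.
///
/// ```rust
/// use std::borrow::Cow;
/// use std::cell::RefCell;
///
/// thread_local! {
///     // Hypothetical pool of reusable buffers (an assumption for this sketch).
///     static POOL: RefCell<Vec<String>> = RefCell::new(Vec::new());
/// }
///
/// // Clears a buffer and stores it for later reuse.
/// fn recycle(mut buf: String) {
///     buf.clear();
///     POOL.with(|pool| pool.borrow_mut().push(buf));
/// }
///
/// // Installs `next` in `slot`; a previously owned buffer goes back to the pool.
/// fn replace_and_recycle<'a>(slot: &mut Cow<'a, str>, next: String) {
///     if let Cow::Owned(old) = std::mem::replace(slot, Cow::Owned(next)) {
///         recycle(old);
///     }
/// }
/// ```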
/// Shared implementation for the public reduction helpers.
///
/// When `overwrite_replace` is `false`, every changed step appends a new entry. When it is
/// `true`, replace-style steps overwrite the last entry in place while `Delete` still appends,
/// preserving the emitted-variant semantics used by matcher construction.
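///
/// A minimal sketch of the two recording modes, assuming each step is modelled as a
/// hypothetical `(is_delete, output)` pair in which `None` means the step left the text
/// unchanged. `record_variants` is an illustrative name, not this crate's function, and
/// buffer pooling is omitted.
///
/// ```rust
/// // Records step outputs; `steps` holds `(is_delete, output)` pairs in pipeline order.
/// fn record_variants(
///     input: &str,
///     steps: &[(bool, Option<&str>)],
///     overwrite_replace: bool,
/// ) -> Vec<String> {
///     // The original input is always the first entry.
///     let mut text_list = vec![input.to_string()];
///     for &(is_delete, output) in steps {
///         // Steps that change nothing record nothing.
///         let Some(changed) = output else { continue };
///         if overwrite_replace && !is_delete {
///             // Replace-style step: overwrite the most recent entry.
///             let last = text_list.last_mut().expect("text_list is never empty");
///             *last = changed.to_string();
///         } else {
///             // Delete step, or append-only mode: push a new entry.
///             text_list.push(changed.to_string());
///         }
///     }
///     text_list
/// }
/// ```
///
/// For a replace, delete, replace sequence in which every step changes the text, this
/// sketch returns four entries in append-only mode and two entries in overwrite mode.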
///
/// # Panics
///
/// Panics (via `.expect()`) if `text_list` is ever empty during iteration. This cannot
/// happen in practice because the original input is always the first entry.
/// Applies a composite [`ProcessType`] pipeline to `text` and returns the final result.
///
/// Steps run in [`ProcessType::iter`] order (ascending bit position). If no step changes
/// the text, the return value borrows directly from `text` (zero allocation). When one or
/// more steps produce changes, intermediate allocations are recycled through the
/// thread-local string pool so only the final result is returned as `Cow::Owned`.
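///
/// A minimal sketch of the borrow-unless-changed pattern, assuming each step is a
/// hypothetical function that returns `Some(new_text)` only when it changed its input.
/// `Step`, `lowercase_ascii`, and `run_steps` are illustrative names, and the pooling of
/// intermediate buffers is left out.
///
/// ```rust
/// use std::borrow::Cow;
///
/// // Hypothetical step signature: `None` means "unchanged".
/// type Step = fn(&str) -> Option<String>;
///
/// // Example step: lowercases ASCII letters and reports whether anything changed.
/// fn lowercase_ascii(text: &str) -> Option<String> {
///     text.bytes()
///         .any(|b| b.is_ascii_uppercase())
///         .then(|| text.to_ascii_lowercase())
/// }
///
/// // Runs `steps` in order, allocating only when a step reports a change.
/// fn run_steps<'a>(text: &'a str, steps: &[Step]) -> Cow<'a, str> {
///     let mut current = Cow::Borrowed(text);
///     for step in steps {
///         if let Some(changed) = step(&current) {
///             current = Cow::Owned(changed);
///         }
///     }
///     current
/// }
/// ```
///
/// In this sketch, `run_steps("hello", &[lowercase_ascii as Step])` stays `Cow::Borrowed`,
/// while the input `"HeLLo"` comes back as `Cow::Owned("hello")`.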
///
/// This function is best for one-shot use. When multiple [`ProcessType`] configurations
/// share prefixes and you want all intermediate variants, use
/// [`build_process_type_tree`](super::build_process_type_tree) +
/// [`walk_process_tree`](super::walk_process_tree) instead.
///
/// # Examples
///
/// ```rust
/// use matcher_rs::{ProcessType, text_process};
///
/// // Fanjian converts Traditional→Simplified; Delete removes punctuation.
/// let processed = text_process(ProcessType::Fanjian | ProcessType::Delete, "妳!好");
/// assert_eq!(processed, "你好");
///
/// // No-op when the text has nothing to transform.
/// let unchanged = text_process(ProcessType::Fanjian, "hello");
/// assert_eq!(unchanged, "hello");
/// // Borrowed — no allocation occurred.
/// assert!(matches!(unchanged, std::borrow::Cow::Borrowed(_)));
/// ```
/// Applies a composite [`ProcessType`] pipeline to `text`, recording every intermediate change.
///
/// Returns a `Vec` whose first element is always the original `text` (borrowed). Each
/// subsequent element is the output of a step that actually changed the text; steps that
/// leave the text unchanged are skipped. The final element is therefore the fully
/// transformed result.
///
/// This is useful for inspecting how each stage transforms the input, or for collecting
/// all intermediate forms that should be indexed.
///
/// # Examples
///
/// ```rust
/// use matcher_rs::{ProcessType, reduce_text_process};
///
/// // FanjianDeleteNormalize = Fanjian | Delete | Normalize, applied in that order.
/// let variants = reduce_text_process(ProcessType::FanjianDeleteNormalize, "~躶~A~");
/// // First entry is always the original input.
/// assert_eq!(variants[0], "~躶~A~");
/// // Last entry is the fully transformed result.
/// assert_eq!(variants.last().unwrap(), "裸a");
/// ```
/// Like [`reduce_text_process`], but merges replace-type steps in place.
///
/// This variant is used during matcher construction to keep only the strings that the
/// Aho-Corasick automaton will actually scan at match time. Replace-style steps
/// (Fanjian, Normalize, PinYin, PinYinChar) overwrite the last entry rather than
/// appending, because the pre-replacement form is never scanned separately. Delete steps
/// still append because deletion changes which character sequences are adjacent, affecting
/// which patterns can match.
///
/// The result therefore never contains more entries than [`reduce_text_process`], and usually
/// fewer: one entry per "scan boundary" rather than one per transformation step.
///
/// # Examples
///
/// ```rust
/// use matcher_rs::{ProcessType, reduce_text_process_emit};
///
/// // FanjianDeleteNormalize = Fanjian | Delete | Normalize.
/// let variants = reduce_text_process_emit(ProcessType::FanjianDeleteNormalize, "~躶~A~");
/// // Only two entries: Fanjian overwrites the original, then Delete appends.
/// // The Normalize step overwrites the Delete entry in place.
/// assert_eq!(variants.len(), 2);
/// assert_eq!(variants[0], "~裸~A~"); // after Fanjian (replace, overwrites original)
/// assert_eq!(variants[1], "裸a"); // after Delete+Normalize
/// ```