//! MATLAB-compatible `replace` builtin with GPU-aware semantics for RunMat.

use runmat_builtins::{CellArray, CharArray, StringArray, Value};
use runmat_macros::runtime_builtin;

use crate::builtins::common::spec::{
    BroadcastSemantics, BuiltinFusionSpec, BuiltinGpuSpec, ConstantStrategy, GpuOpKind,
    ReductionNaN, ResidencyPolicy, ShapeRequirements,
};
use crate::builtins::strings::common::{char_row_to_string_slice, is_missing_string};
#[cfg(feature = "doc_export")]
use crate::register_builtin_doc_text;
use crate::{gather_if_needed, make_cell, register_builtin_fusion_spec, register_builtin_gpu_spec};

#[cfg(feature = "doc_export")]
pub const DOC_MD: &str = r#"---
title: "replace"
category: "strings/transform"
keywords: ["replace", "substring replace", "string replace", "strrep", "text replace", "character array replace"]
summary: "Replace substring occurrences in strings, character arrays, and cell arrays."
references:
  - https://www.mathworks.com/help/matlab/ref/replace.html
gpu_support:
  elementwise: false
  reduction: false
  precisions: []
  broadcasting: "none"
  notes: "Runs on the CPU. RunMat gathers GPU-resident text before performing replacements."
fusion:
  elementwise: false
  reduction: false
  max_inputs: 3
  constants: "inline"
requires_feature: null
tested:
  unit: "builtins::strings::transform::replace::tests"
  integration: "builtins::strings::transform::replace::tests::replace_cell_array_mixed_content"
---

# What does the `replace` function do in MATLAB / RunMat?
`replace(str, old, new)` substitutes every occurrence of `old` found in `str` with `new`. The builtin
accepts string scalars, string arrays, character arrays, and cell arrays of character vectors or strings,
matching MATLAB semantics. Multiple search terms are supported; each matched entry is replaced by its
corresponding replacement text.

## How does the `replace` function behave in MATLAB / RunMat?
- String scalars remain strings. Missing string scalars (`<missing>`) propagate unchanged.
- String arrays are processed element-wise while preserving shape.
- Character arrays are handled row by row. The output width is the length of the longest row after
  replacement, and shorter rows are padded with spaces so the result remains a rectangular char array,
  just like MATLAB.
- Cell arrays must contain string scalars or character vectors. The result is a cell array of the same
  size in which each element keeps its original type (string or character vector).
- The `old` and `new` inputs can be string scalars, string arrays, character arrays, or cell arrays of
  character vectors / strings. `new` must be a scalar or match the number of search terms in `old`.
- Non-text inputs (numeric, logical, structs, GPU tensors, etc.) produce MATLAB-compatible errors.

## `replace` Function GPU Execution Behaviour
`replace` executes on the CPU. The builtin registers as an Accelerate sink, so the fusion planner never
attempts to keep results on the device. When any argument is GPU-resident, RunMat gathers it to host memory
before performing replacements. Providers do not need special kernels for this builtin, and results are
always returned on the host.

## GPU residency in RunMat (Do I need `gpuArray`?)
No. `replace` automatically gathers GPU inputs back to the host when necessary. Because it is marked with a
gather-immediately residency policy, both inputs and outputs live on the CPU, so you never have to move
text values manually. This mirrors MATLAB behaviour where string manipulation runs on the CPU.

## Examples of using the `replace` function in MATLAB / RunMat

### Replace all instances of a word in a string
```matlab
txt = "RunMat accelerates MATLAB code";
result = replace(txt, "RunMat", "RunMat Accelerate");
```
Expected output:
```matlab
result = "RunMat Accelerate accelerates MATLAB code"
```

### Replace multiple terms in a string array
```matlab
labels = ["GPU pipeline"; "CPU pipeline"];
result = replace(labels, ["GPU", "CPU"], ["Device", "Host"]);
```
Expected output:
```matlab
result = 2×1 string
    "Device pipeline"
    "Host pipeline"
```

### Replace substrings in a character array while preserving padding
```matlab
chars = char("alpha", "beta ");
out = replace(chars, "a", "A");
```
Expected output:
```matlab
out =

  2×5 char array

    'AlphA'
    'betA '
```

### Replace text within a cell array of character vectors
```matlab
C = {'Kernel Fusion', 'GPU Planner'};
updated = replace(C, {'Kernel', 'GPU'}, {'Shader', 'Device'});
```
Expected output:
```matlab
updated = 1×2 cell array
    {'Shader Fusion'}    {'Device Planner'}
```

### Remove substrings by replacing with empty text
```matlab
paths = ["runmat/bin", "runmat/lib"];
clean = replace(paths, "runmat/", "");
```
Expected output:
```matlab
clean = 1×2 string
    "bin"    "lib"
```

### Replace using scalar replacement for multiple search terms
```matlab
message = "OpenCL or CUDA or Vulkan";
unified = replace(message, ["OpenCL", "CUDA", "Vulkan"], "GPU backend");
```
Expected output:
```matlab
unified = "GPU backend or GPU backend or GPU backend"
```

### Replace text stored inside a cell array of strings
```matlab
cells = { "Snapshot", "Ignition Interpreter" };
renamed = replace(cells, " ", "_");
```
Expected output:
```matlab
renamed = 1×2 cell array
    {"Snapshot"}    {"Ignition_Interpreter"}
```

### Preserve missing strings during replacement
```matlab
vals = ["runmat", "<missing>", "accelerate"];
out = replace(vals, "runmat", "RunMat");
```
Expected output:
```matlab
out = 1×3 string
    "RunMat"    <missing>    "accelerate"
```

## FAQ

### What sizes are allowed for `old` and `new` inputs?
`old` must contain at least one search term. `new` may be a scalar or contain the same number of elements
as `old`. Otherwise, `replace` raises a size-mismatch error matching MATLAB behaviour.
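
The sizing rule can be illustrated with a short Rust sketch. It is not RunMat's implementation: `pair_terms` is a hypothetical helper introduced here, and its error messages are placeholders.

```rust
/// Hypothetical helper: pair each search term with its replacement.
/// `new` must hold exactly one entry (reused for every term) or one entry per term.
fn pair_terms(old: &[&str], new: &[&str]) -> Result<Vec<(String, String)>, String> {
    if old.is_empty() {
        return Err("at least one search term is required".to_string());
    }
    if new.len() == old.len() {
        // One replacement per search term: pair them positionally.
        Ok(old
            .iter()
            .zip(new)
            .map(|(o, n)| (o.to_string(), n.to_string()))
            .collect())
    } else if new.len() == 1 {
        // A scalar replacement is reused for every search term.
        Ok(old
            .iter()
            .map(|o| (o.to_string(), new[0].to_string()))
            .collect())
    } else {
        Err("replacement must be a scalar or match the number of search terms".to_string())
    }
}

fn main() {
    println!("{:?}", pair_terms(&["GPU", "CPU"], &["Device", "Host"]));
    println!("{:?}", pair_terms(&["OpenCL", "CUDA"], &["GPU backend"]));
    println!("{:?}", pair_terms(&["a", "b"], &["x", "y", "z"]));
}
```

The first two calls succeed, while the third returns an error because three replacements cannot be matched to two search terms.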

### Does `replace` modify the original input?
No. The builtin returns a new value with substitutions applied. The original inputs are left untouched.

### How are character arrays padded after replacement?
The output width equals the length of the longest row after replacement, so the array can grow or shrink.
Shorter rows are padded with space characters so the output remains a proper char matrix.
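
A minimal Rust sketch of the padding step is shown below. `pad_rows` is a hypothetical helper, not part of RunMat's API, and the row-major flattening is an assumption chosen to mirror the data layout used in this builtin's unit tests.

```rust
/// Hypothetical helper: pad every row with spaces to the width of the longest row
/// and return the rows flattened in row-major order together with that width.
fn pad_rows(rows: &[&str]) -> (Vec<char>, usize) {
    // Width is measured in characters, not bytes.
    let width = rows.iter().map(|r| r.chars().count()).max().unwrap_or(0);
    let mut flat = Vec::with_capacity(rows.len() * width);
    for row in rows {
        let mut chars: Vec<char> = row.chars().collect();
        // No row is longer than `width`, so this only ever appends spaces.
        chars.resize(width, ' ');
        flat.extend(chars);
    }
    (flat, width)
}

fn main() {
    // Rows "ab" and "cd" after replacing "b" with "beta".
    let (flat, width) = pad_rows(&["abeta", "cd"]);
    let text: String = flat.iter().collect();
    println!("width = {width}, data = {text:?}");
}
```

Here the second row gains three trailing spaces so that both rows are five characters wide.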

### How are missing strings handled?
Missing string scalars (`<missing>`) propagate unchanged. Replacements never convert a missing value into a
non-missing string.

### Can I replace with an empty string?
Yes. Provide `""` (empty string) or `''` as the replacement to remove matched substrings entirely.

### Does `replace` support overlapping matches?
Replacements are non-overlapping and proceed from left to right, matching MATLAB’s behaviour for `replace`.
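
The Rust sketch below illustrates what non-overlapping, left-to-right matching means, using the standard library's `str::replace`. `apply_in_order` is a hypothetical helper. It assumes that multiple search terms are applied one after another in the order given, which is how the implementation in this file proceeds; under that assumption a later term can match text produced by an earlier one.

```rust
/// Hypothetical helper: apply each (old, new) pair in order using `str::replace`.
fn apply_in_order(input: &str, pairs: &[(&str, &str)]) -> String {
    let mut current = input.to_string();
    for (old, new) in pairs {
        current = current.replace(old, new);
    }
    current
}

fn main() {
    // "aaa" contains "aa" at offsets 0 and 1, but matches never overlap,
    // so only the occurrence at offset 0 is replaced.
    println!("{}", apply_in_order("aaa", &[("aa", "b")])); // prints "ba"

    // Sequential application: "cat" becomes "dog", which the second pair then matches.
    println!("{}", apply_in_order("cat", &[("cat", "dog"), ("dog", "bird")])); // prints "bird"
}
```

If chained terms are a concern, choosing search terms that cannot match any replacement text avoids the second effect.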

### How does `replace` behave with GPU data?
RunMat gathers GPU-resident inputs to host memory before performing replacements. The resulting value is
returned on the host. Providers do not need to implement a GPU kernel for this builtin.

## See Also
[regexprep](../../regex/regexprep),
[string](../core/string),
[char](../core/char),
[strtrim](../transform/strtrim),
[strip](../transform/strip)

## Source & Feedback
- Implementation: [`crates/runmat-runtime/src/builtins/strings/transform/replace.rs`](https://github.com/runmat-org/runmat/blob/main/crates/runmat-runtime/src/builtins/strings/transform/replace.rs)
- Found an issue? Please [open a GitHub issue](https://github.com/runmat-org/runmat/issues/new/choose) with a minimal reproduction.
"#;

pub const GPU_SPEC: BuiltinGpuSpec = BuiltinGpuSpec {
    name: "replace",
    op_kind: GpuOpKind::Custom("string-transform"),
    supported_precisions: &[],
    broadcast: BroadcastSemantics::None,
    provider_hooks: &[],
    constant_strategy: ConstantStrategy::InlineLiteral,
    residency: ResidencyPolicy::GatherImmediately,
    nan_mode: ReductionNaN::Include,
    two_pass_threshold: None,
    workgroup_size: None,
    accepts_nan_mode: false,
    notes:
        "Executes on the CPU; GPU-resident inputs are gathered to host memory prior to replacement.",
};

register_builtin_gpu_spec!(GPU_SPEC);

pub const FUSION_SPEC: BuiltinFusionSpec = BuiltinFusionSpec {
    name: "replace",
    shape: ShapeRequirements::Any,
    constant_strategy: ConstantStrategy::InlineLiteral,
    elementwise: None,
    reduction: None,
    emits_nan: false,
    notes:
        "String manipulation builtin; not eligible for fusion plans and always gathers GPU inputs.",
};

register_builtin_fusion_spec!(FUSION_SPEC);

#[cfg(feature = "doc_export")]
register_builtin_doc_text!("replace", DOC_MD);

const ARG_TYPE_ERROR: &str =
    "replace: first argument must be a string array, character array, or cell array of character vectors";
const PATTERN_TYPE_ERROR: &str =
    "replace: second argument must be a string array, character array, or cell array of character vectors";
const REPLACEMENT_TYPE_ERROR: &str =
    "replace: third argument must be a string array, character array, or cell array of character vectors";
const EMPTY_PATTERN_ERROR: &str =
    "replace: second argument must contain at least one search string";
const EMPTY_REPLACEMENT_ERROR: &str =
    "replace: third argument must contain at least one replacement string";
const SIZE_MISMATCH_ERROR: &str =
    "replace: replacement array must be a scalar or match the number of search strings";
const CELL_ELEMENT_ERROR: &str =
    "replace: cell array elements must be string scalars or character vectors";

#[runtime_builtin(
    name = "replace",
    category = "strings/transform",
    summary = "Replace substring occurrences in strings, character arrays, and cell arrays.",
    keywords = "replace,strrep,strings,character array,text",
    accel = "sink"
)]
fn replace_builtin(text: Value, old: Value, new: Value) -> Result<Value, String> {
    // Text processing runs on the host, so gather any GPU-resident arguments first.
    let text = gather_if_needed(&text).map_err(|e| format!("replace: {e}"))?;
    let old = gather_if_needed(&old).map_err(|e| format!("replace: {e}"))?;
    let new = gather_if_needed(&new).map_err(|e| format!("replace: {e}"))?;

    // Validate `old` / `new` once and reuse the pairs for every element of `text`.
    let spec = ReplacementSpec::from_values(&old, &new)?;

    match text {
        Value::String(s) => Ok(Value::String(replace_string_scalar(s, &spec))),
        Value::StringArray(sa) => replace_string_array(sa, &spec),
        Value::CharArray(ca) => replace_char_array(ca, &spec),
        Value::Cell(cell) => replace_cell_array(cell, &spec),
        _ => Err(ARG_TYPE_ERROR.to_string()),
    }
}

/// Applies the replacements to a string scalar; missing strings pass through unchanged.
fn replace_string_scalar(text: String, spec: &ReplacementSpec) -> String {
    if is_missing_string(&text) {
        text
    } else {
        spec.apply(&text)
    }
}

/// Applies the replacements to every element of a string array, preserving its shape.
fn replace_string_array(array: StringArray, spec: &ReplacementSpec) -> Result<Value, String> {
    let StringArray { data, shape, .. } = array;
    let mut replaced = Vec::with_capacity(data.len());
    for entry in data {
        if is_missing_string(&entry) {
            replaced.push(entry);
        } else {
            replaced.push(spec.apply(&entry));
        }
    }
    let result = StringArray::new(replaced, shape).map_err(|e| format!("replace: {e}"))?;
    Ok(Value::StringArray(result))
}

/// Applies the replacements row by row, then pads every row with spaces to the
/// width of the longest resulting row so the output stays rectangular.
fn replace_char_array(array: CharArray, spec: &ReplacementSpec) -> Result<Value, String> {
    let CharArray { data, rows, cols } = array;
    if rows == 0 {
        return Ok(Value::CharArray(CharArray { data, rows, cols }));
    }

    // First pass: replace each row and track the widest result (in characters).
    let mut replaced_rows = Vec::with_capacity(rows);
    let mut target_cols = 0usize;
    for row in 0..rows {
        let slice = char_row_to_string_slice(&data, cols, row);
        let replaced = spec.apply(&slice);
        let len = replaced.chars().count();
        target_cols = target_cols.max(len);
        replaced_rows.push(replaced);
    }

    // Second pass: pad shorter rows and flatten into a single buffer.
    let mut flattened = Vec::with_capacity(rows * target_cols);
    for row_text in replaced_rows {
        let mut chars: Vec<char> = row_text.chars().collect();
        if chars.len() < target_cols {
            chars.resize(target_cols, ' ');
        }
        flattened.extend(chars);
    }

    CharArray::new(flattened, rows, target_cols)
        .map(Value::CharArray)
        .map_err(|e| format!("replace: {e}"))
}

/// Applies the replacements to each cell element, preserving the cell's size.
fn replace_cell_array(cell: CellArray, spec: &ReplacementSpec) -> Result<Value, String> {
    let CellArray {
        data, rows, cols, ..
    } = cell;
    let mut replaced = Vec::with_capacity(rows * cols);
    for row in 0..rows {
        for col in 0..cols {
            let idx = row * cols + col;
            let value = replace_cell_element(&data[idx], spec)?;
            replaced.push(value);
        }
    }
    make_cell(replaced, rows, cols).map_err(|e| format!("replace: {e}"))
}

/// Handles one cell element, which must be a string scalar or a character vector.
fn replace_cell_element(value: &Value, spec: &ReplacementSpec) -> Result<Value, String> {
    match value {
        Value::String(text) => Ok(Value::String(replace_string_scalar(text.clone(), spec))),
        Value::StringArray(sa) if sa.data.len() == 1 => Ok(Value::String(replace_string_scalar(
            sa.data[0].clone(),
            spec,
        ))),
        Value::CharArray(ca) if ca.rows <= 1 => replace_char_array(ca.clone(), spec),
        // Multi-row char arrays and non-text values are rejected.
        _ => Err(CELL_ELEMENT_ERROR.to_string()),
    }
}

fn extract_pattern_list(value: &Value) -> Result<Vec<String>, String> {
    extract_text_list(value, PATTERN_TYPE_ERROR)
}

fn extract_replacement_list(value: &Value) -> Result<Vec<String>, String> {
    extract_text_list(value, REPLACEMENT_TYPE_ERROR)
}

/// Flattens a text argument into a list of strings: one entry per string element,
/// per char-array row, or per cell element.
fn extract_text_list(value: &Value, type_error: &str) -> Result<Vec<String>, String> {
    match value {
        Value::String(text) => Ok(vec![text.clone()]),
        Value::StringArray(array) => Ok(array.data.clone()),
        Value::CharArray(array) => {
            let CharArray { data, rows, cols } = array.clone();
            if rows == 0 {
                Ok(Vec::new())
            } else {
                // Each row of the char array is one entry.
                let mut entries = Vec::with_capacity(rows);
                for row in 0..rows {
                    entries.push(char_row_to_string_slice(&data, cols, row));
                }
                Ok(entries)
            }
        }
        Value::Cell(cell) => {
            let CellArray { data, .. } = cell.clone();
            let mut entries = Vec::with_capacity(data.len());
            for element in data {
                match &*element {
                    Value::String(text) => entries.push(text.clone()),
                    Value::StringArray(sa) if sa.data.len() == 1 => {
                        entries.push(sa.data[0].clone());
                    }
                    Value::CharArray(ca) if ca.rows <= 1 => {
                        if ca.rows == 0 {
                            entries.push(String::new());
                        } else {
                            entries.push(char_row_to_string_slice(&ca.data, ca.cols, 0));
                        }
                    }
                    // Multi-row char arrays and non-text values are rejected.
                    _ => return Err(CELL_ELEMENT_ERROR.to_string()),
                }
            }
            Ok(entries)
        }
        _ => Err(type_error.to_string()),
    }
}

/// Validated list of `(search term, replacement)` pairs.
struct ReplacementSpec {
    pairs: Vec<(String, String)>,
}

impl ReplacementSpec {
    fn from_values(old: &Value, new: &Value) -> Result<Self, String> {
        let patterns = extract_pattern_list(old)?;
        if patterns.is_empty() {
            return Err(EMPTY_PATTERN_ERROR.to_string());
        }

        let replacements = extract_replacement_list(new)?;
        if replacements.is_empty() {
            return Err(EMPTY_REPLACEMENT_ERROR.to_string());
        }

        let pairs = if replacements.len() == patterns.len() {
            // One replacement per search term.
            patterns.into_iter().zip(replacements).collect::<Vec<_>>()
        } else if replacements.len() == 1 {
            // A scalar replacement is shared by every search term.
            let replacement = replacements[0].clone();
            patterns
                .into_iter()
                .map(|pattern| (pattern, replacement.clone()))
                .collect::<Vec<_>>()
        } else {
            return Err(SIZE_MISMATCH_ERROR.to_string());
        };

        Ok(Self { pairs })
    }

    /// Applies each pair in order; a later pair sees the output of earlier ones.
    fn apply(&self, input: &str) -> String {
        let mut current = input.to_string();
        for (pattern, replacement) in &self.pairs {
            // Skip empty patterns: `str::replace` with an empty pattern would insert the
            // replacement at every character boundary. Identical pairs are no-ops.
            if pattern.is_empty() || pattern == replacement {
                continue;
            }
            current = current.replace(pattern, replacement);
        }
        current
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    #[cfg(feature = "doc_export")]
    use crate::builtins::common::test_support;

    #[test]
    fn replace_string_scalar_single_term() {
        let result = replace_builtin(
            Value::String("RunMat runtime".into()),
            Value::String("runtime".into()),
            Value::String("engine".into()),
        )
        .expect("replace");
        assert_eq!(result, Value::String("RunMat engine".into()));
    }

    #[test]
    fn replace_string_array_multiple_terms() {
        let strings = StringArray::new(
            vec!["gpu".into(), "cpu".into(), "<missing>".into()],
            vec![3, 1],
        )
        .unwrap();
        let result = replace_builtin(
            Value::StringArray(strings),
            Value::StringArray(
                StringArray::new(vec!["gpu".into(), "cpu".into()], vec![2, 1]).unwrap(),
            ),
            Value::String("device".into()),
        )
        .expect("replace");
        match result {
            Value::StringArray(sa) => {
                assert_eq!(sa.shape, vec![3, 1]);
                assert_eq!(
                    sa.data,
                    vec![
                        String::from("device"),
                        String::from("device"),
                        String::from("<missing>")
                    ]
                );
            }
            other => panic!("expected string array, got {other:?}"),
        }
    }

    #[test]
    fn replace_char_array_adjusts_width() {
        let chars = CharArray::new("matrix".chars().collect(), 1, 6).unwrap();
        let result = replace_builtin(
            Value::CharArray(chars),
            Value::String("matrix".into()),
            Value::String("tensor".into()),
        )
        .expect("replace");
        match result {
            Value::CharArray(out) => {
                assert_eq!(out.rows, 1);
                assert_eq!(out.cols, 6);
                let expected: Vec<char> = "tensor".chars().collect();
                assert_eq!(out.data, expected);
            }
            other => panic!("expected char array, got {other:?}"),
        }
    }

    #[test]
    fn replace_char_array_handles_padding() {
        let chars = CharArray::new(vec!['a', 'b', 'c', 'd'], 2, 2).unwrap();
        let result = replace_builtin(
            Value::CharArray(chars),
            Value::String("b".into()),
            Value::String("beta".into()),
        )
        .expect("replace");
        match result {
            Value::CharArray(out) => {
                assert_eq!(out.rows, 2);
                assert_eq!(out.cols, 5);
                let expected: Vec<char> = vec!['a', 'b', 'e', 't', 'a', 'c', 'd', ' ', ' ', ' '];
                assert_eq!(out.data, expected);
            }
            other => panic!("expected char array, got {other:?}"),
        }
    }

    #[test]
    fn replace_cell_array_mixed_content() {
        let cell = CellArray::new(
            vec![
                Value::CharArray(CharArray::new_row("Kernel Planner")),
                Value::String("GPU Fusion".into()),
            ],
            1,
            2,
        )
        .unwrap();
        let result = replace_builtin(
            Value::Cell(cell),
            Value::Cell(
                CellArray::new(
                    vec![Value::String("Kernel".into()), Value::String("GPU".into())],
                    1,
                    2,
                )
                .unwrap(),
            ),
            Value::StringArray(
                StringArray::new(vec!["Shader".into(), "Device".into()], vec![1, 2]).unwrap(),
            ),
        )
        .expect("replace");
        match result {
            Value::Cell(out) => {
                let first = out.get(0, 0).unwrap();
                let second = out.get(0, 1).unwrap();
                assert_eq!(
                    first,
                    Value::CharArray(CharArray::new_row("Shader Planner"))
                );
                assert_eq!(second, Value::String("Device Fusion".into()));
            }
            other => panic!("expected cell array, got {other:?}"),
        }
    }

    #[test]
    fn replace_errors_on_invalid_first_argument() {
        let err = replace_builtin(
            Value::Num(1.0),
            Value::String("a".into()),
            Value::String("b".into()),
        )
        .unwrap_err();
        assert_eq!(err, ARG_TYPE_ERROR);
    }

    #[test]
    fn replace_errors_on_invalid_pattern_type() {
        let err = replace_builtin(
            Value::String("abc".into()),
            Value::Num(1.0),
            Value::String("x".into()),
        )
        .unwrap_err();
        assert_eq!(err, PATTERN_TYPE_ERROR);
    }

    #[test]
    fn replace_errors_on_size_mismatch() {
        let err = replace_builtin(
            Value::String("abc".into()),
            Value::StringArray(StringArray::new(vec!["a".into(), "b".into()], vec![2, 1]).unwrap()),
            Value::StringArray(
                StringArray::new(vec!["x".into(), "y".into(), "z".into()], vec![3, 1]).unwrap(),
            ),
        )
        .unwrap_err();
        assert_eq!(err, SIZE_MISMATCH_ERROR);
    }

    #[test]
    fn replace_preserves_missing_string() {
        let result = replace_builtin(
            Value::String("<missing>".into()),
            Value::String("missing".into()),
            Value::String("value".into()),
        )
        .expect("replace");
        assert_eq!(result, Value::String("<missing>".into()));
    }

    #[test]
    #[cfg(feature = "doc_export")]
    fn doc_examples_present() {
        let blocks = test_support::doc_examples(DOC_MD);
        assert!(!blocks.is_empty());
    }
}