runmat_runtime/builtins/strings/transform/
lower.rs

//! MATLAB-compatible `lower` builtin with GPU-aware semantics for RunMat.

use runmat_builtins::{CellArray, CharArray, StringArray, Value};
use runmat_macros::runtime_builtin;

use crate::builtins::common::spec::{
    BroadcastSemantics, BuiltinFusionSpec, BuiltinGpuSpec, ConstantStrategy, GpuOpKind,
    ReductionNaN, ResidencyPolicy, ShapeRequirements,
};
use crate::builtins::strings::common::{char_row_to_string_slice, lowercase_preserving_missing};
#[cfg(feature = "doc_export")]
use crate::register_builtin_doc_text;
use crate::{gather_if_needed, make_cell, register_builtin_fusion_spec, register_builtin_gpu_spec};

#[cfg(feature = "doc_export")]
pub const DOC_MD: &str = r#"---
title: "lower"
category: "strings/transform"
keywords: ["lower", "lowercase", "convert to lowercase", "string case", "character arrays"]
summary: "Convert strings, character arrays, and cell arrays of character vectors to lowercase."
references:
  - https://www.mathworks.com/help/matlab/ref/lower.html
gpu_support:
  elementwise: false
  reduction: false
  precisions: []
  broadcasting: "none"
  notes: "Runs on the CPU; if any element lives on the GPU, RunMat gathers it before converting."
fusion:
  elementwise: false
  reduction: false
  max_inputs: 1
  constants: "inline"
requires_feature: null
tested:
  unit: "builtins::strings::transform::lower::tests"
  integration: "builtins::strings::transform::lower::tests::lower_cell_array_mixed_content"
---

# What does the `lower` function do in MATLAB / RunMat?
`lower(text)` converts every alphabetic character in `text` to lowercase. It accepts string scalars,
string arrays, character arrays, and cell arrays of character vectors, mirroring MATLAB behaviour.
Non-alphabetic characters are returned unchanged.

## How does the `lower` function behave in MATLAB / RunMat?
- String inputs stay as strings. String arrays preserve their size, orientation, and missing values.
- Character arrays are processed row by row. The result remains a rectangular char array; if any row
  grows after lowercasing (for example because `'İ'` expands), the array widens and shorter rows are padded with spaces.
- Cell arrays must contain string scalars or character vectors. The result is a cell array of the same size
  with each element converted to lowercase; any other element type raises a MATLAB-compatible error.
- Missing string scalars (`string(missing)`) remain missing and are returned as `<missing>`.
- Numeric, logical, and struct inputs raise MATLAB-compatible type errors. GPU tensors are gathered to the
  host first and then rejected the same way, because they hold numeric data rather than text.

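The row-by-row rule for character arrays can be sketched in plain Rust. This is only an illustration under
stated assumptions, not RunMat's implementation: `lower_rows` is a hypothetical helper that receives each row
as a `&str`, whereas the runtime operates on its own `CharArray` type.

```rust
/// Hypothetical helper (not part of RunMat): lowercases each row and pads
/// shorter rows with spaces so every row has the same number of characters.
fn lower_rows(rows: &[&str]) -> Vec<String> {
    // Lowercase every row first; a row can grow (for example 'İ' becomes two code points).
    let lowered: Vec<String> = rows.iter().map(|row| row.to_lowercase()).collect();
    // The output width is the longest row, counted in characters rather than bytes.
    let width = lowered
        .iter()
        .map(|row| row.chars().count())
        .max()
        .unwrap_or(0);
    lowered
        .into_iter()
        .map(|row| {
            let pad = width - row.chars().count();
            let mut padded = row;
            // Append trailing spaces until the row reaches the common width.
            padded.extend(std::iter::repeat(' ').take(pad));
            padded
        })
        .collect()
}
```

With this sketch, `lower_rows(&["CAT ", "DOGE"])` keeps both rows four characters wide, while a row such as
`"İAB"` grows to four characters and a shorter sibling row is padded to match.
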
## `lower` Function GPU Execution Behaviour
`lower` executes on the CPU. When the input (or any nested element) resides on the GPU, RunMat gathers it
to host memory before performing the conversion so results remain identical to MATLAB. Providers do not
need to implement custom kernels for this builtin.

## GPU residency in RunMat (Do I need `gpuArray`?)
RunMat automatically keeps string data on the host for now. If text originates from GPU-based computations
(for example as numeric code points stored on the device), `lower` gathers those values before applying the
transformation, so you never need to call `gpuArray` explicitly for this builtin.

## Examples of using the `lower` function in MATLAB / RunMat

### Convert A String Scalar To Lowercase
```matlab
txt = "RunMat";
result = lower(txt);
```
Expected output:
```matlab
result = "runmat"
```

### Lowercase Each Element Of A String Array
```matlab
labels = ["NORTH" "South"; "EaSt" "WEST"];
lowered = lower(labels);
```
Expected output:
```matlab
lowered = 2×2 string
    "north"    "south"
    "east"     "west"
```

### Lowercase Character Array Rows While Preserving Shape
```matlab
animals = char("CAT", "DOGE");
result = lower(animals);
```
Expected output:
```matlab
result =

  2×4 char array

    'cat '
    'doge'
```

### Lowercase A Cell Array Of Character Vectors
```matlab
C = {'HELLO', 'World'};
out = lower(C);
```
Expected output:
```matlab
out = 1×2 cell array
    {'hello'}    {'world'}
```

### Keep Missing Strings As Missing
```matlab
vals = string(["DATA" "<missing>" "GPU"]);
converted = lower(vals);
```
Expected output:
```matlab
converted = 1×3 string
    "data"    <missing>    "gpu"
```

### Lowercase Text Stored On A GPU Input
```matlab
codes = gpuArray(uint16('RUNMAT'));
txt = char(gather(codes));
result = lower(txt);
```
Expected output:
```matlab
result = 'runmat'
```

## FAQ

### Does `lower` change non-alphabetic characters?
No. Digits, punctuation, whitespace, and symbols remain untouched. Only alphabetic code points that have
distinct lowercase forms are converted.

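One way to check this rule is with Rust's standard `str::to_lowercase`, which the character array path in this
file also calls. The sketch below is illustrative only; `changed_chars` is a hypothetical name and not part of
RunMat.

```rust
/// Hypothetical check (illustration only): returns the characters of `text`
/// whose lowercase form differs from the character itself.
fn changed_chars(text: &str) -> Vec<char> {
    text.chars()
        // `char::to_lowercase` yields an iterator because one character can
        // map to several; compare the collected form with the original.
        .filter(|c| c.to_lowercase().collect::<String>() != c.to_string())
        .collect()
}
```

For `"RunMat 2.0!"` only `'R'` and `'M'` are reported; the digits, the space, and the punctuation pass through
unchanged.
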
### What happens to character array dimensions?
RunMat lowers each row independently and pads with spaces when a lowercase mapping increases the row length.
This mirrors MATLAB’s behaviour so the result always has rectangular dimensions.

### Can I pass numeric arrays to `lower`?
No. Passing numeric, logical, or struct inputs raises a MATLAB-compatible error. Convert the data to a string
or character array first (for example with `string` or `char`).

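The rejection of non-text inputs can be sketched as a simple type dispatch. This is a simplified model, not
the runtime's code: the `Input` enum and `lower_checked` are hypothetical stand-ins for the real value type
and builtin, and only the error message is taken from this file.

```rust
/// Hypothetical, simplified stand-in for the runtime's value type.
#[derive(Debug, PartialEq)]
enum Input {
    Text(String),
    Number(f64),
    Logical(bool),
}

/// Lowercases text inputs and rejects everything else with a type error.
fn lower_checked(input: Input) -> Result<Input, String> {
    match input {
        Input::Text(text) => Ok(Input::Text(text.to_lowercase())),
        // Numeric and logical inputs are rejected rather than converted implicitly.
        _ => Err(
            "lower: first argument must be a string array, character array, or cell array of character vectors"
                .to_string(),
        ),
    }
}
```

Callers that hold numeric data are expected to convert it to text first, which is why the non-text variants
return an error instead of a best-effort conversion.
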
### How are missing strings handled?
Missing string scalars remain `<missing>` and are returned unchanged. This matches MATLAB’s handling of
missing values in text processing functions.

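A minimal sketch of this rule follows. It assumes that a missing string is represented by the literal
`<missing>` sentinel, as the unit tests in this file suggest; `lowercase_keep_missing` is a hypothetical name,
and the real helper in `strings::common` is not shown here and may differ.

```rust
/// Hypothetical sketch: leaves the missing sentinel untouched and
/// lowercases every other string.
fn lowercase_keep_missing(text: String) -> String {
    // Assumption: missing strings are stored as the literal "<missing>" sentinel.
    if text == "<missing>" {
        text
    } else {
        text.to_lowercase()
    }
}
```

Under that assumption, `"DATA"` becomes `"data"` while `"<missing>"` is returned exactly as it came in.
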
### Will `lower` ever execute on the GPU?
Not today. The builtin gathers GPU-resident data automatically and performs the conversion on the CPU so the
results match MATLAB exactly. Providers may add device-side kernels in the future, but the behaviour will stay
compatible.

## See Also
[upper](./upper), [string](../core/string), [char](../core/char), [regexprep](../regex/regexprep), [strcmpi](../search/strcmpi)

## Source & Feedback
- Implementation: [`crates/runmat-runtime/src/builtins/strings/transform/lower.rs`](https://github.com/runmat-org/runmat/blob/main/crates/runmat-runtime/src/builtins/strings/transform/lower.rs)
- Found an issue? Please [open a GitHub issue](https://github.com/runmat-org/runmat/issues/new/choose) with a minimal reproduction.
"#;

pub const GPU_SPEC: BuiltinGpuSpec = BuiltinGpuSpec {
    name: "lower",
    op_kind: GpuOpKind::Custom("string-transform"),
    supported_precisions: &[],
    broadcast: BroadcastSemantics::None,
    provider_hooks: &[],
    constant_strategy: ConstantStrategy::InlineLiteral,
    residency: ResidencyPolicy::GatherImmediately,
    nan_mode: ReductionNaN::Include,
    two_pass_threshold: None,
    workgroup_size: None,
    accepts_nan_mode: false,
    notes:
        "Executes on the CPU; GPU-resident inputs are gathered to host memory before conversion.",
};

register_builtin_gpu_spec!(GPU_SPEC);

pub const FUSION_SPEC: BuiltinFusionSpec = BuiltinFusionSpec {
    name: "lower",
    shape: ShapeRequirements::Any,
    constant_strategy: ConstantStrategy::InlineLiteral,
    elementwise: None,
    reduction: None,
    emits_nan: false,
    notes: "String transformation builtin; not eligible for fusion and always gathers GPU inputs.",
};

register_builtin_fusion_spec!(FUSION_SPEC);

#[cfg(feature = "doc_export")]
register_builtin_doc_text!("lower", DOC_MD);

const ARG_TYPE_ERROR: &str =
    "lower: first argument must be a string array, character array, or cell array of character vectors";
const CELL_ELEMENT_ERROR: &str =
    "lower: cell array elements must be string scalars or character vectors";

#[runtime_builtin(
    name = "lower",
    category = "strings/transform",
    summary = "Convert strings, character arrays, and cell arrays of character vectors to lowercase.",
    keywords = "lower,lowercase,strings,character array,text",
    accel = "sink"
)]
fn lower_builtin(value: Value) -> Result<Value, String> {
    // Bring GPU-resident inputs to the host before dispatching on the value type.
    let gathered = gather_if_needed(&value).map_err(|e| format!("lower: {e}"))?;
    match gathered {
        Value::String(text) => Ok(Value::String(lowercase_preserving_missing(text))),
        Value::StringArray(array) => lower_string_array(array),
        Value::CharArray(array) => lower_char_array(array),
        Value::Cell(cell) => lower_cell_array(cell),
        _ => Err(ARG_TYPE_ERROR.to_string()),
    }
}

/// Lowercases every element of a string array, keeping its shape and missing values.
fn lower_string_array(array: StringArray) -> Result<Value, String> {
    let StringArray { data, shape, .. } = array;
    let lowered = data
        .into_iter()
        .map(lowercase_preserving_missing)
        .collect::<Vec<_>>();
    let lowered_array = StringArray::new(lowered, shape).map_err(|e| format!("lower: {e}"))?;
    Ok(Value::StringArray(lowered_array))
}

/// Lowercases a char array row by row, widening it if any row grows.
fn lower_char_array(array: CharArray) -> Result<Value, String> {
    let CharArray { data, rows, cols } = array;
    // Empty arrays have nothing to convert; return them unchanged.
    if rows == 0 || cols == 0 {
        return Ok(Value::CharArray(CharArray { data, rows, cols }));
    }

    // First pass: lowercase each row and track the widest result.
    let mut lowered_rows = Vec::with_capacity(rows);
    let mut target_cols = cols;
    for row in 0..rows {
        let text = char_row_to_string_slice(&data, cols, row).to_lowercase();
        let len = text.chars().count();
        target_cols = target_cols.max(len);
        lowered_rows.push(text);
    }

    // Second pass: pad shorter rows with spaces so the array stays rectangular.
    let mut lowered_data = Vec::with_capacity(rows * target_cols);
    for row_text in lowered_rows {
        let mut chars: Vec<char> = row_text.chars().collect();
        if chars.len() < target_cols {
            chars.resize(target_cols, ' ');
        }
        lowered_data.extend(chars);
    }

    CharArray::new(lowered_data, rows, target_cols)
        .map(Value::CharArray)
        .map_err(|e| format!("lower: {e}"))
}

/// Lowercases each element of a cell array, preserving its dimensions.
fn lower_cell_array(cell: CellArray) -> Result<Value, String> {
    let CellArray {
        data, rows, cols, ..
    } = cell;
    let mut lowered_values = Vec::with_capacity(rows * cols);
    for row in 0..rows {
        for col in 0..cols {
            let idx = row * cols + col;
            let lowered = lower_cell_element(&data[idx])?;
            lowered_values.push(lowered);
        }
    }
    make_cell(lowered_values, rows, cols).map_err(|e| format!("lower: {e}"))
}

/// Lowercases one cell element, which must be a string scalar or a character vector.
fn lower_cell_element(value: &Value) -> Result<Value, String> {
    match value {
        Value::String(text) => Ok(Value::String(lowercase_preserving_missing(text.clone()))),
        Value::StringArray(sa) if sa.data.len() == 1 => Ok(Value::String(
            lowercase_preserving_missing(sa.data[0].clone()),
        )),
        Value::CharArray(ca) if ca.rows <= 1 => lower_char_array(ca.clone()),
        // Multi-row char arrays and all other types are not valid cell elements.
        _ => Err(CELL_ELEMENT_ERROR.to_string()),
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    #[cfg(feature = "doc_export")]
    use crate::builtins::common::test_support;

    #[test]
    fn lower_string_scalar_value() {
        let result = lower_builtin(Value::String("RunMat".into())).expect("lower");
        assert_eq!(result, Value::String("runmat".into()));
    }

    #[test]
    fn lower_string_array_preserves_shape() {
        let array = StringArray::new(
            vec![
                "GPU".into(),
                "ACCEL".into(),
                "<missing>".into(),
                "MiXeD".into(),
            ],
            vec![2, 2],
        )
        .unwrap();
        let result = lower_builtin(Value::StringArray(array)).expect("lower");
        match result {
            Value::StringArray(sa) => {
                assert_eq!(sa.shape, vec![2, 2]);
                assert_eq!(
                    sa.data,
                    vec![
                        String::from("gpu"),
                        String::from("accel"),
                        String::from("<missing>"),
                        String::from("mixed")
                    ]
                );
            }
            other => panic!("expected string array, got {other:?}"),
        }
    }

    #[test]
    fn lower_char_array_multiple_rows() {
        let data: Vec<char> = vec!['C', 'A', 'T', 'D', 'O', 'G'];
        let array = CharArray::new(data, 2, 3).unwrap();
        let result = lower_builtin(Value::CharArray(array)).expect("lower");
        match result {
            Value::CharArray(ca) => {
                assert_eq!(ca.rows, 2);
                assert_eq!(ca.cols, 3);
                assert_eq!(ca.data, vec!['c', 'a', 't', 'd', 'o', 'g']);
            }
            other => panic!("expected char array, got {other:?}"),
        }
    }

    #[test]
    fn lower_char_vector_handles_padding() {
        let array = CharArray::new_row("HELLO ");
        let result = lower_builtin(Value::CharArray(array)).expect("lower");
        match result {
            Value::CharArray(ca) => {
                assert_eq!(ca.rows, 1);
                assert_eq!(ca.cols, 6);
                let expected: Vec<char> = "hello ".chars().collect();
                assert_eq!(ca.data, expected);
            }
            other => panic!("expected char array, got {other:?}"),
        }
    }

    #[test]
    fn lower_char_array_unicode_expansion_extends_width() {
        let data: Vec<char> = vec!['İ', 'A'];
        let array = CharArray::new(data, 1, 2).unwrap();
        let result = lower_builtin(Value::CharArray(array)).expect("lower");
        match result {
            Value::CharArray(ca) => {
                assert_eq!(ca.rows, 1);
                assert_eq!(ca.cols, 3);
                let expected: Vec<char> = vec!['i', '\u{307}', 'a'];
                assert_eq!(ca.data, expected);
            }
            other => panic!("expected char array, got {other:?}"),
        }
    }

    #[test]
    fn lower_cell_array_mixed_content() {
        let cell = CellArray::new(
            vec![
                Value::CharArray(CharArray::new_row("RUN")),
                Value::String("Mat".into()),
            ],
            1,
            2,
        )
        .unwrap();
        let result = lower_builtin(Value::Cell(cell)).expect("lower");
        match result {
            Value::Cell(out) => {
                let first = out.get(0, 0).unwrap();
                let second = out.get(0, 1).unwrap();
                assert_eq!(first, Value::CharArray(CharArray::new_row("run")));
                assert_eq!(second, Value::String("mat".into()));
            }
            other => panic!("expected cell array, got {other:?}"),
        }
    }

    #[test]
    fn lower_errors_on_invalid_input() {
        let err = lower_builtin(Value::Num(1.0)).unwrap_err();
        assert_eq!(err, ARG_TYPE_ERROR);
    }

    #[test]
    fn lower_cell_errors_on_invalid_element() {
        let cell = CellArray::new(vec![Value::Num(1.0)], 1, 1).unwrap();
        let err = lower_builtin(Value::Cell(cell)).unwrap_err();
        assert_eq!(err, CELL_ELEMENT_ERROR);
    }

    #[test]
    fn lower_preserves_missing_string() {
        let result = lower_builtin(Value::String("<missing>".into())).expect("lower");
        assert_eq!(result, Value::String("<missing>".into()));
    }

    #[test]
    fn lower_cell_allows_empty_char_vector() {
        let empty_char = CharArray::new(Vec::new(), 1, 0).unwrap();
        let cell = CellArray::new(vec![Value::CharArray(empty_char.clone())], 1, 1).unwrap();
        let result = lower_builtin(Value::Cell(cell)).expect("lower");
        match result {
            Value::Cell(out) => {
                let element = out.get(0, 0).unwrap();
                assert_eq!(element, Value::CharArray(empty_char));
            }
            other => panic!("expected cell array, got {other:?}"),
        }
    }

    #[test]
    #[cfg(feature = "wgpu")]
    fn lower_gpu_tensor_input_gathers_then_errors() {
        let _ = runmat_accelerate::backend::wgpu::provider::register_wgpu_provider(
            runmat_accelerate::backend::wgpu::provider::WgpuProviderOptions::default(),
        );
        let provider = runmat_accelerate_api::provider().expect("wgpu provider");
        let data = [1.0f64, 2.0];
        let shape = [2usize, 1usize];
        let handle = provider
            .upload(&runmat_accelerate_api::HostTensorView {
                data: &data,
                shape: &shape,
            })
            .expect("upload");
        let err = lower_builtin(Value::GpuTensor(handle.clone())).unwrap_err();
        assert_eq!(err, ARG_TYPE_ERROR);
        provider.free(&handle).ok();
    }

    #[test]
    #[cfg(feature = "doc_export")]
    fn doc_examples_present() {
        let blocks = test_support::doc_examples(DOC_MD);
        assert!(!blocks.is_empty());
    }
}