//! MATLAB-compatible `strncmp` builtin for RunMat.

use runmat_builtins::Value;
use runmat_macros::runtime_builtin;

use crate::builtins::common::broadcast::{broadcast_index, broadcast_shapes, compute_strides};
use crate::builtins::common::spec::{
    BroadcastSemantics, BuiltinFusionSpec, BuiltinGpuSpec, ConstantStrategy, GpuOpKind,
    ReductionNaN, ResidencyPolicy, ShapeRequirements,
};
use crate::builtins::common::tensor;
use crate::builtins::strings::search::text_utils::{logical_result, TextCollection, TextElement};
#[cfg(feature = "doc_export")]
use crate::register_builtin_doc_text;
use crate::{gather_if_needed, register_builtin_fusion_spec, register_builtin_gpu_spec};

const FN_NAME: &str = "strncmp";

#[cfg(feature = "doc_export")]
pub const DOC_MD: &str = r#"---
title: "strncmp"
category: "strings/core"
keywords: ["strncmp", "string compare", "prefix", "text equality", "case-sensitive"]
summary: "Compare text inputs for equality up to N leading characters with MATLAB-compatible broadcasting."
references:
  - https://www.mathworks.com/help/matlab/ref/strncmp.html
gpu_support:
  elementwise: false
  reduction: false
  precisions: []
  broadcasting: "matlab"
  notes: "Executes on the CPU. GPU-resident inputs are gathered automatically so prefix comparisons match MATLAB exactly."
fusion:
  elementwise: false
  reduction: false
  max_inputs: 3
  constants: "inline"
requires_feature: null
tested:
  unit: "builtins::strings::core::strncmp::tests"
  integration: "builtins::strings::core::strncmp::tests::strncmp_char_array_rows"
---

# What does the `strncmp` function do in MATLAB / RunMat?
`strncmp(A, B, N)` compares text values element-wise and returns logical `true` when the first `N`
characters of the corresponding elements are identical. Comparisons are case-sensitive and respect
MATLAB's implicit expansion rules across strings, character arrays, and cell arrays of character vectors.

## How does the `strncmp` function behave in MATLAB / RunMat?
- **Accepted text types**: String scalars/arrays, character vectors or character arrays, and cell arrays of character vectors.
- **Scalar `N` requirement**: The third argument must evaluate to a finite, nonnegative integer scalar. Numeric, logical, and scalar tensor/logical-array values are accepted when they convert cleanly.
- **Implicit expansion**: Scalars expand to match the size of the other operand before comparison.
- **Character arrays**: Rows are treated as independent character vectors. Each row is compared against the other operand and the result is returned as a column vector.
- **Unicode-aware comparisons**: Prefixes are counted in MATLAB characters (Unicode scalar values), so multi-byte UTF-8 sequences are handled transparently.
- **Prefix length semantics**: If `N` is `0`, every comparison evaluates to `true`. If either text element has fewer than `N` characters, the two elements are equal only when they are identical in full: the shorter value must match character for character, and the longer value must not contain any further characters within the first `N` positions.
- **Missing strings**: Any comparison involving a missing string returns `false` unless `N == 0`.
- **Result type**: Scalar comparisons return logical scalars. Array comparisons return logical arrays that follow MATLAB's column-major ordering.

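The prefix-length and Unicode rules in the list above can be sketched in a few lines of Rust. This is a standalone illustration of the documented semantics, not a copy of the RunMat implementation, and the helper name `prefix_matches` is hypothetical.

```rust
/// Hypothetical helper (not part of RunMat's API): reports whether the first
/// `n` characters of `a` and `b` match, counting Unicode scalar values.
fn prefix_matches(a: &str, b: &str, n: usize) -> bool {
    // Taking at most `n` characters from each side and comparing the two
    // sequences covers every case described above:
    // - a mismatch inside the prefix makes the sequences differ,
    // - one side ending early makes the sequences differ in length,
    // - `n == 0` compares two empty sequences, which are always equal.
    a.chars().take(n).eq(b.chars().take(n))
}
```

Under this sketch, `prefix_matches("RunMat", "Runway", 3)` is `true`, `prefix_matches("cat", "cater", 4)` is `false` because `"cater"` still has a fourth character, and `prefix_matches("été", "étude", 2)` is `true` because each `é` counts as one character even though it occupies two bytes in UTF-8.
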
## `strncmp` Function GPU Execution Behaviour
`strncmp` is registered as an acceleration **sink**. When any operand resides on the GPU, RunMat gathers
all inputs back to host memory before performing the comparison so that the behaviour matches MATLAB
exactly. The logical result is always returned on the host. Providers do not need to supply specialised kernels.

## Examples of using the `strncmp` function in MATLAB / RunMat

### Checking whether two strings share a prefix
```matlab
tf = strncmp("RunMat", "Runway", 3);
```
Expected output:
```matlab
tf = logical
   1
```

### Comparing string arrays with implicit expansion
```matlab
names = ["north" "south" "east"];
tf = strncmp(names, "no", 2);
```
Expected output:
```matlab
tf = 1×3 logical array
   1   0   0
```

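The scalar expansion in the example above can be pictured as comparing every element of the array against the same scalar. The Rust sketch below models only that one-dimensional case; the function name `strncmp_expand_scalar` is hypothetical and the general N-dimensional broadcasting rules are not covered.

```rust
/// Hypothetical helper: compares each element of `items` with a single
/// scalar `pattern`, mimicking scalar expansion for a 1-by-N string array.
fn strncmp_expand_scalar(items: &[&str], pattern: &str, n: usize) -> Vec<bool> {
    items
        .iter()
        // The scalar is reused for every element, so the output has the
        // same number of entries as `items`.
        .map(|item| item.chars().take(n).eq(pattern.chars().take(n)))
        .collect()
}
```

Calling `strncmp_expand_scalar(&["north", "south", "east"], "no", 2)` yields `[true, false, false]`, matching the logical row vector shown above.
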
### Comparing rows of a character array
```matlab
animals = char("cat", "camel", "cow");
tf = strncmp(animals, "ca", 2);
```
Expected output:
```matlab
tf = 3×1 logical array
   1
   1
   0
```

### Comparing cell arrays element-wise
```matlab
C1 = {'red', 'green', 'blue'};
C2 = {'rose', 'grey', 'black'};
tf = strncmp(C1, C2, 2);
```
Expected output:
```matlab
tf = 1×3 logical array
   0   1   1
```

### Handling zero-length comparisons
```matlab
tf = strncmp("alpha", "omega", 0);
```
Expected output:
```matlab
tf = logical
   1
```

## GPU residency in RunMat (Do I need `gpuArray`?)
No. If you pass GPU-resident data, RunMat automatically gathers it to host memory before running `strncmp`.
The builtin is an acceleration sink and always returns host logical outputs. Explicit `gpuArray` / `gather`
calls are only required for compatibility with legacy MATLAB workflows.

## FAQ

### What argument types does `strncmp` accept?
String arrays, character vectors/arrays, and cell arrays of character vectors. Mixed combinations are converted automatically. The third argument `N` must be a nonnegative integer scalar.

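The constraint on `N` can be sketched as a small validation routine. The Rust function below is a simplified, hypothetical model (`parse_n` and its error messages are illustrative names, not RunMat's) that handles only a numeric scalar passed as a double.

```rust
/// Hypothetical validator: converts a numeric `N` into a character count,
/// rejecting NaN, infinities, negative values, and non-integers.
fn parse_n(value: f64) -> Result<usize, String> {
    if !value.is_finite() {
        return Err("N must be finite".to_string());
    }
    if value < 0.0 {
        return Err("N must be nonnegative".to_string());
    }
    if value.fract() != 0.0 {
        return Err("N must be an integer".to_string());
    }
    // Float-to-integer `as` casts saturate in Rust, so an extremely large
    // value clamps to `usize::MAX` instead of wrapping around.
    Ok(value as usize)
}
```

With this sketch, `parse_n(3.0)` returns `Ok(3)`, while `parse_n(-1.0)`, `parse_n(2.5)`, and `parse_n(f64::NAN)` all return an error.
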
### Is the comparison case-sensitive?
Yes. Use `strncmpi` if you need a case-insensitive prefix comparison.

### What happens when `N` is zero?
The builtin returns `true` for every element because zero leading characters are compared.

### How are shorter strings handled when `N` is larger than their length?
The shorter value must match the longer value exactly for its entire length, and the longer value must not have additional characters within the first `N` positions. Otherwise the comparison returns `false`.

### How are missing string values treated?
Any comparison that involves a missing string returns `false`, except when `N` is zero (because no characters are compared).

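The missing-value rule can be modelled by treating each element as optional. In the Rust sketch below, `None` stands in for a missing string; the function name `strncmp_with_missing` and the use of `Option` are assumptions made for illustration only.

```rust
/// Hypothetical model of the missing-string rule: `None` represents a
/// missing string, `Some(text)` an ordinary value.
fn strncmp_with_missing(a: Option<&str>, b: Option<&str>, n: usize) -> bool {
    if n == 0 {
        // No characters are compared, so even missing values "match".
        return true;
    }
    match (a, b) {
        (Some(x), Some(y)) => x.chars().take(n).eq(y.chars().take(n)),
        // A missing operand on either side makes the comparison false.
        _ => false,
    }
}
```

Here `strncmp_with_missing(None, Some("val"), 3)` is `false`, whereas `strncmp_with_missing(None, Some("anything"), 0)` is `true`.
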
### Does `strncmp` produce logical results?
Yes. Scalar comparisons yield logical scalars; array inputs produce logical arrays that follow MATLAB’s column-major ordering.

## See Also
[strcmp](./strcmp), [strcmpi](./strcmpi), [contains](../../search/contains), [startswith](../../search/startswith), [strlength](./strlength)

## Source & Feedback
- Implementation: [`crates/runmat-runtime/src/builtins/strings/core/strncmp.rs`](https://github.com/runmat-org/runmat/blob/main/crates/runmat-runtime/src/builtins/strings/core/strncmp.rs)
- Found a bug? Please [open an issue](https://github.com/runmat-org/runmat/issues/new/choose) with a minimal reproduction.
"#;

pub const GPU_SPEC: BuiltinGpuSpec = BuiltinGpuSpec {
    name: "strncmp",
    op_kind: GpuOpKind::Custom("string-prefix-compare"),
    supported_precisions: &[],
    broadcast: BroadcastSemantics::Matlab,
    provider_hooks: &[],
    constant_strategy: ConstantStrategy::InlineLiteral,
    residency: ResidencyPolicy::GatherImmediately,
    nan_mode: ReductionNaN::Include,
    two_pass_threshold: None,
    workgroup_size: None,
    accepts_nan_mode: false,
    notes: "Performs host-side prefix comparisons; GPU inputs are gathered before evaluation.",
};

register_builtin_gpu_spec!(GPU_SPEC);

pub const FUSION_SPEC: BuiltinFusionSpec = BuiltinFusionSpec {
    name: "strncmp",
    shape: ShapeRequirements::Any,
    constant_strategy: ConstantStrategy::InlineLiteral,
    elementwise: None,
    reduction: None,
    emits_nan: false,
    notes: "Produces logical host results and is not eligible for GPU fusion.",
};

register_builtin_fusion_spec!(FUSION_SPEC);

#[cfg(feature = "doc_export")]
register_builtin_doc_text!("strncmp", DOC_MD);

#[runtime_builtin(
    name = "strncmp",
    category = "strings/core",
    summary = "Compare text inputs for equality up to N leading characters (case-sensitive).",
    keywords = "strncmp,string compare,prefix,text equality",
    accel = "sink"
)]
fn strncmp_builtin(a: Value, b: Value, n: Value) -> Result<Value, String> {
    // `strncmp` is an acceleration sink: gather any GPU-resident inputs to the host first.
    let a = gather_if_needed(&a).map_err(|e| format!("{FN_NAME}: {e}"))?;
    let b = gather_if_needed(&b).map_err(|e| format!("{FN_NAME}: {e}"))?;
    let n = gather_if_needed(&n).map_err(|e| format!("{FN_NAME}: {e}"))?;

    let limit = parse_prefix_length(n)?;
    let left = TextCollection::from_argument(FN_NAME, a, "first argument")?;
    let right = TextCollection::from_argument(FN_NAME, b, "second argument")?;
    evaluate_strncmp(&left, &right, limit)
}

/// Broadcasts `left` against `right` and compares each pair of elements over
/// their first `limit` characters, returning a logical result.
fn evaluate_strncmp(
    left: &TextCollection,
    right: &TextCollection,
    limit: usize,
) -> Result<Value, String> {
    let shape = broadcast_shapes(FN_NAME, &left.shape, &right.shape)?;
    let total = tensor::element_count(&shape);
    // An empty broadcast shape produces an empty logical result.
    if total == 0 {
        return logical_result(FN_NAME, Vec::new(), shape);
    }

    let left_strides = compute_strides(&left.shape);
    let right_strides = compute_strides(&right.shape);
    let mut data = Vec::with_capacity(total);

    for linear in 0..total {
        // Map the linear output index back to the matching element of each operand.
        let li = broadcast_index(linear, &shape, &left.shape, &left_strides);
        let ri = broadcast_index(linear, &shape, &right.shape, &right_strides);
        // With a zero-length prefix nothing is compared, so even missing values match.
        let equal = if limit == 0 {
            true
        } else {
            match (&left.elements[li], &right.elements[ri]) {
                (TextElement::Missing, _) | (_, TextElement::Missing) => false,
                (TextElement::Text(lhs), TextElement::Text(rhs)) => prefix_equal(lhs, rhs, limit),
            }
        };
        data.push(if equal { 1 } else { 0 });
    }

    logical_result(FN_NAME, data, shape)
}

/// Returns `true` when the first `limit` characters (Unicode scalar values) of
/// `lhs` and `rhs` match.
///
/// If one value ends before `limit` characters while the other continues, the
/// result is `false`; if both end at the same point without a mismatch, the
/// result is `true`.
fn prefix_equal(lhs: &str, rhs: &str, limit: usize) -> bool {
    if limit == 0 {
        return true;
    }
    let mut lhs_iter = lhs.chars();
    let mut rhs_iter = rhs.chars();
    let mut compared = 0usize;

    while compared < limit {
        let left_char = lhs_iter.next();
        let right_char = rhs_iter.next();
        match (left_char, right_char) {
            (Some(lc), Some(rc)) => {
                if lc != rc {
                    return false;
                }
            }
            // Exactly one side ran out of characters inside the prefix.
            (None, Some(_)) | (Some(_), None) => {
                return false;
            }
            // Both sides ended together, so they are identical in full.
            (None, None) => {
                return true;
            }
        }
        compared += 1;
    }

    true
}

/// Converts the third argument into a character count, accepting integer,
/// double, and logical scalars (including one-element tensors and logical arrays).
fn parse_prefix_length(value: Value) -> Result<usize, String> {
    match value {
        Value::Int(i) => {
            let raw = i.to_i64();
            if raw < 0 {
                return Err(format!(
                    "{FN_NAME}: prefix length must be a nonnegative integer"
                ));
            }
            Ok(raw as usize)
        }
        Value::Num(n) => parse_prefix_length_from_float(n),
        Value::Bool(b) => Ok(if b { 1 } else { 0 }),
        Value::Tensor(tensor) => {
            if tensor.data.len() != 1 {
                return Err(format!(
                    "{FN_NAME}: prefix length must be a nonnegative integer scalar"
                ));
            }
            parse_prefix_length_from_float(tensor.data[0])
        }
        Value::LogicalArray(array) => {
            if array.data.len() != 1 {
                return Err(format!(
                    "{FN_NAME}: prefix length must be a nonnegative integer scalar"
                ));
            }
            Ok(if array.data[0] != 0 { 1 } else { 0 })
        }
        other => Err(format!(
            "{FN_NAME}: prefix length must be a nonnegative integer scalar, received {other:?}"
        )),
    }
}

/// Validates that `value` is a finite, nonnegative, integer-valued double and
/// converts it to `usize`.
fn parse_prefix_length_from_float(value: f64) -> Result<usize, String> {
    if !value.is_finite() {
        return Err(format!(
            "{FN_NAME}: prefix length must be a finite nonnegative integer"
        ));
    }
    if value < 0.0 {
        return Err(format!(
            "{FN_NAME}: prefix length must be a nonnegative integer"
        ));
    }
    let rounded = value.round();
    if (rounded - value).abs() > f64::EPSILON {
        return Err(format!(
            "{FN_NAME}: prefix length must be a nonnegative integer"
        ));
    }
    if rounded > (usize::MAX as f64) {
        return Err(format!(
            "{FN_NAME}: prefix length exceeds the maximum supported size"
        ));
    }
    Ok(rounded as usize)
}

#[cfg(test)]
mod tests {
    use super::*;
    #[cfg(feature = "doc_export")]
    use crate::builtins::common::test_support;
    #[cfg(feature = "wgpu")]
    use runmat_accelerate_api::AccelProvider;
    use runmat_builtins::{CellArray, CharArray, IntValue, LogicalArray, StringArray, Tensor};

    #[test]
    fn strncmp_exact_prefix_true() {
        let result = strncmp_builtin(
            Value::String("RunMat".into()),
            Value::String("Runway".into()),
            Value::Int(IntValue::I32(3)),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
    }

    #[test]
    fn strncmp_mismatch_within_prefix_false() {
        let result = strncmp_builtin(
            Value::String("RunMat".into()),
            Value::String("Runway".into()),
            Value::Int(IntValue::I32(4)),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(false));
    }

    #[test]
    fn strncmp_longer_string_after_prefix_false() {
        let result = strncmp_builtin(
            Value::String("cat".into()),
            Value::String("cater".into()),
            Value::Int(IntValue::I32(4)),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(false));
    }

    #[test]
    fn strncmp_zero_length_always_true() {
        let result = strncmp_builtin(
            Value::String("alpha".into()),
            Value::String("omega".into()),
            Value::Num(0.0),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
    }

    #[test]
    fn strncmp_prefix_length_bool_true_compares_first_character() {
        let result = strncmp_builtin(
            Value::String("alpha".into()),
            Value::String("array".into()),
            Value::Bool(true),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
    }

    #[test]
    fn strncmp_prefix_length_bool_false_treated_as_zero() {
        let result = strncmp_builtin(
            Value::String("alpha".into()),
            Value::String("omega".into()),
            Value::Bool(false),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
    }

    #[test]
    fn strncmp_prefix_length_logical_array_scalar() {
        let logical = LogicalArray::new(vec![1], vec![1]).unwrap();
        let result = strncmp_builtin(
            Value::String("beta".into()),
            Value::String("theta".into()),
            Value::LogicalArray(logical),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(false));
    }

    #[test]
    fn strncmp_prefix_length_tensor_scalar_double() {
        let limit = Tensor::new(vec![2.0], vec![1, 1]).unwrap();
        let result = strncmp_builtin(
            Value::String("gamma".into()),
            Value::String("gamut".into()),
            Value::Tensor(limit),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
    }

    #[test]
    fn strncmp_char_array_rows() {
        let chars = CharArray::new(
            vec![
                'c', 'a', 't', ' ', ' ', 'c', 'a', 'm', 'e', 'l', 'c', 'o', 'w', ' ', ' ',
            ],
            3,
            5,
        )
        .unwrap();
        let result = strncmp_builtin(
            Value::CharArray(chars),
            Value::String("ca".into()),
            Value::Int(IntValue::I32(2)),
        )
        .expect("strncmp");
        let expected = LogicalArray::new(vec![1, 1, 0], vec![3, 1]).unwrap();
        assert_eq!(result, Value::LogicalArray(expected));
    }

    #[test]
    fn strncmp_cell_arrays_broadcast() {
        let left = CellArray::new(
            vec![
                Value::from("red"),
                Value::from("green"),
                Value::from("blue"),
            ],
            1,
            3,
        )
        .unwrap();
        let right = CellArray::new(
            vec![
                Value::from("rose"),
                Value::from("gray"),
                Value::from("black"),
            ],
            1,
            3,
        )
        .unwrap();
        let result = strncmp_builtin(
            Value::Cell(left),
            Value::Cell(right),
            Value::Int(IntValue::I32(2)),
        )
        .expect("strncmp");
        let expected = LogicalArray::new(vec![0, 1, 1], vec![1, 3]).unwrap();
        assert_eq!(result, Value::LogicalArray(expected));
    }

    #[test]
    fn strncmp_string_array_broadcast_scalar() {
        let strings = StringArray::new(
            vec!["north".into(), "south".into(), "east".into()],
            vec![1, 3],
        )
        .unwrap();
        let result = strncmp_builtin(
            Value::StringArray(strings),
            Value::String("no".into()),
            Value::Int(IntValue::I32(2)),
        )
        .expect("strncmp");
        let expected = LogicalArray::new(vec![1, 0, 0], vec![1, 3]).unwrap();
        assert_eq!(result, Value::LogicalArray(expected));
    }

    #[test]
    fn strncmp_missing_string_false_when_prefix_positive() {
        let strings =
            StringArray::new(vec!["<missing>".into(), "value".into()], vec![1, 2]).unwrap();
        let result = strncmp_builtin(
            Value::StringArray(strings),
            Value::String("val".into()),
            Value::Int(IntValue::I32(3)),
        )
        .expect("strncmp");
        let expected = LogicalArray::new(vec![0, 1], vec![1, 2]).unwrap();
        assert_eq!(result, Value::LogicalArray(expected));
    }

    #[test]
    fn strncmp_missing_zero_length_true() {
        let strings = StringArray::new(vec!["<missing>".into()], vec![1, 1]).unwrap();
        let result = strncmp_builtin(
            Value::StringArray(strings),
            Value::String("anything".into()),
            Value::Int(IntValue::I32(0)),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
    }

    #[test]
    fn strncmp_size_mismatch_error() {
        let left = StringArray::new(vec!["a".into(), "b".into()], vec![2, 1]).unwrap();
        let right = StringArray::new(vec!["a".into(), "b".into(), "c".into()], vec![3, 1]).unwrap();
        let err = strncmp_builtin(
            Value::StringArray(left),
            Value::StringArray(right),
            Value::Int(IntValue::I32(1)),
        )
        .expect_err("size mismatch");
        assert!(err.contains("size mismatch"));
    }

    #[test]
    fn strncmp_invalid_length_type_errors() {
        let err = strncmp_builtin(
            Value::String("abc".into()),
            Value::String("abc".into()),
            Value::String("3".into()),
        )
        .expect_err("invalid prefix length");
        assert!(err.contains("prefix length"));
    }

    #[test]
    fn strncmp_negative_length_errors() {
        let err = strncmp_builtin(
            Value::String("abc".into()),
            Value::String("abc".into()),
            Value::Num(-1.0),
        )
        .expect_err("negative length");
        assert!(err.to_ascii_lowercase().contains("nonnegative"));
    }

    #[test]
    #[cfg(feature = "wgpu")]
    fn strncmp_prefix_length_from_gpu_tensor() {
        use runmat_accelerate::backend::wgpu::provider::{
            register_wgpu_provider, WgpuProviderOptions,
        };
        use runmat_accelerate_api::HostTensorView;

        let provider = match register_wgpu_provider(WgpuProviderOptions::default()) {
            Ok(provider) => provider,
            Err(_) => return,
        };
        let tensor = Tensor::new(vec![3.0], vec![1, 1]).unwrap();
        let view = HostTensorView {
            data: &tensor.data,
            shape: &tensor.shape,
        };
        let handle = provider.upload(&view).expect("upload prefix length to GPU");
        let result = strncmp_builtin(
            Value::String("delta".into()),
            Value::String("deluge".into()),
            Value::GpuTensor(handle.clone()),
        )
        .expect("strncmp");
        assert_eq!(result, Value::Bool(true));
        let _ = provider.free(&handle);
    }

    #[test]
    #[cfg(feature = "doc_export")]
    fn doc_examples_present() {
        let blocks = test_support::doc_examples(DOC_MD);
        assert!(!blocks.is_empty());
    }
}