runmat-runtime 0.4.1

{
  "title": "corrcoef",
  "category": "stats/summary",
  "keywords": [
    "corrcoef",
    "correlation",
    "statistics",
    "rows",
    "normalization",
    "gpu"
  ],
  "summary": "Compute Pearson correlation coefficients for the columns of matrices or paired data sets.",
  "references": [
    "https://www.mathworks.com/help/matlab/ref/corrcoef.html"
  ],
  "gpu_support": {
    "elementwise": false,
    "reduction": false,
    "precisions": [
      "f32",
      "f64"
    ],
    "broadcasting": "none",
    "notes": "Runs entirely on the GPU when rows='all' and the provider exposes the corrcoef hook; other modes transparently gather to host."
  },
  "fusion": {
    "elementwise": false,
    "reduction": false,
    "max_inputs": 2,
    "constants": "inline"
  },
  "requires_feature": null,
  "tested": {
    "unit": "builtins::stats::summary::corrcoef::tests",
    "integration": "builtins::stats::summary::corrcoef::tests::corrcoef_gpu_roundtrip",
    "gpu": "builtins::stats::summary::corrcoef::tests::corrcoef_wgpu_matches_cpu"
  },
  "description": "`corrcoef` returns the Pearson correlation coefficients between pairs of variables. Given a matrix, every column is treated as a separate variable and every row is an observation. You can also pass two matching data sets and RunMat will concatenate them column-wise, mirroring MATLAB.",
  "behaviors": [
    "`corrcoef(X)` returns a square matrix whose `(i, j)` entry is the correlation between column `i` and column `j` of `X`.",
    "`corrcoef(X, Y)` concatenates the columns of `X` and `Y` (they must share the same number of rows) before computing correlations.",
    "An optional normalisation flag allows you to divide by `N - 1` (default, unbiased) or `N` (`flag = 1`, biased) just like MATLAB.",
    "Use `'rows'` with `'all'`, `'complete'`, or `'pairwise'` to control how rows containing `NaN` or `Inf` are handled.",
    "Columns with zero variance produce `NaN` on both the diagonal and in every correlation that references them, matching MATLAB semantics."
  ],
  "examples": [
    {
      "description": "Calculating correlation between matrix columns",
      "input": "A = [1 2 4; 2 4 1; 3 6 -1; 4 8 0];\nR = corrcoef(A)",
      "output": "R =\n    1.0000    1.0000   -0.8367\n    1.0000    1.0000   -0.8367\n   -0.8367   -0.8367    1.0000"
    },
    {
      "description": "Correlating two separate data sets",
      "input": "height = [1.72 1.84 1.65 1.91]';\nweight = [68.5 83.0 59.1 92.2]';\nR = corrcoef(height, weight)",
      "output": "R =\n    1.0000    0.9998\n    0.9998    1.0000"
    },
    {
      "description": "Ignoring rows that contain missing values",
      "input": "X = [1 NaN  2;\n     3  4  5;\n     6  7 NaN];\nR = corrcoef(X, 'rows', 'complete')",
      "output": "R =\n   NaN   NaN   NaN\n   NaN   NaN   NaN\n   NaN   NaN   NaN"
    },
    {
      "description": "Pairwise correlation with staggered NaNs",
      "input": "X = [ 1  2  3;\n     NaN 5  1;\n      4 NaN 6;\n      5  8 NaN];\nR = corrcoef(X, 'rows', 'pairwise')",
      "output": "R =\n     1     1     1\n     1     1    -1\n     1    -1     1"
    },
    {
      "description": "Using biased normalisation (`flag = 1`)",
      "input": "A = [1 2; 3 4; 5 6];\nR = corrcoef(A, 1)",
      "output": "R =\n    1.0000    1.0000\n    1.0000    1.0000"
    },
    {
      "description": "Running `corrcoef` on `gpuArray` inputs",
      "input": "G = gpuArray([1 2 4; 2 4 1; 3 6 -1; 4 8 0]);\nR = corrcoef(G);\nR_host = gather(R)"
    }
  ],
  "faqs": [
    {
      "question": "Does `corrcoef` support two outputs like MATLAB?",
      "answer": "RunMat currently returns the correlation matrix (`R`) and omits the optional p-value output. The statistical distribution helpers needed for the p-values are on the roadmap."
    },
    {
      "question": "How does the normalisation flag affect the result?",
      "answer": "`flag = 0` (default) divides by `N - 1` for an unbiased estimate. `flag = 1` divides by `N`, producing biased estimates that match MATLAB. The choice does not change perfect correlations because both variance and covariance scale by the same factor."
    },
    {
      "question": "What happens when a column is constant?",
      "answer": "Columns with zero variance produce `NaN` on the diagonal and in any correlation that references that column. MATLAB behaves the same way because the standard deviation is zero."
    },
    {
      "question": "Which rows are removed by `'rows','complete'`?",
      "answer": "All rows that contain any `NaN` or `Inf` values are discarded before computing the correlation matrix. `'rows','pairwise'` performs this filtering separately for each column pair."
    },
    {
      "question": "Can I call `corrcoef` on logical arrays?",
      "answer": "Yes. Logical inputs are promoted to double precision (`true -> 1.0`, `false -> 0.0`) before the correlation matrix is computed."
    },
    {
      "question": "What does corrcoef do in MATLAB?",
      "answer": "`corrcoef(X)` returns the matrix of correlation coefficients for the columns of `X`. Each entry `R(i,j)` is the Pearson linear correlation between columns `i` and `j`, ranging from -1 to 1."
    },
    {
      "question": "How is corrcoef different from cov in MATLAB?",
      "answer": "`corrcoef` returns normalized correlation coefficients (always between -1 and 1), while `cov` returns the covariance matrix (in the original units squared). `corrcoef(X)` equals `cov(X) ./ (std(X)' * std(X))`."
    },
    {
      "question": "Can I compute the correlation between two vectors with corrcoef?",
      "answer": "Yes. `corrcoef(x, y)` treats `x` and `y` as two columns and returns a 2×2 correlation matrix. The off-diagonal element is the correlation coefficient between `x` and `y`."
    },
    {
      "question": "Does corrcoef handle missing data?",
      "answer": "Use `corrcoef(X, 'rows', 'complete')` to exclude any row that contains a `NaN` in any column, or `'pairwise'` to exclude NaN rows separately for each pair of columns."
    }
  ],
  "links": [
    {
      "label": "mean",
      "url": "./mean"
    },
    {
      "label": "sum",
      "url": "./sum"
    },
    {
      "label": "histcounts",
      "url": "./histcounts"
    },
    {
      "label": "gpuArray",
      "url": "./gpuarray"
    },
    {
      "label": "gather",
      "url": "./gather"
    },
    {
      "label": "cov",
      "url": "./cov"
    }
  ],
  "source": {
    "label": "`crates/runmat-runtime/src/builtins/stats/summary/corrcoef.rs`",
    "url": "https://github.com/runmat-org/runmat/blob/main/crates/runmat-runtime/src/builtins/stats/summary/corrcoef.rs"
  },
  "gpu_residency": "You usually do **not** need to call `gpuArray` explicitly. When you write expressions such as `corrcoef(sin(X))`, the fusion planner keeps the intermediate residency on the GPU as long as the active provider exposes the required hooks (for `corrcoef`, that means the custom provider kernel and `rows='all'`). RunMat still honours MATLAB semantics, so explicitly calling `gpuArray` remains useful for compatibility and for forcing GPU residency when you are unsure whether the planner will do so.",
  "gpu_behavior": [
    "When the input data resides on the GPU, RunMat asks the active acceleration provider to execute the correlation directly on the device whenever:\n\n1. All inputs are `gpuArray` values; 2. The `'rows'` option is `'all'`; and 3. The provider exposes the custom `corrcoef` hook (the WGPU provider does).\n\nIf any of these conditions is not met, the builtin gathers the data to host memory, computes the correlation matrix on the CPU reference path, and returns a dense host tensor."
  ]
}