ndatafusion 0.0.1-reserve.0

Extensions and support for linear algebra in DataFusion
# Exercises

Small copy-paste queries for getting comfortable with `ndatafusion`.

These exercises build their inputs with the `make_*` constructor functions, so each query can be
pasted directly into SQL without any preloaded Arrow extension columns. If your data source already
emits canonical Arrow values, you can usually drop the constructor call and pass the column directly.

Use the same Rust harness from [README.md](README.md), replacing only the SQL string.

## 1. Warm Up: Dot Product

Two vectors. One scalar answer.

```sql
SELECT vector_dot(
    make_vector([3.0, 4.0], 2),
    make_vector([4.0, 0.0], 2)
) AS dot
```

Expected result:

```text
dot = 12.0
```
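To check the expected value by hand, here is a minimal plain-Rust sketch of the same arithmetic. The `dot` helper is hypothetical and is not part of `ndatafusion`; it only mirrors the textbook definition.

```rust
/// Reference dot product: sum of element-wise products.
/// Hypothetical helper for checking the exercise, not an ndatafusion API.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Calling `dot(&[3.0, 4.0], &[4.0, 0.0])` evaluates `3.0 * 4.0 + 4.0 * 0.0`, which is `12.0`.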

## 2. Unit-Normalize An Embedding

Normalize a vector and confirm its length is `1`.

```sql
SELECT vector_l2_norm(
    vector_normalize(make_vector([3.0, 4.0], 2))
) AS unit_norm
```

Expected result:

```text
unit_norm = 1.0
```
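The same check can be sketched in plain Rust. The helper names below are hypothetical, and the zero-vector case is left unhandled because the source does not say how `vector_normalize` treats it.

```rust
/// Euclidean (L2) length of a vector.
/// Hypothetical helper, not an ndatafusion API.
fn l2_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Divide every component by the vector's L2 norm.
/// Assumption: the input is non-zero; a zero vector would produce NaN here.
fn normalize(v: &[f64]) -> Vec<f64> {
    let n = l2_norm(v);
    v.iter().map(|x| x / n).collect()
}
```

For `[3.0, 4.0]` the norm is `5.0`, so the normalized vector is roughly `[0.6, 0.8]` and its norm is `1.0` up to floating-point rounding.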

## 3. Compare Two Embeddings

Cosine similarity and cosine distance are row-wise embedding primitives. The two vectors below are
orthogonal, so the similarity is `0` and the distance is `1`.

```sql
SELECT
    vector_cosine_similarity(
        make_vector([1.0, 0.0, 0.0], 3),
        make_vector([0.0, 1.0, 0.0], 3)
    ) AS similarity,
    vector_cosine_distance(
        make_vector([1.0, 0.0, 0.0], 3),
        make_vector([0.0, 1.0, 0.0], 3)
    ) AS distance
```

Expected result:

```text
similarity = 0.0
distance   = 1.0
```
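A plain-Rust sketch of the two quantities, assuming the common definitions: similarity is the dot product divided by the product of the norms, and distance is one minus similarity. That reading matches the expected result above, but the helper names are hypothetical and the exact conventions of the UDFs are not spelled out in this document.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Hypothetical helper; assumes both vectors are non-zero.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

/// Cosine distance, assumed here to be 1 - similarity.
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    1.0 - cosine_similarity(a, b)
}
```

For the orthogonal unit vectors in the query, the dot product is `0.0`, so the similarity is `0.0` and the distance is `1.0`.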

## 4. Matrix Times Vector

Take a small dense matrix and multiply it by a vector.

```sql
SELECT matrix_matvec(
    make_matrix([2.0, 0.0, 0.0, 1.0], 2, 2),
    make_vector([4.0, 3.0], 2)
) AS product
```

Expected result:

```text
product = [8.0, 3.0]
```
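As a rough cross-check, the product can be reproduced in plain Rust. This sketch assumes the flat list passed to `make_matrix` is read in row-major order; the diagonal matrix in this exercise gives the same answer either way, so the exercise itself does not confirm that layout. The `matvec` helper is hypothetical.

```rust
/// Dense matrix-vector product over a flat, row-major value buffer.
/// Hypothetical helper; the row-major layout is an assumption.
fn matvec(values: &[f64], rows: usize, cols: usize, x: &[f64]) -> Vec<f64> {
    assert!(cols > 0, "matrix must have at least one column");
    assert_eq!(values.len(), rows * cols, "values must hold rows * cols entries");
    assert_eq!(x.len(), cols, "vector length must match the column count");
    values
        .chunks(cols) // one chunk per matrix row
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f64>())
        .collect()
}
```

With `values = [2.0, 0.0, 0.0, 1.0]`, `rows = 2`, `cols = 2`, and `x = [4.0, 3.0]`, the rows contribute `2.0 * 4.0` and `1.0 * 3.0`, giving `[8.0, 3.0]`.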

## 5. Determinant In One Line

```sql
SELECT matrix_determinant(
    make_matrix([9.0, 0.0, 0.0, 4.0], 2, 2)
) AS det
```

Expected result:

```text
det = 36.0
```
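For a `2 x 2` matrix the determinant is `a * d - b * c`, which is easy to verify in plain Rust. The `det_2x2` helper below is hypothetical and only covers this size.

```rust
/// Determinant of a 2 x 2 matrix given as [a, b, c, d].
/// Hypothetical helper, not an ndatafusion API.
fn det_2x2(m: [f64; 4]) -> f64 {
    let [a, b, c, d] = m;
    a * d - b * c
}
```

For `[9.0, 0.0, 0.0, 4.0]` this is `9.0 * 4.0 - 0.0 * 0.0`, which is `36.0`.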

## 6. Peek Inside QR

Some UDFs return structs. `matrix_qr` returns one with the fields `q`, `r`, and `rank`; the query
below selects only `rank`.

```sql
SELECT qr.rank
FROM (
    SELECT matrix_qr(
        make_matrix([1.0, 2.0, 3.0, 4.0], 2, 2)
    ) AS qr
) AS t
```

Expected result:

```text
rank = 2
```
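To see why the rank is `2`, here is a small classical Gram-Schmidt sketch for the `2 x 2` case. It is not how `matrix_qr` is implemented (the source does not say which algorithm it uses); the helper name, the tolerance parameter, and the row-major reading of the input are all assumptions made for illustration.

```rust
/// Classical Gram-Schmidt on a 2 x 2 matrix given as a flat buffer.
/// Returns the diagonal of R and the count of diagonal entries above `tol`.
/// Hypothetical helper; row-major input is an assumption.
fn qr_rank_2x2(m: [f64; 4], tol: f64) -> ([f64; 2], usize) {
    // Split the matrix into its two columns.
    let a1 = [m[0], m[2]];
    let a2 = [m[1], m[3]];

    // First column: r11 is its length, q1 its direction.
    let r11 = (a1[0] * a1[0] + a1[1] * a1[1]).sqrt();
    let q1 = if r11 > tol {
        [a1[0] / r11, a1[1] / r11]
    } else {
        [0.0, 0.0]
    };

    // Remove the q1 component from the second column; the remainder has length r22.
    let r12 = q1[0] * a2[0] + q1[1] * a2[1];
    let u2 = [a2[0] - r12 * q1[0], a2[1] - r12 * q1[1]];
    let r22 = (u2[0] * u2[0] + u2[1] * u2[1]).sqrt();

    // Each diagonal entry above the tolerance marks an independent column.
    let rank = [r11, r22].iter().filter(|&&r| r > tol).count();
    ([r11, r22], rank)
}
```

For `[1.0, 2.0, 3.0, 4.0]` both diagonal entries are clearly non-zero, so the rank is `2`. A matrix with proportional columns, such as `[1.0, 2.0, 2.0, 4.0]`, leaves nothing after the projection step and yields rank `1`.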

## 7. Sparse Matrix Times Dense Vector

Build a CSR sparse batch from parts and multiply it by a dense vector.

```sql
SELECT sparse_matvec(
    make_csr_matrix_batch(
        [[2, 3]],
        [[0, 2, 3]],
        [[0, 2, 1]],
        [[1.0, 2.0, 3.0]]
    ),
    make_variable_tensor([[1.0, 2.0, 3.0]], [[3]], 1)
) AS result
```

Expected result:

```text
result = [7.0, 6.0]
```
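The four constructor arguments appear to be the shape, the row pointers, the column indices, and the stored values, in that order. That reading is inferred from the expected result rather than stated here, so treat it as an assumption. Under it, the CSR product can be sketched in plain Rust with a hypothetical `csr_matvec` helper:

```rust
/// CSR matrix times dense vector.
/// `row_ptrs[r]..row_ptrs[r + 1]` indexes the stored entries of row `r`.
/// Hypothetical helper; the argument meanings are an assumption.
fn csr_matvec(
    rows: usize,
    row_ptrs: &[usize],
    col_indices: &[usize],
    values: &[f64],
    x: &[f64],
) -> Vec<f64> {
    assert_eq!(row_ptrs.len(), rows + 1, "need one row pointer per row plus one");
    assert_eq!(col_indices.len(), values.len(), "one column index per stored value");
    (0..rows)
        .map(|r| {
            (row_ptrs[r]..row_ptrs[r + 1])
                .map(|k| values[k] * x[col_indices[k]])
                .sum::<f64>()
        })
        .collect()
}
```

With row pointers `[0, 2, 3]`, row 0 holds the values `1.0` and `2.0` in columns 0 and 2, and row 1 holds `3.0` in column 1. Multiplying by `[1.0, 2.0, 3.0]` gives `1.0 * 1.0 + 2.0 * 3.0 = 7.0` and `3.0 * 2.0 = 6.0`.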

## 8. Tensor Reduction

Reduce across the last axis.

```sql
SELECT tensor_sum_last_axis(
    make_tensor([1.0, 2.0, 3.0, 4.0], 2, 2)
) AS reduced
```

Expected result:

```text
reduced = [3.0, 7.0]
```
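A plain-Rust sketch of the reduction, assuming the flat values are laid out row-major so that the last axis is contiguous. The `sum_last_axis` helper is hypothetical.

```rust
/// Sum over the last axis of a row-major tensor whose last dimension is `last_dim`.
/// Hypothetical helper; the row-major layout is an assumption.
fn sum_last_axis(values: &[f64], last_dim: usize) -> Vec<f64> {
    assert!(last_dim > 0, "last dimension must be non-zero");
    assert_eq!(values.len() % last_dim, 0, "values must fill whole rows");
    values
        .chunks(last_dim) // each chunk is one run along the last axis
        .map(|run| run.iter().sum::<f64>())
        .collect()
}
```

For `[1.0, 2.0, 3.0, 4.0]` with a last dimension of `2`, the runs are `[1.0, 2.0]` and `[3.0, 4.0]`, which sum to `[3.0, 7.0]`.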

## 9. Tensor Axis Shuffle

Permute a `2 x 3` tensor into a `3 x 2` tensor.

```sql
SELECT tensor_permute_axes(
    make_tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3),
    1,
    0
) AS permuted
```

Expected result:

```text
permuted = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
```
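For a two-axis tensor, the permutation `(1, 0)` is a transpose. The sketch below reproduces it in plain Rust, assuming row-major input; `permute_2d` is a hypothetical helper that handles only this two-axis case.

```rust
/// Swap the two axes of a row-major `rows x cols` tensor.
/// Output element [c][r] is input element [r][c].
/// Hypothetical helper; the row-major layout is an assumption.
fn permute_2d(values: &[f64], rows: usize, cols: usize) -> Vec<Vec<f64>> {
    assert_eq!(values.len(), rows * cols, "values must hold rows * cols entries");
    (0..cols)
        .map(|c| (0..rows).map(|r| values[r * cols + c]).collect())
        .collect()
}
```

Applied to `[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]` with `rows = 2` and `cols = 3`, each output row collects one input column: `[1.0, 4.0]`, `[2.0, 5.0]`, `[3.0, 6.0]`.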

## 10. Tiny PCA Workflow

Fit PCA, then use the returned struct.

```sql
SELECT
    pca.explained_variance_ratio AS explained,
    matrix_pca_transform(matrix, pca) AS scores
FROM (
    SELECT
        make_matrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3, 2) AS matrix,
        matrix_pca(make_matrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3, 2)) AS pca
) AS t
```

This is a good first example of the “struct-returning UDF plus follow-on UDF” style in the
catalog.
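No expected result is listed for this exercise, so a hand calculation can help when reading the output. The sketch below computes explained variance ratios for two-column data from the eigenvalues of the centered scatter matrix. It assumes each row is one sample and each column one feature; the helper name is hypothetical, and the ordering, scaling, and component count that `matrix_pca` actually returns are not specified in this document.

```rust
/// Explained variance ratios for samples with two features.
/// Hypothetical helper; assumes rows are samples and the data is not constant.
fn explained_variance_ratio_2d(rows: &[[f64; 2]]) -> [f64; 2] {
    let n = rows.len() as f64;
    let mean = [
        rows.iter().map(|r| r[0]).sum::<f64>() / n,
        rows.iter().map(|r| r[1]).sum::<f64>() / n,
    ];

    // Centered scatter matrix [[sxx, sxy], [sxy, syy]].
    // Dividing by n - 1 would give the covariance, but the factor cancels in the ratio.
    let (mut sxx, mut sxy, mut syy) = (0.0_f64, 0.0_f64, 0.0_f64);
    for r in rows {
        let (dx, dy) = (r[0] - mean[0], r[1] - mean[1]);
        sxx += dx * dx;
        sxy += dx * dy;
        syy += dy * dy;
    }

    // Closed-form eigenvalues of a symmetric 2 x 2 matrix, largest first.
    let half_trace = (sxx + syy) / 2.0;
    let disc = (((sxx - syy) / 2.0).powi(2) + sxy * sxy).sqrt();
    let (l1, l2) = (half_trace + disc, half_trace - disc);
    let total = l1 + l2;
    [l1 / total, l2 / total]
}
```

For the rows `[1.0, 2.0]`, `[3.0, 4.0]`, `[5.0, 6.0]`, the centered points all lie on one line, so this calculation puts all of the variance on the first component and none on the second.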

## Direct Arrow Inputs

If your upstream source already produces the right Arrow contract, constructors are not required.

Examples:

- Dense vectors: `FixedSizeList<Float32|Float64>(D)` can go straight into `vector_dot`,
  `vector_l2_norm`, `vector_cosine_similarity`, and `vector_normalize`.
- Dense matrices and fixed-shape tensors: `arrow.fixed_shape_tensor` can go straight into matrix
  and tensor UDFs.
- Variable-shape tensors: `arrow.variable_shape_tensor` can go straight into the variable tensor
  UDF family.
- Sparse batches: `ndarrow.csr_matrix_batch` can go straight into `sparse_matvec`,
  `sparse_matmat_dense`, `sparse_matmat_sparse`, `sparse_transpose`, and `sparse_lu_solve`.