# Design of Tensoratu
Tensoratu aims to provide orthogonal, composable and scalable abstractions
for working with tensors and tensor networks.
## Terminology Conventions
In accordance with what seems to be the
[prevalent convention](https://tensornetwork.org/contribute/conventions.html)
in the tensor network community, we employ the term **dimension** in the way it is used
[outside of pure mathematics](https://en.wikipedia.org/wiki/Dimension#In_mathematics):
A 3-by-5 matrix has two dimensions, one is 3, the other is 5.
Its number of dimensions is 2.
## Hybrid indexing
One of the biggest challenges when dealing with large networks of tensors is organizing the huge number of indices in a way that facilitates the implementation of complicated algorithms.
Tensoratu’s hallmark, hybrid indexing, is designed to support the library user in this task.
Most linear algebra libraries and some tensor libraries identify the indices of a tensor by their ordinal number.
One tensor, for example, could have three indices numbered 0, 1, and 2.
Another tensor could be indexed in the same way.
Note that index 1 of tensor 0 is different from index 1 of tensor 1.
The (outer) product of both tensors will be a six-dimensional tensor with indices numbered 0, 1, ..., 5.
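To illustrate the ordinal approach, here is a quick numpy sketch (numpy is used only as a stand-in; this is not Tensoratu code):

```python
import numpy as np

# Two tensors whose indices are identified only by their ordinal numbers 0, 1, 2.
t0 = np.random.rand(2, 3, 4)
t1 = np.random.rand(5, 6, 7)

# Their outer product is a six-dimensional tensor with indices numbered 0, 1, ..., 5.
p = np.tensordot(t0, t1, axes=0)
print(p.shape)   # (2, 3, 4, 5, 6, 7)
```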
The strength of this approach is that it is simple yet very flexible.
It is useful when all indices are similar and have some natural order.
However, it is easy to get lost when dozens or even thousands of tensors are involved.
To counteract this problem,
other libraries, notably ITensor,
introduce dedicated “index objects” instead of integers.
One tensor, for example, could feature the indices `a`, `b`, and `c`.
Another one could feature `c`, `d`, and `e`.
The possibility of explicitly sharing some indices but not others makes it possible to adopt the Einstein summation convention.
Here, the index `c` is explicitly shared among the two tensors.
Thus the product of the two tensors is indexed by `a`, `b`, `d`, and `e`,
with `c` having been contracted.
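The same computation can be mimicked in numpy's einsum notation, with the letters standing in for the index objects (an illustration only, not Tensoratu code):

```python
import numpy as np

T1 = np.random.rand(2, 3, 4)   # indices a, b, c
T2 = np.random.rand(4, 5, 6)   # indices c, d, e

# Sharing the index `c` means contraction over it (Einstein convention);
# the result is indexed by a, b, d, and e.
prod = np.einsum("abc,cde->abde", T1, T2)
print(prod.shape)              # (2, 3, 5, 6)
```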
At first glance, indexing errors are avoided because the order of indices never plays a role.
But this approach can lead to different kinds of indexing confusion:
- Often several tensors within a tensor network will have a similar or even identical structure and function.
For example, multiple CNOT gates could be present in one tensor network.
It may be helpful to label the indices of each CNOT as `control_in`, `control_out`, `target_in`, and `target_out`.
However, this is not possible if sharing an index means contraction.
- In addition, because contraction requires both tensors to use the same index, `target_out` of one CNOT gate cannot be connected to `control_in` of another CNOT gate:
both would need to carry the same index, which then cannot reflect their function.
- Finally, a large tensor network with n similar indices (a matrix product state, say)
requires many separate indices like `a_0`, `a_1`, ..., `a_n`,
which seems redundant.
Tensoratu tries to combine the benefits of both approaches while avoiding their inconveniences.
Tensor indices in Tensoratu are identified by labels whose identity is tied to a string of characters.
In contrast to ITensor, if the same label occurs multiple times, this does not mean contraction but rather implicit subscripting.
Let’s consider the case where a tensor is labeled by `n`, `s`, `e`, and `w`:
```
    n
    |
w --+-- e
    |
    s
```
Now two instances of this tensor are combined into a network
and the `w` index of the first one is contracted with the `e` index of the second one:
```
    n       n
    |       |
w --+-------+-- e
    |       |
    s       s
```
The resulting network is indexed by the six labels `n`, `n`, `s`, `s`, `e`, and `w`.
Note that the labels `n` and `s` occur twice.
There is no ambiguity, since labels are implicitly subscripted in the order of their appearance.
(This works because the involved tensors occur in a particular order.)
In the above network, the two axes labeled by `n` (or `s`) can only be addressed together.
More often than not this is what is required.
If this is not the case, relabeling functionality allows replacing them with distinct labels.
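To make the implicit subscripting concrete, here is a minimal, self-contained Python sketch; the function and the tuple representation are illustrative assumptions, not Tensoratu's actual API:

```python
def implicit_subscripts(labels):
    """Subscript repeated labels in their order of appearance."""
    counts = {}
    subscripted = []
    for label in labels:
        k = counts.get(label, 0)
        counts[label] = k + 1
        subscripted.append((label, k))
    return subscripted

# Free labels of the two-tensor network above, listed in vertex order
# (the first tensor contributes n, s, e; the second contributes n, s, w):
print(implicit_subscripts(["n", "s", "e", "n", "s", "w"]))
# [('n', 0), ('s', 0), ('e', 0), ('n', 1), ('s', 1), ('w', 0)]
```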
## Basic concepts
Tensoratu is based on the following concepts (in **bold**).
Instances of each concept are values of the language and can be stored in variables.
### Individual tensors
The concepts of this subsection relate to individual tensors,
i.e. they do not yet involve the notion of a tensor network.
- A **tensor** is an n-dimensional array of numbers.
The most common and simple tensors are internally represented by dense or sparse arrays of numbers,
but other variants exist as well.
- Each axis of a tensor is identified by a **label**.
- A **simple label** is uniquely identified by a string of characters.
- A **composite label** combines an ordered set of simple labels into a single one.
It represents the result of a (tensor) product of simple labels.
As such, labels are closed with regard to the product defined on them.
- Each simple label may be annotated by a positive integer describing its **dimension**.
In absence of an annotation, the dimension of a label is unspecified.
(Labels with an unspecified dimension can be useful when parts of a tensor network go through a transformation that preserves their form, but changes the bond dimension.)
- For example,
if there is a label `a` of dimension 2 (values `0, 1`),
and a label `b` of dimension 3 (values `0, 1, 2`),
the composite label `a * b` has dimension 6 (values `0, 1, ..., 5`)
corresponding to `(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)`.
This is the row-major convention (see the sketch after this list).
- The storage order of the axes does not play a role for the use of a tensor.
Each tensor can be seen as an n-d array where each axis is associated with a simple label,
or as a 1-d array associated with a single (often composite) label.
It can also be seen as any combination of the above.
- The dimension of a composite label may be fully specified or not, depending on the dimensions of its constituents.
Dimensions must be specified for enough labels so that there is no ambiguity when tensors are constructed.
For example, when creating a tensor from an array and a label for each axis,
there may be at most one simple label with unknown dimension per axis.
The product of the known dimensions of an axis's simple labels must then be a divisor of that axis's dimension.
- It is perfectly possible for labels with the same name to exist with different dimension annotations.
This poses no problem as long as they are used for different tensors.
Similarly, a label of unknown dimension may be used with a tensor that was constructed with a fixed-dimension label of the same name.
What matters is that enough dimensions are specified and that those that are specified are consistent.
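A plain-Python check of the row-major convention from the composite-label example above (illustrative only):

```python
# Row-major composite index for `a` (dimension 2) and `b` (dimension 3):
# the flat index of the pair (ia, ib) is ia * dim_b + ib.
dim_a, dim_b = 2, 3
pairs = [(ia, ib) for ia in range(dim_a) for ib in range(dim_b)]
for flat, (ia, ib) in enumerate(pairs):
    assert flat == ia * dim_b + ib
print(pairs)   # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```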
### Tensor networks
The concepts of this subsection generalize tensors to tensor networks.
- As stated above, a tensor can be anything that formally corresponds to an array of numbers whose axes are indexed by labels.
- The prime example of a non-trivial general tensor is a **tensor network**,
which itself represents multiple interconnected general tensors.
- A tensor network is stored as an undirected multigraph (a graph that is permitted to have parallel edges), as sketched below.
The **vertices** of the graph are references to the tensors.
Multiple vertices may refer to the same tensor
and thus represent multiple independent instances of the same tensor within a network.
- The vertices that make up a network are ordered.
This is necessary for hybrid indexing.
- Within a network, vertices are connected by **edges**.
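A minimal Python sketch of this structure; all names are purely illustrative, not Tensoratu's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Network:
    vertices: list = field(default_factory=list)  # ordered references to tensors
    edges: list = field(default_factory=list)     # (vertex, label, vertex, label)

    def add_vertex(self, tensor) -> int:
        # The same tensor object may be added twice, yielding two
        # independent instances of it within the network.
        self.vertices.append(tensor)
        return len(self.vertices) - 1

    def connect(self, v1, label1, v2, label2):
        # Parallel edges between the same pair of vertices are allowed.
        self.edges.append((v1, label1, v2, label2))
```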
## Basic operations
The following basic operations do not involve any computations with tensors.
They serve as basic building blocks for many more advanced operations.
### Tensor creation
Although the order of labels that index a tensor does not matter for its use,
the underlying n-d array
(no matter whether dense or sparse)
does have some concrete order of axes.
As written above under “Basic concepts”,
a tensor is constructed from an n-d array
and associated labels: one label for each axis.
Whenever the label of an axis is a composite label,
that label can be split up into a product of simple labels,
each with an associated axis.
This suggests a normalized storage scheme for a tensor:
split up all the labels into simple labels,
and reshape the n-d array into as many axes.
The sequence of simple labels can then be stored as a single composite label.
Other normalized storage schemes are conceivable.
The point is that the creation of a tensor does not involve any reordering of the array values.
The bottom line is:
each tensor can be seen as an n-d array whose axes are each associated with a simple label.
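A minimal numpy sketch of this normalized storage scheme, assuming labels are represented as (name, dimension) tuples (an illustrative assumption, not a settled design):

```python
import numpy as np

def normalize(array, axis_labels):
    """Split every composite label into simple labels and reshape accordingly."""
    simple = [lab for axis in axis_labels for lab in axis]  # flatten the products
    shape = [dim for (_name, dim) in simple]
    # For a contiguous array, reshape returns a view: no values are reordered.
    return array.reshape(shape), simple

data = np.arange(12).reshape(2, 6)  # axis 0: `i`; axis 1: the composite `a*b`
arr, labels = normalize(data, [[("i", 2)], [("a", 2), ("b", 3)]])
print(arr.shape, labels)            # (2, 2, 3) [('i', 2), ('a', 2), ('b', 3)]
```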
### Array views/copies of tensors
The opposite operation to the creation of a tensor is the provision of an array view of its contents.
The user provides a sequence of labels
(that must be made up of exactly the same simple labels as were used for the tensor),
and an array is returned that has as many axes as labels were provided.
Each axis is indexed by the corresponding label.
Two cases must be distinguished:
- **Array view of a tensor**
In some cases it is possible to avoid copying the array values and provide an array view of the tensor’s values.
In particular this is possible whenever a view is requested with the same axes that were given upon tensor creation.
A copy can also be avoided in all other cases,
with the exception of the case described in the next item.
Array views are useful whenever one wants to modify a tensor.
They can also be used as a more efficient substitute for a copy.
- **Array copy of a tensor**
Providing a view is impossible whenever multiple simple labels are combined to label an axis of the desired array and the strides of these labels' axes do not follow one another.
In other words this occurs whenever the data for any desired axis is not already ordered according to the row-major convention.
For example, while it is possible to view an array labeled by `a, b` as `b, a`,
it is not possible to view an array labeled by `a*b` as `b*a`
(both cases are demonstrated in the sketch after this list).
Array copies sometimes cannot be avoided.
See for example the section on tensor contraction below.
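A numpy demonstration of the cases discussed above, with numpy's views and strides standing in for the tensor machinery:

```python
import numpy as np

a, b = 2, 3
t = np.arange(a * b)          # 1-d array labeled by the composite `a*b`

# Viewing `a*b` as the simple labels `a`, `b` is a pure reshape: a view.
v = t.reshape(a, b)
assert np.shares_memory(v, t)

# Viewing `a, b` as `b, a` only permutes strides: still a view.
w = v.transpose(1, 0)
assert np.shares_memory(w, t)

# Viewing `a*b` as `b*a` flattens the transposed axes; their strides are no
# longer row-major, so numpy has to copy.
c = w.reshape(b * a)
assert not np.shares_memory(c, t)
```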
### Tensor relabeling
It is sometimes useful to replace one label in a tensor by another one.
### Flattening a tensor network
While it can be useful to use tensor networks as building blocks for more complicated tensor networks,
some algorithms may require a tensor network that is built up of simple tensors only.
We call this operation the flattening of a tensor network.
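A heavily simplified sketch of flattening, ignoring edges and assuming a nested-list representation of networks (illustrative only):

```python
def flatten(network):
    """Recursively replace sub-networks by their constituent tensors."""
    flat = []
    for component in network:
        if isinstance(component, list):   # a nested tensor network
            flat.extend(flatten(component))
        else:                             # a simple tensor
            flat.append(component)
    return flat

print(flatten(["A", ["B", ["C", "D"]], "E"]))  # ['A', 'B', 'C', 'D', 'E']
```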
## Basic computations
### Tensor contraction
Without loss of generality,
tensor contraction is an operation that combines tensor `A` labeled by `i*j`
and tensor `B` labeled by `k*l`
into one tensor `C` labeled by `i*l`.
The labels `i, j, k, l` are arbitrary and possibly compound.
The dimensions of labels `j` and `k` must match.
`C` is defined such that `C(i, l) = ∑_j A(i, j) * B(k=j, l)`,
where the labels in this expression are to be seen as mathematical variables.
This operation is equivalent to a matrix multiplication after tensor reordering.
Tensor contraction is easy to implement (see the sketch below) if one can build upon
- Array views/copies of tensors
- Matrix product for arrays
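A minimal numpy sketch of this reduction, with compound labels `i = i1*i2` and `l = l1*l2`; the dimensions are arbitrary placeholders:

```python
import numpy as np

i1, i2, j, l1, l2 = 2, 3, 4, 5, 6
A = np.random.rand(i1, i2, j)   # labeled by i*j, with i = i1*i2
B = np.random.rand(j, l1, l2)   # labeled by k*l, with dim(k) == dim(j)

# View both tensors as matrices, multiply, and view the result back:
C = (A.reshape(i1 * i2, j) @ B.reshape(j, l1 * l2)).reshape(i1, i2, l1, l2)
assert np.allclose(C, np.einsum("abj,jcd->abcd", A, B))
```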
### Splitting up a tensor: SVD
### MPS, MPO, PEPS
### MPS/MPO products (“Zipping up”)
## Implementation details (Rust)
- It should be possible to represent simple labels as compact objects (2 machine words) without any allocations, except if the label is long (UTF-8 representation longer than ~12 bytes).
- It should be possible to represent simple indices as compact objects as well.
- Combined labels and indices should have a rather compact representation, but will require memory allocation.
It may be possible to avoid the allocation for a small number of elements with an enum.
## Implementation details (Python)
- Labels might be represented by built-in strings and string-integer tuples.
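For instance, one possible convention (a sketch only, not a settled design):

```python
a = ("a", 2)   # simple label `a`, annotated with dimension 2
b = ("b", 3)   # simple label `b`, annotated with dimension 3
u = "u"        # simple label with unspecified dimension
ab = (a, b)    # composite label `a * b`, of dimension 2 * 3 = 6
```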