 The TT-layer (tensor-train layer) serves as a
 low-rank decomposition of a fully connected layer.

 The inputs are the same as for a fully connected
 layer, but the number of parameters
 is greatly reduced, and the forward computation
 time can drop substantially, especially
 for layers with large weight matrices.

 The multiplication is computed as a
 sequence of products of the input vector with
 each of the cores that make up the TT-layer.

 Given the input sizes (inp_sizes),
 output sizes (out_sizes), and the ranks
 of each of the cores (tt_ranks), the
 i-th core has size:

 inp_sizes[i] * tt_ranks[i] *
 tt_ranks[i + 1] * out_sizes[i].
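
 As a sketch of the parameter savings implied by that core size,
 consider hypothetical sizes (not from this document) that factor a
 1024 x 1024 dense weight matrix:

 ```python
 # Hypothetical factorization of a 1024 x 1024 dense layer.
 inp_sizes = [4, 8, 8, 4]     # prod = 1024 input units
 out_sizes = [4, 8, 8, 4]     # prod = 1024 output units
 tt_ranks  = [1, 8, 8, 8, 1]  # boundary ranks must be 1

 # Sum of per-core sizes: inp_sizes[i] * tt_ranks[i] *
 # tt_ranks[i + 1] * out_sizes[i]
 tt_params = sum(inp_sizes[i] * tt_ranks[i] * tt_ranks[i + 1] * out_sizes[i]
                 for i in range(len(inp_sizes)))
 dense_params = 1024 * 1024

 print(tt_params)     # 128 + 4096 + 4096 + 128 = 8448
 print(dense_params)  # 1048576
 ```

 Here the TT parameterization stores roughly 124x fewer weights than
 the dense matrix it replaces.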

 The computational complexity is
 dictated by inp_sizes, out_sizes,
 and tt_ranks, which control the
 trade-off between the accuracy of
 the low-rank decomposition and the
 speed of the computation.
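 The core-by-core multiplication described above can be sketched in
 NumPy as a sequence of tensor contractions. This is a minimal
 illustration, not this library's implementation; it assumes each core
 is stored with shape (tt_ranks[i], inp_sizes[i], out_sizes[i],
 tt_ranks[i + 1]) and that the boundary ranks are 1:

 ```python
 import numpy as np

 def tt_matvec(cores, x):
     """Multiply a flat input vector x by a TT-decomposed weight matrix.

     cores[i] is assumed to have shape
     (tt_ranks[i], inp_sizes[i], out_sizes[i], tt_ranks[i + 1]),
     with tt_ranks[0] == tt_ranks[-1] == 1.
     """
     inp_sizes = [c.shape[1] for c in cores]
     # Reshape the flat input into one mode per core, with a leading
     # dummy rank axis (tt_ranks[0] == 1).
     v = x.reshape([1] + inp_sizes)
     for core in cores:
         # Contract the current rank axis and input mode of v with the
         # matching axes of the core.
         v = np.tensordot(core, v, axes=([0, 1], [0, 1]))
         # The result has (out mode, next rank, ...); move the out mode
         # to the back so the next rank leads again.
         v = np.moveaxis(v, 0, -1)
     # Trailing rank is 1; flatten the accumulated output modes.
     return v.reshape(-1)
 ```

 The input is never multiplied by the full weight matrix: each step
 touches only one small core, which is where the speedup for large
 weight matrices comes from.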
