| @brief Creates a quantized tensor of the given
| dimension.
|
| Note that the actual data allocation is deferred
| until the first time mutable_data() is called.
|
| The underlying storage of the quantized tensor
| interleaves elements by bit depth.
|
| Labeled memory for tensor of size 6, precision 3
|
| [ E1[0] E2[0] E3[0] E4[0] E5[0] E6[0] ] // Least significant bits
| [ E1[1] E2[1] E3[1] E4[1] E5[1] E6[1] ]
| [ E1[2] E2[2] E3[2] E4[2] E5[2] E6[2] ]
|
| In the case of sign bits (see enable_sign
| argument), an extra bit per element is added:
|
| Labeled memory for tensor of size 6, precision
| 3, sign bit enabled
|
| [ E1[0] E2[0] E3[0] E4[0] E5[0] E6[0] ]
| [ E1[1] E2[1] E3[1] E4[1] E5[1] E6[1] ]
| [ E1[2] E2[2] E3[2] E4[2] E5[2] E6[2] ]
| [ E1[s] E2[s] E3[s] E4[s] E5[s] E6[s] ]
| Where 's' is the sign bit: 1 if the element is
| negative, 0 otherwise.
|
| The reason for this layout is that it allows many
| low-precision integers to be multiplied efficiently:
| the product reduces to a sum of
| popcnt(A & B) << bit terms over the bit planes.
|
| Explained here:
| https://arxiv.org/abs/1606.06160