| Input is a matrix tensor. Its first dimension
| is the batch size. Each column is bucketized
| according to its boundary values and then
| one-hot encoded. The lengths input specifies
| the number of boundary values for each
| column. The number of buckets for a column
| is this number plus 1, which is also
| that column's expanded feature size. The boundaries
| input lists all the boundary values, concatenated
| in column order.
|
| ----------
| @note
|
| Each bucket is right-inclusive. That
| is, given boundary values [b1, b2, b3],
| the buckets are defined as (-inf, b1],
| (b1, b2], (b2, b3], (b3, inf).
|
| For example
|
| data = [[2, 3], [4, 1], [2, 5]], lengths = [2, 3],
|
| If boundaries = [0.1, 2.5, 1, 3.1, 4.5], then
|
| output = [[0, 1, 0, 0, 1, 0, 0],
|           [0, 0, 1, 1, 0, 0, 0],
|           [0, 1, 0, 0, 0, 0, 1]]
|
| If boundaries = [0.1, 2.5, 1, 1, 3.1], then
|
| output = [[0, 1, 0, 0, 0, 1, 0],
|           [0, 0, 1, 0, 1, 0, 0],
|           [0, 1, 0, 0, 0, 0, 1]]
|
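The bucketize-then-encode behavior described above can be sketched in plain Python. This is an illustrative reimplementation, not the operator's actual code: the function name `batch_bucket_one_hot` is hypothetical, and the handling of duplicate boundaries (taking the midpoint of the left and right insertion points) is an assumption inferred from the second example, where the value 1 against boundaries [1, 1, 3.1] lands in bucket 1.

```python
from bisect import bisect_left, bisect_right


def batch_bucket_one_hot(data, lengths, boundaries):
    """Bucketize each column of `data`, then one-hot encode the bucket index.

    Hypothetical helper for illustration only; boundaries for each column
    are assumed to be sorted in ascending order.
    """
    # Split the flat boundaries list into one slice per column.
    per_column = []
    start = 0
    for n in lengths:
        per_column.append(boundaries[start:start + n])
        start += n

    output = []
    for row in data:
        encoded = []
        for value, bounds in zip(row, per_column):
            # bisect_left counts boundaries strictly below value;
            # bisect_right counts boundaries at or below value.
            left = bisect_left(bounds, value)
            right = bisect_right(bounds, value)
            # Assumption: the midpoint gives right-inclusive buckets for
            # distinct boundaries and matches the duplicate-boundary example.
            bucket = (left + right) // 2
            one_hot = [0] * (len(bounds) + 1)
            one_hot[bucket] = 1
            encoded.extend(one_hot)
        output.append(encoded)
    return output


data = [[2, 3], [4, 1], [2, 5]]
print(batch_bucket_one_hot(data, [2, 3], [0.1, 2.5, 1, 3.1, 4.5]))
print(batch_bucket_one_hot(data, [2, 3], [0.1, 2.5, 1, 1, 3.1]))
```

With distinct boundaries, a value equal to a boundary b falls into the bucket that ends at b, which is the right-inclusive rule stated in the note.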
| Input is a matrix tensor. Its first dimension
| is the batch size. Each column is expanded
| using one-hot encoding. The lengths input
| specifies the size of each column after
| encoding, and the values input lists the
| dictionary values of the one-hot encoding
| for each column, concatenated in column order.
| For example
|
| If data = [[2, 3], [4, 1], [2, 5]], lengths = [2, 3],
| and values = [2, 4, 1, 3, 5], then
|
| output = [[1, 0, 0, 1, 0],
|           [0, 1, 1, 0, 0],
|           [1, 0, 0, 0, 1]]
|
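The dictionary-based encoding above can be sketched the same way. The function name `batch_one_hot` is hypothetical, and emitting all zeros for an entry that matches no dictionary value is an assumption, since the description does not cover that case.

```python
def batch_one_hot(data, lengths, values):
    """One-hot encode each column of `data` against its own dictionary.

    Hypothetical helper for illustration only.
    """
    # Split the flat values list into one dictionary slice per column.
    per_column = []
    start = 0
    for n in lengths:
        per_column.append(values[start:start + n])
        start += n

    output = []
    for row in data:
        encoded = []
        for entry, vocab in zip(row, per_column):
            # Emit 1 at the position of the matching dictionary value.
            # Assumption: an entry absent from vocab yields all zeros.
            encoded.extend(1 if entry == v else 0 for v in vocab)
        output.append(encoded)
    return output


print(batch_one_hot([[2, 3], [4, 1], [2, 5]], [2, 3], [2, 4, 1, 3, 5]))
```

Each output row has sum(lengths) entries, here 2 + 3 = 5, one block per input column.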