
  • The FC operator computes an output $Y$ as a linear combination of the input data blob $X$ with a weight blob $W$ and a bias blob $b$. More formally,

    $$Y = XW^T + b$$

    Here, $X$ is a matrix of shape $(M, K)$, $W$ is a matrix of shape $(N, K)$, $b$ is a vector of length $N$, and $Y$ is a matrix of shape $(M, N)$. $N$ can be thought of as the number of nodes in the layer, $M$ is the batch size, and $K$ is the number of features in an input observation.

    Note: $X$ does not need to be a 2-dimensional matrix explicitly; if it is not, it is coerced into one. For an arbitrary $n$-dimensional tensor $X$ with dimensions $[a_0, a_1, \ldots, a_{k-1}, a_k, \ldots, a_{n-1}]$, where each $a_i \in \mathbb{N}$ and $k$ is the axis argument provided, $X$ is coerced into a 2-dimensional tensor with dimensions $[a_0 * \ldots * a_{k-1}, a_k * \ldots * a_{n-1}]$. For the default case where axis=1, this means $X$ is coerced into a 2D tensor with dimensions $[a_0, a_1 * \ldots * a_{n-1}]$, where $a_0$ is often the batch size. In this situation, we must have $a_0 = M$ and $a_1 * \ldots * a_{n-1} = K$. Lastly, even though $b$ is a vector of length $N$, it is implicitly copied and resized to shape $(M, N)$, so that it is added to each vector in the batch.

    This is Caffe's InnerProductOp, with a name that fits its purpose better.
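The shape rules above can be illustrated with a small sketch in plain Python. It is not the operator's actual implementation: the function names `coerce_to_2d_shape` and `fc` are hypothetical, and the input tensor is assumed to be stored as a flat, row-major list alongside its shape.

```python
from math import prod


def coerce_to_2d_shape(shape, axis=1):
    """Collapse an n-dimensional shape into the 2-D shape (M, K).

    Dimensions before `axis` are multiplied into M, the rest into K.
    """
    outer = prod(shape[:axis])  # a_0 * ... * a_{k-1}
    inner = prod(shape[axis:])  # a_k * ... * a_{n-1}
    return outer, inner


def fc(x_flat, x_shape, w, b, axis=1):
    """Compute Y = X W^T + b (hypothetical helper, for illustration only).

    x_flat: row-major flat list holding the data of X
    x_shape: the n-dimensional shape of X
    w: N rows of length K
    b: list of length N
    """
    m, k = coerce_to_2d_shape(x_shape, axis)
    if len(x_flat) != m * k:
        raise ValueError("x_flat does not match x_shape")
    if any(len(w_row) != k for w_row in w):
        raise ValueError("every row of W must have length K")
    if len(b) != len(w):
        raise ValueError("b must have length N")

    # View the flat data as M rows of K features each.
    rows = [x_flat[i * k:(i + 1) * k] for i in range(m)]

    # Each output entry is a dot product with one row of W plus that node's
    # bias; reusing b for every row gives the implicit (M, N) copy of b.
    return [
        [sum(x_i * w_i for x_i, w_i in zip(row, w_row)) + b_j
         for w_row, b_j in zip(w, b)]
        for row in rows
    ]


# A tensor of shape (2, 2, 3) with axis=1 is treated as a (2, 6) matrix.
x_shape = (2, 2, 3)
x_flat = list(range(12))
w = [[1] * 6, [0, 0, 0, 0, 0, 1]]  # N = 2 nodes, K = 6 features
b = [10, 20]
y = fc(x_flat, x_shape, w, b)  # shape (M, N) = (2, 2)
```

Here $M = 2$, $K = 6$ and $N = 2$, so `y` has two rows of two entries each. Passing `axis=2` instead would treat the same tensor as a $(4, 3)$ matrix and would therefore require $W$ to have rows of length 3.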

