| This class contains some common functions
| for backend lowering and graph cutting
|
| \class A class that does bound shape inference
| given a C2 net. Depending on its type, each op
| has a maximum shape that it accepts. We
| define some initial bound for certain
| dimensions, for example max batch size or max
| sequence lookup size. The inference will
| first infer the input size and then propagate
| the bound shape down the network. For now the
| variable part (bound part) is the first
| dimension of the shape, which usually
| corresponds to the batch size or sequence
| lookup size.
| This struct stores the max bound size for batch
| in the general sense. max_batch_size is the
| upper bound of batch_size.
|
| max_seq_size is the upper bound of length of
| every item in a batch.
|
| Upper bound of length of a batch of items
| should be max_batch_size * max_seq_size.
| \brief Main graph matcher interface.
|
| This class solves a problem of finding
| a matching subgraph, which is specified in
| a text form.
| Each match is a struct of a subgraph and a
| map from the strings used in the query
| to NodeRefs in the subgraph.
|
| Note: the maps are injective but not necessarily
| bijective; if you use the same name
| in the query twice, only one will be mapped.
|
| See getMatches to generate these
| structs.
|
| TORCH_API nom::repr::NNModule convertToNNModule(
|     caffe2::NetDef& net,
|     std::unordered_map<std::string, nom::repr::NNGraph::NodeRef>* blobMapOut = nullptr);
|
| TORCH_API caffe2::NetDef convertToOnnxProto(nom::repr::NNModule&);
|
| TORCH_API std::unique_ptr<nom::repr::NeuralNetOperator>
| convertToOperatorDef(caffe2::OperatorDef op);
|
| This file sets up the optimization pass
| registry.
|
| You’ll want to either create a class
| that inherits from OptimizationPass
| and implements run, or use the
| REGISTER_OPT_PASS_FROM_FUNC(name, func)
| macro to register a function that takes
| in an NNModule*.
|
| If you need access to the workspace in
| the optimization you’ll need to use
| a different registry and inherit from
| WorkspaceOptimizationPass.
|
| Provides slicing info for the outputs.
| All the vector members should be of the
| same size as number of outputs of the
| Onnxifi op.
|
| @note
|
| subgraph always starts with ops and
| ends with tensors, except for the very
| first group, which can be all tensors
|
| Helpers for the convertToNNModule for use if
| you already have an NNModule.
|
| You probably don’t want to use these if you
| can use convertToNNModule instead.
| In-place modify TensorBoundShape
| to change shape size based on type
|
| Helper function for convertToNQLString
| function.
|
| It takes a list of nodes and returns a map
| node->unique_name. The new names are based on
| the existing ones, but are also unique.
| Construct a ShapeInfo instance from
| TensorShape and constructed dimType.
|
| The default first dimension of dimType is BATCH.
| Reason:
|
| We treat the first dimension of hinted shapes as
| BATCH.
|
| If there are shape hints on blobs in the
| workspace, they are already inserted as
| CONSTANT and will take effect here.
|
| For SEQ typed tensors, there are only a few of
| them and they will be handled by
| BoundShapeInferencer.
| Pass in an oldNet to copy all the attributes
| of that network.
|
| Be warned that transformations that modify the
| graph’s inputs or outputs are not reflected in
| changes to external_input or external_output.
| Use these functions instead of the registry
| directly.
|
| \brief Return a string representing the given
| graph \param g.
|
| The returned string is a valid NQL query.
| Explore the graph in topological order
| until we hit stopping nodes. This is
| based on Kahn’s algorithm:
|
| https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm
|
| Precondition: nodes in current_frontier
| must satisfy in_degree == 0
|
| Extract shape info from tensorBoundShapes to
| a ShapeInfoMap.
|
| Change shape according to new max_batch_size
| and max_feature_len at the same time if
| necessary.
| Transform normal fp32 operators to
| fakefp16 operators.
|
| We have a variant of 2-input Int8Quantize and
| 4-input Int8FC where the last input points to
| a blob which contains the y_scale and
| y_zero_point.
|
| It originated from an online snapshot update but
| is creating complications for the onnxifi flow.
|
| Hence this pass is just to absorb the
| quantization params into the op itself and
| remove the last input.
| Generic activation fusion helper.
|
| -----------
| @param OperationT
|
| The operator to be fused.
| -----------
| @param ActivationT
|
| The activation to be fused.
| -----------
| @param nn
|
| Neural network module to be modified
| in place
| -----------
| @param should_fuse
|
| Given a conv op, check whether we want
| to fuse it with the subsequent relu or not
| -----------
| @param postprocess
|
| Functor to postprocess the conv node,
| attaching additional attributes if
| necessary
|
| -----------
| @brief
|
| This fuses Cast -> BatchOneHot -> Cast
| into a single call.
|
$$ X_{bn} = \frac{s(X - m)}{\sqrt{\sigma + \epsilon}} + b_{bn} $$
$$ X_{conv} = X * W + b_{conv} $$
thus, substituting $X$ with $X_{conv}$ in the BN equation we get:
$$ X_{bn} = X * \frac{sW}{\sqrt{\sigma + \epsilon}} + \frac{s(b_{conv} - m)}{\sqrt{\sigma + \epsilon}} + b_{bn} $$
or
$$ W' = W\frac{s}{\sqrt{\sigma + \epsilon}} $$
$$ b' = (b_{conv} - m)\frac{s}{\sqrt{\sigma + \epsilon}} + b_{bn} $$
| -----------
| @brief
|
| Create tensor-nodes in \param graph
| with names specified in \param names.
|
| -----------
| @return
|
| a name->NodeRef map.
|
| Mapping from fp32 ops to fakefp16 ops
| Helper function for convertToNQLString
| function.
|
| Given a node and a renameMap return the unique
| name for this node.
| \brief Return a short string name for the
| given \param node.
|
| The function works with both tensors and
| operators.
| Helper function for convertToNQLString
| function.
|
| Given a node and a renameMap return a string
| representing the node, which looks something
| like:
|
| %a = Op(%b, %c, %d)
| If the annotation doesn’t exist, attempt
| to add it
|
| Generates ShapeInfo from Blob.
| Given a net, with primary inputs and
| outputs defined in its external_inputs/outputs,
| and given the set of weights and extra
| weights (created during conversion
| to ONNX, if any), we check whether
| some of the weights are used in the net,
| and if so, we put them in the initialization_list
| and add them to the external_inputs too.
|
| -----------
| @param net
|
| [in] c2 net (cut off from a bigger net)
| -----------
| @param weights_in_ws
|
| [in] all the weights in the workspace
| -----------
| @param initialization_list
|
| [out] weights that need
| to be offloaded to the backend
| -----------
| @param total_inputs_vec
|
| [out] total #inputs of the net that don’t
| have a producer
|
| In-place modify TensorShape’s shape
| at a specific dimension
|
| Onnxifi transformation on the net and
| workspace. We also need the input
| data/shape to populate the shapes. In addition,
| we take a \p blocklist to control and mask
| what ops we want to consider in the onnxifi
| process. We can also set whether to use the ONNX
| proto or C2 proto through the ONNXIFI interface.
|
| The list is in the form of “0-3,5,6-7”,
| which means we will blocklist ops with
| net positions in [0,1,2,3,5,6,7]
|
| Split SparseLengthsSumSparse into
|
| SparseLengthsSumSparseLookup + SparseLengthsSum
|
| Convert ShapeInfo map to TensorShape map
| Check precedence between two vectors of
| TensorBoundShape::DimType.
|
| If it returns 1: right takes precedence over left.
| If it returns -1: left takes precedence over right.
| If it returns 0: no precedence between left and right.
| Helper function to clean up a net and run tvm transform.
| Wrap Quantized TensorShape into QTensorProto
| Wrap TensorShape into TensorProto