| AsyncTask represents an asynchronous
| execution of a chain of ops.
|
| Represents the state of an AsyncTask
| execution, which can be queried with
| IsCompleted/IsFailed. Callbacks
| are supported through SetCallback
| and are called upon the future's
| completion.
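|
| A rough usage sketch; the callback
| signature here is an assumption, since
| it is not spelled out above:
|
| // Hypothetical polling / callback usage on a future.
| if (future->IsCompleted()) {
|   // Result is ready.
| } else if (future->IsFailed()) {
|   // Handle the error.
| } else {
|   future->SetCallback([](const AsyncTaskFuture* f) {
|     // Runs upon the future's completion.
|   });
| }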
|
| This is for transferring tensor data
| between C2 and backends.
|
| Special tag that can be listed in TensorTypes
| to denote that a special implementation
| in ‘RunWithOtherType’ needs to be called
| instead of failing.
|
| Obviously this needs to be the last item
| in lists, e.g.
|
| TensorTypes<float, double, GenericTensorImplementation>
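|
| A minimal sketch of the dispatch pattern
| this enables; MyOp is a hypothetical
| operator, and the rest follows the usual
| Caffe2 DispatchHelper boilerplate:
|
| template <class Context>
| class MyOp final : public Operator<Context> {
|  public:
|   USE_OPERATOR_CONTEXT_FUNCTIONS;
|   bool RunOnDevice() override {
|     // Dispatch on the type of Input(0); unlisted types
|     // fall through to RunWithOtherType instead of failing.
|     return DispatchHelper<TensorTypes<
|         float, double, GenericTensorImplementation>>::call(this, Input(0));
|   }
|   template <typename T>
|   bool DoRunWithType() {
|     // Typed implementation for T in {float, double}.
|     return true;
|   }
|   bool RunWithOtherType() {
|     // Generic fallback for all other input types.
|     return true;
|   }
| };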
|
| ———–
| @brief
|
| A helper class to indicate that the gradient
| mechanism is not ready.
|
| This should only be used sparingly when
| the gradient does exist, but we have
| not implemented it yet and are using
| this as a lazy excuse. Eventually, a
| gradient operator should be implemented.
|
| A struct that holds the gradient operators
| and related gradient maps.
|
| ———–
| @brief
|
| A struct that abstracts over dense
| and sparse blobs.
|
| For a dense blob, its gradient name should
| be written into dense_, and for a sparse
| blob, its gradient name should be written
| into indice_ for the sparse indices
| and value_ for the values.
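|
| A hedged illustration using the member
| names from the description above; is_dense
| and grad_name are hypothetical:
|
| GradientWrapper g;
| if (is_dense) {
|   g.dense_ = grad_name;                // dense gradient blob
| } else {
|   g.indice_ = grad_name + "_indices";  // sparse indices blob
|   g.value_ = grad_name + "_values";    // sparse values blob
| }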
|
| Net is a thin struct that owns all the
| operators together with the operator
| contexts.
|
| A net test dummy op that does nothing
| but scaffolding.
|
| Here, we inherit from OperatorStorage
| because we instantiate on both CPU and
| GPU.
|
| In general, you want to only inherit
| from Operator.
|
| Inherit to make your class observable.
|
| Use this to implement an Observer using
| the Observer Pattern template.
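|
| A minimal sketch of a custom observer,
| assuming the ObserverBase template with
| Start/Stop hooks; LoggingObserver is
| hypothetical:
|
| class LoggingObserver final : public ObserverBase<OperatorBase> {
|  public:
|   explicit LoggingObserver(OperatorBase* op)
|       : ObserverBase<OperatorBase>(op) {}
|   void Start() override { /* called before the subject runs */ }
|   void Stop() override { /* called after the subject runs */ }
| };
|
| // Attach it to an observable operator:
| op->AttachObserver(std::make_unique<LoggingObserver>(op));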
|
| ———–
| @brief
|
| A class to record the schema of an op.
|
| OpSchema records the common interface
| of an op specified by its name. This is
| optional for each operator implemented
| in Caffe2 but is strongly recommended.
|
| To register an OpSchema, one can use
| the macro
|
| OPERATOR_SCHEMA(name) and then append
| the various functions in the class.
| For example, the schema for an op that
| takes two inputs and one output, where
| the first input and the output may be
| in-place, can be written as
|
| OPERATOR_SCHEMA(name)
| .NumInputs(2)
| .NumOutputs(1)
| .AllowInplace({{0, 0}});
|
| @brief
|
| A struct to store various cost information
| about an operator such as FLOPs, total
| memory use and parameters.
|
| @brief
|
| A registry to hold all the operator schemas.
|
| OpSchemaRegistry should not need to
| be instantiated.
|
| Thin class that attaches the observer
| to all operators in the net.
|
| This observer displays a description
| of each operator executed in a network.
|
| This includes input and output tensors
| (name, size, type), arguments, and
| execution time. This can be used to analyze
| different performance characteristics.
|
| ———–
| @note
|
| Currently this observer only supports
| synchronous computation.
|
| This is the very basic structure you
| need to run a network - all it does is
| run everything in sequence.
|
| If you want more fancy control such as
| a DAG-like execution, check out other
| better net implementations.
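|
| As a sketch, a net selects this implementation
| through its type field ("simple" is the
| conventional name assumed here):
|
| NetDef net_def;
| net_def.set_name("example_net");
| net_def.set_type("simple");  // the sequential implementation
| Workspace ws;
| NetBase* net = ws.CreateNet(net_def);
| net->Run();  // runs every op in order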
|
| SimpleRefCountNet is an implementation
| that adds an additional abstraction
| on top of SimpleNet: it tracks
| all the tensors and, for those that are
| considered internal/temporary, deletes
| them once their refcount goes to zero.
|
| In the context of a simple static run,
| this can be carried out at construction
| time: we do a pass through the network
| and track which blobs need to be reset
| after the execution of every op (a sketch
| of this bookkeeping follows the list below).
|
| To identify which blob is considered
| temporary, we employ the following
| strategy: any blob that is
|
| (1) consumed but not produced by ops
| in the net, or
|
| (2) produced but not consumed by ops
| in the net, or
|
| (3) is marked as external_output in
| the protobuf will NOT be considered
| temporary.
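|
| A rough sketch of the construction-time
| bookkeeping described above; IsTemporary
| is a hypothetical predicate that is true
| for blobs matching none of (1)-(3):
|
| // Count how many ops consume each temporary blob.
| std::unordered_map<std::string, int> remaining_uses;
| for (const auto& op : net_def.op()) {
|   for (const auto& input : op.input()) {
|     if (IsTemporary(input)) {
|       ++remaining_uses[input];
|     }
|   }
| }
| // At run time, after each op, decrement the counts of its
| // inputs and reset any blob whose count reaches zero.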
|
| In the long run, we should design proper
| functional interfaces so that nets
| are less imperative and more functional.
|
| Also, for now, SimpleRefCountNet should
| only be used for benchmarking purposes
| and not production use, since it is not
| going to provide a real performance gain,
| and is implicitly incompatible with
| the contract that earlier Nets expose
| - that all intermediate blobs are visible
| to the users.
|
| StaticLinkingProtector is a helper
| class that ensures that the Caffe2 library
| is linked correctly with whole archives
| (in the case of static linking). What
| happens is that when
|
| CreateOperator is called for the first
| time, it instantiates a StaticLinkingProtector
| object to check if the operator registry
| is empty. If it is empty, this means that
| we are not properly linking the library.
|
| You should not need to use this class.
|
| Same as TensorTypes, but calls DoRunWithType2.
|
| ———–
| @brief
|
| A helper class to indicate that the operator
| should have no gradient.
|
| This is used when the operator definition
| is designed to not have a gradient.
|
| Calling a gradient on this operator
| def will cause Caffe2 to quit.
|
| An exception that can be thrown by an
| operator constructor to signal that it
| does not support the given setting.
| This is usually used for specific
| engines that only implement a subset
| of the features required by the original
| operator schema.
|
| TODO(jiayq): make more feature-complete
| exception message.
|
| Workspace is a class that holds all the
| related objects created during runtime:
| (1) all blobs, and (2) all instantiated
| networks.
|
| It is the owner of all these objects and
| deals with the scaffolding logistics.
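|
| A hedged sketch of typical workspace
| usage; the net definitions are assumed
| to exist:
|
| Workspace ws;
| Blob* data = ws.CreateBlob("data");  // owned by the workspace
| ws.RunNetOnce(init_net_def);         // instantiate and run once
| ws.CreateNet(predict_net_def);       // registered in the net map
| ws.RunNet(predict_net_def.name());   // run the registered net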
|
| ———–
| @brief
|
| A helper class to indicate that the operator
| does not need gradient computation.
|
| Use the macro NO_GRADIENT to register
| operators that do not have gradients.
|
| ———–
| @note
|
| This is different from SHOULD_NOT_DO_GRADIENT:
| the latter means that the gradient computation
| should not flow through it at all, and
| throws an error if it is called.
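|
| A one-line registration sketch; MyForwardOnlyOp
| is a hypothetical operator name:
|
| NO_GRADIENT(MyForwardOnlyOp);
| // By contrast, SHOULD_NOT_DO_GRADIENT(MyForwardOnlyOp);
| // would make any gradient request an error.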
|
| Instead of breaking down the DAG into
| chains, we partition it into clusters
| of sync ops and individual async ops.
|
| This is useful for distributed inference
| cases where we have sync and async CPU
| ops.
|
| ———–
| @note
|
| we have to sync each async op instead
| of putting them into the chain and syncing
| its tail like a GPU op, because CPU async
| ops are typically RPC calls and are not
| guaranteed to be linearized at the remote
| site.
|
| Here, chains are essentially groups;
| we use chain/group interchangeably.
|
| ———–
| @brief
|
| Creates a network, accessing / creating
| blobs in the given workspace.
|
| ———–
| @note
|
| this is different from Workspace::CreateNet.
| The latter adds the created net object
| to the workspace’s net map, while this
| function returns a standalone net object.
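|
| To make the distinction concrete (a sketch;
| net_def and ws are assumed to exist):
|
| // Standalone: the caller owns the returned net.
| std::unique_ptr<NetBase> net = CreateNet(net_def, &ws);
|
| // Workspace-managed: registered in ws's net map.
| NetBase* managed = ws.CreateNet(net_def);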
|
| Creates an operator with the given operator
| definition.
|
| Throws on error and never returns nullptr.
|
| Helper function for creating a simple
| TensorProto with dimension and type.
|
| Extracts the shard id from a name of
| the form “…shard:123…”.
|
| Returns -1 if there is no shard found.
|
| Checks that Workspace::ForEach(f)
| applies f on the specified set of workspaces
| in any order.
|
| Helper function.
|
| ———–
| @brief
|
| Gets the GradientOpsMeta for the given
| operator def.
|
| Gets a set of registered operator names.
|
| Returns the unique shard id, or -1 if
| it is not unique.
|
| Helper function for inferring the device
| information of op inputs and outputs.
|
| Checks if the net name is whitelisted
| for tracing (specified via a command-line
| flag).
|
| Helper function.
|
| Helper function.
|
| Prune redundant dependencies to improve
| chaining.
|
| TODO: t15868555 This algorithm is fast
| but can miss dependencies.
|
| Run a network and get its duration in
| milliseconds.
|
| Operator logging capabilities.