pub struct INetworkDefinition { /* private fields */ }
INetworkDefinition
A network definition for input to the builder.
A network definition defines the structure of the network and, combined with an IBuilderConfig, is built into an engine using an IBuilder. An INetworkDefinition has all dimensions explicit (full dims mode) in the network definition. The implicit batch size mode has been deprecated.
A network with implicit batch dimensions returns the dimensions of a layer without the implicit dimension, and instead the batch is specified at execute/enqueue time. If the network has all dimensions specified, then the first dimension follows elementwise broadcast rules: if it is 1 for some inputs and is some value N for all other inputs, then the first dimension of each output is N, and the inputs with 1 for the first dimension are broadcast. Having divergent batch sizes across inputs to a layer is not supported.
Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.
Implementations
impl INetworkDefinition
pub fn addTopK1( self: Pin<&mut Self>, input: Pin<&mut ITensor>, op: TopKOperation, k: i32, reduceAxes: u32, indicesType: DataType, ) -> *mut ITopKLayer
Add a TopK layer to the network.
The TopK layer has two outputs of the same dimensions. The first contains data values, the second contains index positions for the values. Output values are sorted, largest first for operation kMAX and smallest first for operation kMIN.
Currently only values of K up to 3840 are supported.
- input: The input tensor to the layer.
- op: Operation to perform.
- k: The number of elements to keep. For dynamic k, use the setInput() method to pass in k as a tensor instead, which will override the static k value passed here in calculations.
- reduceAxes: The reduction dimensions. The bit in position i of bitmask reduceAxes corresponds to explicit dimension i of the result. E.g., the least significant bit corresponds to the first explicit dimension and the next to least significant bit corresponds to the second explicit dimension. Currently reduceAxes must specify exactly one dimension, and it must be one of the last four dimensions.
- indicesType: Indices tensor (the second output) data type, must be DataType::kINT32 or DataType::kINT64.
See ITopKLayer
The new TopK layer, or nullptr if it could not be created.
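The reduceAxes description above can be illustrated with a small, self-contained sketch. The helper names single_axis_mask and is_valid_topk_mask are hypothetical and are not part of these bindings; they only restate the documented rules (bit i selects explicit dimension i, exactly one bit set, and that bit names one of the last four dimensions).

```rust
/// Build a reduceAxes-style bitmask that selects a single axis.
/// Bit i of the mask corresponds to explicit dimension i.
/// (Hypothetical helper, not part of the bindings.)
fn single_axis_mask(axis: u32) -> Option<u32> {
    // A u32 mask can only address axes 0..=31.
    if axis < 32 {
        Some(1u32 << axis)
    } else {
        None
    }
}

/// Check the TopK constraints quoted above for a tensor of the given rank:
/// exactly one bit set, and that bit names one of the last four dimensions.
fn is_valid_topk_mask(mask: u32, rank: u32) -> bool {
    if mask.count_ones() != 1 {
        return false;
    }
    // Position of the single set bit.
    let axis = mask.trailing_zeros();
    axis < rank && axis + 4 >= rank
}
```

For a rank-5 tensor, for example, axis 1 (mask 0b10) passes the check while axis 0 (mask 0b1) does not, because axis 0 is not among the last four dimensions.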
pub fn addNonZero1( self: Pin<&mut Self>, input: Pin<&mut ITensor>, indicesType: DataType, ) -> *mut INonZeroLayer
Add a nonzero layer to the network.
- input: The input tensor to the layer.
- indicesType: Indices tensor (the first output) data type, must be DataType::kINT32 or DataType::kINT64.
See INonZeroLayer
The new nonzero layer, or nullptr if it could not be created.
pub fn addFill( self: Pin<&mut Self>, dimensions: &Dims64, op: FillOperation, outputType: DataType, ) -> *mut IFillLayer
Add a fill layer to the network.
- dimensions: The output tensor dimensions if input 0 is missing.
- op: The fill operation that the layer applies.
- outputType: Optional output tensor data type, must be DataType::kFLOAT, DataType::kHALF, DataType::kINT32, or DataType::kINT64. This parameter is only used for static alpha/beta. Future calls to set output type using setToType or setOutputType must be consistent.
For FillOperation::kLINSPACE, dimensions.nbDims must be 1 for static start/delta. If delta is provided as a 1D tensor, the length of delta must match dimensions.nbDims.
This layer is non-deterministic across subsequent calls as the same inputs will produce different output tensors if op is either FillOperation::kRANDOM_UNIFORM or FillOperation::kRANDOM_NORMAL due to random state being shared across calls. The output tensors generated are deterministic when starting from the same initial state.
See IFillLayer
The new fill layer, or nullptr if it could not be created.
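As a rough illustration of FillOperation::kLINSPACE with a static start and delta on a 1-D output, the sketch below computes element i as start + i * delta. The function linspace_1d is a hypothetical stand-in written in plain Rust; it does not call the bindings and makes no claim about how the layer is implemented.

```rust
/// 1-D linspace sketch: element i is start + i * delta.
/// (Hypothetical helper used only to illustrate the fill pattern.)
fn linspace_1d(len: usize, start: f32, delta: f32) -> Vec<f32> {
    (0..len).map(|i| start + i as f32 * delta).collect()
}
```

With len = 4, start = 1.0 and delta = 0.5, the sketch yields 1.0, 1.5, 2.0, 2.5.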
pub fn addDequantize( self: Pin<&mut Self>, input: Pin<&mut ITensor>, scale: Pin<&mut ITensor>, outputType: DataType, ) -> *mut IDequantizeLayer
Add a dequantization layer to the network.
- input: The input tensor to be dequantized.
- scale: A tensor with the scale value.
- outputType: Output tensor data type.
See IDequantizeLayer
input tensor data type must be DataType::kINT8, DataType::kFP8, DataType::kINT4 or DataType::kFP4. scale tensor data type must be one of the following: DataType::kFLOAT (default), DataType::kHALF, DataType::kBF16 or DataType::kE8M0 (for MXFP8 quantization). outputType output tensor data type must be DataType::kFLOAT (default), DataType::kHALF or DataType::kBF16. Future calls to set output type using setToType or setOutputType must be consistent. For strongly typed networks, if the scale type is DataType::kHALF or DataType::kBF16 the output type must match.
The new dequantization layer, or nullptr if it could not be created.
pub fn addQuantize( self: Pin<&mut Self>, input: Pin<&mut ITensor>, scale: Pin<&mut ITensor>, outputType: DataType, ) -> *mut IQuantizeLayer
Add a quantization layer to the network.
- input: The input tensor to be quantized.
- scale: A tensor with the scale value.
- outputType: Output tensor data type.
See IQuantizeLayer
input tensor data type must be DataType::kFLOAT, DataType::kHALF or DataType::kBF16. scale tensor data type must be one of the following: DataType::kFLOAT (default), DataType::kHALF, DataType::kBF16 or DataType::kE8M0 (for MXFP8 quantization). outputType output tensor data type must be DataType::kINT8 (default), DataType::kFP8, DataType::kINT4 or DataType::kFP4. Future calls to set output type using setToType or setOutputType must be consistent. For strongly typed networks, if the scale type is DataType::kHALF or DataType::kBF16 the output type must match.
The new quantization layer, or nullptr if it could not be created.
pub fn addNMS1( self: Pin<&mut Self>, boxes: Pin<&mut ITensor>, scores: Pin<&mut ITensor>, maxOutputBoxesPerClass: Pin<&mut ITensor>, indicesType: DataType, ) -> *mut INMSLayer
Add a non-maximum suppression layer to the network.
- boxes: The input boxes tensor to the layer.
- scores: The input scores tensor to the layer.
- maxOutputBoxesPerClass: The input maxOutputBoxesPerClass tensor to the layer.
- indicesType: Indices tensor (the first output) data type, must be DataType::kINT32 or DataType::kINT64.
See INMSLayer
The new NMS layer, or nullptr if it could not be created.
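The add* methods above return a raw pointer that is null when the layer could not be created. One way a caller might handle this is to convert the pointer into a Result before using it. The helper require_layer below is hypothetical and works on any raw pointer; it is a sketch of the checking pattern, not an API provided by these bindings.

```rust
use std::ptr::NonNull;

/// Turn a possibly-null raw pointer into a Result.
/// `what` is a caller-supplied label used only in the error message.
/// (Hypothetical helper, not part of the bindings.)
fn require_layer<T>(ptr: *mut T, what: &str) -> Result<NonNull<T>, String> {
    NonNull::new(ptr).ok_or_else(|| format!("{} could not be created", what))
}
```

Dereferencing the resulting NonNull still requires unsafe code and the usual lifetime care; the sketch only centralizes the null check.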
impl INetworkDefinition
pub unsafe fn addInput( self: Pin<&mut INetworkDefinition>, name: *const c_char, type_: DataType, dimensions: &Dims64, ) -> *mut ITensor
Add an input tensor to the network.
Each input and output tensor must have a unique name.
For networks with wildcard dimensions, the volume is based on the maxima specified by an IOptimizationProfile.
Dimensions are normally non-negative integers. The exception is that in networks with all explicit dimensions, -1 can be used as a wildcard for a dimension to be specified at runtime. Input tensors with such a wildcard must have a corresponding entry in the IOptimizationProfiles indicating the permitted extrema, and the input dimensions must be set by IExecutionContext::setInputShape. Different IExecutionContext instances can have different dimensions. Wildcard dimensions are only supported for EngineCapability::kSTANDARD. They are not supported in safety contexts. DLA does not support wildcard dimensions.
Tensor dimensions are specified independent of format. For example, if a tensor is formatted in “NHWC” or a vectorized format, the dimensions are still specified in the order {N, C, H, W}. For 2D images with a channel dimension, the last three dimensions are always {C,H,W}. For 3D images with a channel dimension, the last four dimensions are always {C,D,H,W}.
- name: The name of the tensor.
- type_: The type of the data held in the tensor.
- dimensions: The dimensions of the tensor.
It is an error to specify a wildcard value on a dimension that is determined by trained parameters.
If run on DLA with explicit dimensions, only the leading dimension can be a wildcard, and the provided profile must have the same minimum, optimum, and maximum dimensions.
The string name must be null-terminated, and be at most 4096 bytes including the terminator.
See ITensor
The new tensor or nullptr if there is an error.
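Because addInput takes a raw *const c_char, the caller is responsible for the name constraints stated above (null-terminated, at most 4096 bytes including the terminator). A minimal sketch of a pre-check using the standard library follows; to_tensor_name and MAX_NAME_BYTES are hypothetical names introduced for illustration.

```rust
use std::ffi::CString;

/// Maximum name length in bytes, including the trailing NUL (from the text above).
const MAX_NAME_BYTES: usize = 4096;

/// Convert a Rust string into a C string that satisfies the documented limits.
/// Returns None if the name contains an interior NUL or is too long.
/// (Hypothetical helper, not part of the bindings.)
fn to_tensor_name(name: &str) -> Option<CString> {
    // CString::new fails if the input contains an interior NUL byte.
    let c_name = CString::new(name).ok()?;
    // as_bytes_with_nul() counts the terminator as part of the length.
    if c_name.as_bytes_with_nul().len() <= MAX_NAME_BYTES {
        Some(c_name)
    } else {
        None
    }
}
```

The pointer from CString::as_ptr() is only valid while the CString is alive, so the value should be kept in scope for the duration of the call it is passed to.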
pub fn markOutput(self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>)
Mark a tensor as a network output.
- tensor: The tensor to mark as an output tensor.
It is an error to mark a network input as an output. It is an error to mark a tensor inside an ILoop or an IIfConditional as an output.
pub fn markDebug( self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>, ) -> bool
Mark a tensor as a debug tensor.
A debug tensor can be optionally emitted at runtime. Note that tensor names are required to specify debug tensors at runtime.
- tensor: Tensor to be marked as debug.
True if tensor successfully marked (or was already marked), false otherwise.
See unmarkDebug(), IExecutionContext::setDebugListener(), ITensor::setName()
pub fn unmarkDebug( self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>, ) -> bool
Unmark a tensor as a debug tensor.
Remove the marking of a tensor as a debug tensor.
- tensor: Tensor to be unmarked as debug.
True if tensor successfully unmarked (or was already unmarked), false otherwise.
See markDebug(), IExecutionContext::setDebugListener()
pub fn isDebugTensor(self: &INetworkDefinition, tensor: &ITensor) -> bool
Check if a tensor is marked as a debug tensor.
True if the tensor is marked as a debug tensor, false otherwise.
pub fn markUnfusedTensorsAsDebugTensors( self: Pin<&mut INetworkDefinition>, ) -> bool
Mark unfused tensors as debug tensors.
Debug tensors can be optionally emitted at runtime. Tensors that are fused by the optimizer will not be emitted. Tensors marked this way will not prevent fusion like markDebug() does, thus preserving performance.
Tensors marked this way cannot be detected by isDebugTensor(). For these tensors, DebugListener can only get internal tensor names instead of the original tensor names in the NetworkDefinition; the internal names correspond to the names obtained by IEngineInspector. There is no guarantee that all unfused tensors are marked.
True if tensors were successfully marked (or were already marked), false otherwise.
See unmarkUnfusedTensorsAsDebugTensors(), markDebug(), IExecutionContext::setDebugListener()
pub fn unmarkUnfusedTensorsAsDebugTensors( self: Pin<&mut INetworkDefinition>, ) -> bool
Undo the marking of unfused tensors as debug tensors.
This has no effect on tensors marked by markDebug().
True if tensor successfully unmarked (or was already unmarked), false otherwise.
See markUnfusedTensorsAsDebugTensors(), IExecutionContext::setDebugListener()
pub fn addActivation( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, type_: ActivationType, ) -> *mut IActivationLayer
Add an activation layer to the network.
- input: The input tensor to the layer.
- type_: The type of activation function to apply.
Note that the setAlpha() and setBeta() methods must be used on the output for activations that require these parameters.
See IActivationLayer, ActivationType
Int32 and Int64 are valid only for activation type kRELU.
The new activation layer, or nullptr if it could not be created.
pub fn addLRN( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, window: i64, alpha: f32, beta: f32, k: f32, ) -> *mut ILRNLayer
Add an LRN layer to the network.
- input: The input tensor to the layer.
- window: The size of the window.
- alpha: The alpha value for the LRN computation.
- beta: The beta value for the LRN computation.
- k: The k value for the LRN computation.
See ILRNLayer
Int32 tensors are not valid input tensors.
The new LRN layer, or nullptr if it could not be created.
pub fn addScale( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, mode: ScaleMode, shift: Weights, scale: Weights, power: Weights, ) -> *mut IScaleLayer
Add a Scale layer to the network.
- input: The input tensor to the layer. This tensor must have at least 4 dimensions.
- mode: The scaling mode.
- shift: The shift value.
- scale: The scale value.
- power: The power value.
If the weights are available, the number of weights depends on the ScaleMode. For ScaleMode::kUNIFORM, the number of weights equals 1. For ScaleMode::kCHANNEL, the number of weights equals the channel dimension. For ScaleMode::kELEMENTWISE, the number of weights equals the product of the last three dimensions of the input.
See addScaleNd, IScaleLayer
Int32 tensors are not valid input tensors.
The new Scale layer, or nullptr if it could not be created.
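The weight-count rule above can be written out as a small sketch. It assumes a 4-D input laid out as {N, C, H, W}, so that the channel dimension is index 1; ScaleModeSketch and expected_weight_count are hypothetical stand-ins and do not refer to the ScaleMode type of these bindings.

```rust
/// Local stand-in for the three scale modes named in the text.
#[derive(Clone, Copy, Debug)]
enum ScaleModeSketch {
    Uniform,
    Channel,
    Elementwise,
}

/// Expected number of weights for a 4-D {N, C, H, W} input.
/// (Assumption: the channel dimension is dims[1].)
fn expected_weight_count(mode: ScaleModeSketch, dims: [i64; 4]) -> i64 {
    match mode {
        // One weight shared by every element.
        ScaleModeSketch::Uniform => 1,
        // One weight per channel.
        ScaleModeSketch::Channel => dims[1],
        // One weight per element of the last three dimensions: C * H * W.
        ScaleModeSketch::Elementwise => dims[1] * dims[2] * dims[3],
    }
}
```

For an input of {8, 3, 32, 32}, the sketch gives 1, 3 and 3072 weights for the three modes respectively.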
pub fn addSoftMax( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, ) -> *mut ISoftMaxLayer
Add a SoftMax layer to the network.
See ISoftMaxLayer
Int32 tensors are not valid input tensors.
The new SoftMax layer, or nullptr if it could not be created.
pub fn addElementWise( self: Pin<&mut INetworkDefinition>, input1: Pin<&mut ITensor>, input2: Pin<&mut ITensor>, op: ElementWiseOperation, ) -> *mut IElementWiseLayer
Add an elementwise layer to the network.
- input1: The first input tensor to the layer.
- input2: The second input tensor to the layer.
- op: The binary operation that the layer applies.
The input tensors must have the same rank and compatible type. Two types are compatible if they are the same type or are both in the set {kFLOAT, kHALF}. For each dimension, their lengths must match, or one of them must be one. In the latter case, the tensor is broadcast along that axis.
The output tensor has the same rank as the inputs. For each dimension, its length is the maximum of the lengths of the corresponding input dimension.
The inputs are shape tensors if the output is a shape tensor.
The new elementwise layer, or nullptr if it could not be created.
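The rank and broadcast rules above can be sketched as a shape computation. The function broadcast_dims is hypothetical and assumes positive, fully known lengths (no -1 wildcards); it returns None when the ranks differ or when a pair of lengths neither matches nor contains a 1.

```rust
/// Compute the elementwise output dimensions from two input shapes.
/// (Hypothetical helper; assumes positive, fully known lengths.)
fn broadcast_dims(a: &[i64], b: &[i64]) -> Option<Vec<i64>> {
    // The inputs must have the same rank.
    if a.len() != b.len() {
        return None;
    }
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            if x == y || x == 1 || y == 1 {
                // Output length is the maximum of the two input lengths.
                Some(x.max(y))
            } else {
                // Lengths differ and neither is 1: not broadcastable.
                None
            }
        })
        .collect()
}
```

For example, shapes [4, 1, 3] and [1, 5, 3] combine to [4, 5, 3], while [2, 3] and [4, 3] are rejected.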
pub fn addUnary( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, operation: UnaryOperation, ) -> *mut IUnaryLayer
Add a unary layer to the network.
- input: The input tensor to the layer.
- operation: The operation to apply.
See IUnaryLayer
Generally the input must have a floating-point type (or kINT8 as a quantized float), except for the following operations:
- kSIGN accepts a floating-point or Int32 tensor.
- kNOT requires a Bool tensor.
The input is a shape tensor if the output is a shape tensor.
The new unary layer, or nullptr if it could not be created.
pub fn addShuffle( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, ) -> *mut IShuffleLayer
Add a shuffle layer to the network.
- input: The input tensor to the layer.
See IShuffleLayer
The new shuffle layer, or nullptr if it could not be created.
pub fn addOneHot( self: Pin<&mut INetworkDefinition>, indices: Pin<&mut ITensor>, values: Pin<&mut ITensor>, depth: Pin<&mut ITensor>, axis: i32, ) -> *mut IOneHotLayer
Add a OneHot layer to the network.
- indices: Tensor containing indices where on_value should be set.
- values: A 2-element tensor, consisting of [off_value, on_value].
- depth: A shape tensor containing the width of the added one-hot dimension.
- axis: The axis to add the one-hot encoding to.
See IOneHotLayer
The new OneHot layer, or nullptr if it could not be created.
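To make the roles of indices, values and depth concrete, the sketch below one-hot encodes a 1-D list of indices, adding the new dimension as the last axis. The function one_hot_1d is hypothetical, and its treatment of out-of-range indices (the whole row stays at off_value) is an assumption of this sketch, not documented layer behavior.

```rust
/// One-hot encode 1-D indices into rows of width `depth`.
/// (Hypothetical helper; out-of-range handling is an assumption.)
fn one_hot_1d(indices: &[i32], off_value: f32, on_value: f32, depth: usize) -> Vec<Vec<f32>> {
    indices
        .iter()
        .map(|&idx| {
            // Start with every position set to off_value.
            let mut row = vec![off_value; depth];
            // Set on_value at the indexed position when it is in range.
            if idx >= 0 && (idx as usize) < depth {
                row[idx as usize] = on_value;
            }
            row
        })
        .collect()
}
```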
pub fn getNbLayers(self: &INetworkDefinition) -> i32
Get the number of layers in the network.
The number of layers in the network.
See getLayer()
pub fn getLayer(self: &INetworkDefinition, index: i32) -> *mut ILayer
Get the layer specified by the given index.
- index: The index of the layer.
The layer, or nullptr if the index is out of range.
See getNbLayers()
pub fn getNbInputs(self: &INetworkDefinition) -> i32
Get the number of inputs in the network.
The number of inputs in the network.
See getInput()
pub fn getInput(self: &INetworkDefinition, index: i32) -> *mut ITensor
Get the input tensor specified by the given index.
- index: The index of the input tensor.
The input tensor, or nullptr if the index is out of range.
Note: adding inputs invalidates the indexing here.
See getNbInputs()
pub fn getNbOutputs(self: &INetworkDefinition) -> i32
Get the number of outputs in the network.
The outputs include those marked by markOutput or markOutputForShapes.
The number of outputs in the network.
See getOutput()
pub fn getOutput(self: &INetworkDefinition, index: i32) -> *mut ITensor
Get the output tensor specified by the given index.
- index: The index of the output tensor.
The output tensor, or nullptr if the index is out of range.
Note: adding inputs invalidates the indexing here.
See getNbOutputs()
pub fn addReduce( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, operation: ReduceOperation, reduceAxes: u32, keepDimensions: bool, ) -> *mut IReduceLayer
Add a reduce layer to the network.
- input: The input tensor to the layer.
- operation: The reduction operation to perform.
- reduceAxes: The reduction dimensions. The bit in position i of bitmask reduceAxes corresponds to explicit dimension i of the result. E.g., the least significant bit corresponds to the first explicit dimension and the next to least significant bit corresponds to the second explicit dimension.
- keepDimensions: The boolean that specifies whether or not to keep the reduced dimensions in the output of the layer.
The reduce layer works by performing an operation specified by operation to reduce the tensor input across the axes specified by reduceAxes.
See IReduceLayer
If output is an Int32 or Int64 shape tensor, ReduceOperation::kAVG is unsupported.
The new reduce layer, or nullptr if it could not be created.
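The interaction between reduceAxes and keepDimensions can be sketched as an output-shape computation. The function reduced_dims is hypothetical; it assumes that a reduced axis is kept with length 1 when keepDimensions is true and dropped from the shape otherwise.

```rust
/// Compute the output shape of a reduction.
/// Bit i of `reduce_axes` marks dimension i as reduced.
/// (Hypothetical helper; kept axes are assumed to have length 1.)
fn reduced_dims(input: &[i64], reduce_axes: u32, keep_dimensions: bool) -> Vec<i64> {
    let mut out = Vec::new();
    for (i, &d) in input.iter().enumerate() {
        // Only the first 32 axes can be addressed by a u32 mask.
        let reduced = i < 32 && ((reduce_axes >> i) & 1) == 1;
        if !reduced {
            out.push(d);
        } else if keep_dimensions {
            out.push(1);
        }
    }
    out
}
```

Reducing axis 1 of a [2, 3, 4] tensor (mask 0b010) gives [2, 1, 4] with keepDimensions set and [2, 4] without it.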
pub fn addTopK( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, op: TopKOperation, k: i32, reduceAxes: u32, ) -> *mut ITopKLayer
Add a TopK layer to the network.
The TopK layer has two outputs of the same dimensions. The first contains data values, the second contains index positions for the values. Output values are sorted, largest first for operation kMAX and smallest first for operation kMIN.
Currently only values of K up to 3840 are supported.
The default indices tensor (the second output) data type is DataType::kINT32.
- input: The input tensor to the layer.
- op: Operation to perform.
- k: The number of elements to keep. For dynamic k, use the setInput() method to pass in k as a tensor instead, which will override the static k value passed here in calculations.
- reduceAxes: The reduction dimensions. The bit in position i of bitmask reduceAxes corresponds to explicit dimension i of the result. E.g., the least significant bit corresponds to the first explicit dimension and the next to least significant bit corresponds to the second explicit dimension. Currently reduceAxes must specify exactly one dimension, and it must be one of the last four dimensions.
See ITopKLayer
The new TopK layer, or nullptr if it could not be created.
Deprecated in TensorRT 10.14. Superseded by the five-argument addTopK (exposed in these bindings as addTopK1).
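The two-output behavior described above (sorted values plus their index positions) can be illustrated on a 1-D slice. TopKOp and top_k_1d are hypothetical names for this sketch; the ordering of tied values (lower index first, via a stable sort) is an assumption and is not stated by the documentation.

```rust
/// Local stand-in for the two TopK operations named in the text.
#[derive(Clone, Copy)]
enum TopKOp {
    Max,
    Min,
}

/// Return the k selected values and their index positions in `data`.
/// Values are sorted largest first for Max and smallest first for Min.
fn top_k_1d(data: &[f32], op: TopKOp, k: usize) -> (Vec<f32>, Vec<i32>) {
    let mut order: Vec<usize> = (0..data.len()).collect();
    // Stable sort, so tied values keep their original relative order.
    order.sort_by(|&a, &b| match op {
        TopKOp::Max => data[b].total_cmp(&data[a]),
        TopKOp::Min => data[a].total_cmp(&data[b]),
    });
    order.truncate(k);
    let values: Vec<f32> = order.iter().map(|&i| data[i]).collect();
    let indices: Vec<i32> = order.iter().map(|&i| i as i32).collect();
    (values, indices)
}
```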
pub fn addGather( self: Pin<&mut INetworkDefinition>, data: Pin<&mut ITensor>, indices: Pin<&mut ITensor>, axis: i32, ) -> *mut IGatherLayer
Add a gather layer with mode GatherMode::kDEFAULT, the specified axis, and nbElementWiseDims=0.
- data: The tensor to gather values from.
- indices: The tensor to get indices from to populate the output tensor.
- axis: The axis in the data tensor to gather on.
See IGatherLayer
The new gather layer, or nullptr if it could not be created.
pub fn addGatherV2( self: Pin<&mut INetworkDefinition>, data: Pin<&mut ITensor>, indices: Pin<&mut ITensor>, mode: GatherMode, ) -> *mut IGatherLayer
Add a gather layer with the specified mode, axis=0, and nbElementWiseDims=0.
- data: The tensor to gather values from.
- indices: The tensor to get indices from to populate the output tensor.
- mode: The gather mode.
See IGatherLayer
The new gather layer, or nullptr if it could not be created.
pub fn addRaggedSoftMax( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, bounds: Pin<&mut ITensor>, ) -> *mut IRaggedSoftMaxLayer
Add a RaggedSoftMax layer to the network.
- input: The ZxS input tensor.
- bounds: The Zx1 bounds tensor.
The bounds tensor cannot have the last dimension be the wildcard character. Int32 tensors are not valid input tensors. The input and bounds tensors should be 3D tensors.
The new RaggedSoftMax layer, or nullptr if it could not be created.
pub fn addMatrixMultiply( self: Pin<&mut INetworkDefinition>, input0: Pin<&mut ITensor>, op0: MatrixOperation, input1: Pin<&mut ITensor>, op1: MatrixOperation, ) -> *mut IMatrixMultiplyLayer
Add a MatrixMultiply layer to the network.
- input0: The first input tensor (commonly A).
- op0: The operation to apply to input0.
- input1: The second input tensor (commonly B).
- op1: The operation to apply to input1.
The inputs are shape tensors if the output is a shape tensor.
Int32 tensors are not valid input tensors.
The new matrix multiply layer, or nullptr if it could not be created.
pub fn addNonZero( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, ) -> *mut INonZeroLayer
Add a nonzero layer to the network.
The default indices tensor (the first output) data type is DataType::kINT32.
- input: The input tensor to the layer.
See INonZeroLayer
The new nonzero layer, or nullptr if it could not be created.
Deprecated in TensorRT 10.14. Superseded by the two-argument addNonZero (exposed in these bindings as addNonZero1).
pub fn addConstant( self: Pin<&mut INetworkDefinition>, dimensions: &Dims64, weights: Weights, ) -> *mut IConstantLayer
Add a constant layer to the network.
- dimensions: The dimensions of the constant.
- weights: The constant value, represented as weights.
See IConstantLayer
The new constant layer, or nullptr if it could not be created.
If weights.type is DataType::kINT32, the output is a tensor of 32-bit indices. Otherwise the output is a tensor of real values and the output type will follow TensorRT’s normal precision rules.
If a wildcard dimension is used, the volume of the runtime dimensions must equal the number of weights specified.
DataType::kUINT8 is not supported.
pub fn addIdentity( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, ) -> *mut IIdentityLayer
Add an identity layer.
- input: The input tensor to the layer.
See IIdentityLayer
The new identity layer, or nullptr if it could not be created.
pub fn addCast( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, toType: DataType, ) -> *mut ICastLayer
Add a cast layer.
- input: The input tensor to the layer.
- toType: The DataType of the output tensor.
See ICastLayer
The new cast layer, or nullptr if it could not be created.
pub fn removeTensor( self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>, )
Remove a tensor from the network definition.
- tensor: The tensor to remove.
It is illegal to remove a tensor that is the input or output of a layer. If this method is called with such a tensor, a warning will be emitted on the log and the call will be ignored. Its intended use is to remove detached tensors after, e.g., concatenating two networks with ILayer::setInput().
pub fn unmarkOutput( self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>, )
Unmark a tensor as a network output.
- tensor: The tensor to unmark as an output tensor.
See markOutput()
pub fn addSlice( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, start: &Dims64, size: &Dims64, stride: &Dims64, ) -> *mut ISliceLayer
Add a slice layer to the network.
- input: The input tensor to the layer.
- start: The start offset.
- size: The output dimension.
- stride: The slicing stride.
Positive, negative, zero stride values, and combinations of them in different dimensions are allowed.
See ISliceLayer
The new slice layer, or nullptr if it could not be created.
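How start, size and stride interact can be sketched on a 1-D slice, where output element i is read from input position start + i * stride. The function slice_1d is hypothetical, and returning None for an out-of-bounds read is a choice made for this sketch rather than a statement about the layer's behavior.

```rust
/// 1-D slice sketch: output[i] = input[start + i * stride] for i in 0..size.
/// Returns None if any computed position falls outside the input.
/// (Hypothetical helper, not part of the bindings.)
fn slice_1d(input: &[i32], start: i64, size: i64, stride: i64) -> Option<Vec<i32>> {
    let mut out = Vec::new();
    for i in 0..size {
        let idx = start + i * stride;
        if idx < 0 {
            return None;
        }
        // get() returns None when idx is past the end of the input.
        out.push(*input.get(idx as usize)?);
    }
    Some(out)
}
```

On the input [10, 11, 12, 13, 14, 15], a positive stride (start 1, size 3, stride 2) selects 11, 13, 15; a negative stride (start 5, size 3, stride -2) selects 15, 13, 11; and a zero stride repeats a single element.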
pub unsafe fn setName(self: Pin<&mut INetworkDefinition>, name: *const c_char)
Sets the name of the network.
- name: The name to assign to this network.
Set the name of the network so that it can be associated with a built engine. The name must be a null-terminated C-style string. TensorRT makes no use of this string except storing it as part of the engine so that it may be retrieved at runtime. A name unique to the builder will be generated by default.
This method copies the name string.
The string name must be null-terminated, and be at most 4096 bytes including the terminator.
See INetworkDefinition::getName(), ISafeCudaEngine::getName()
pub fn getName(self: &INetworkDefinition) -> *const c_char
Returns the name associated with the network.
The memory pointed to by getName() is owned by the INetworkDefinition object.
See INetworkDefinition::setName()
A null-terminated C-style string representing the name of the network.
pub fn addShape( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, ) -> *mut IShapeLayer
Add a shape layer to the network.
- input: The input tensor to the layer.
See IShapeLayer
addShape is only supported when hasImplicitBatchDimensions is false.
The new shape layer, or nullptr if it could not be created.
pub fn getFlags(self: &INetworkDefinition) -> u32
Get the network definition creation flags for this network definition object. Defaults to 0.
The network definition creation options as a bitmask.
pub fn getFlag( self: &INetworkDefinition, networkDefinitionCreationFlag: NetworkDefinitionCreationFlag, ) -> bool
Returns true if the network definition creation flag is set.
See getFlags()
True if flag is set, false if unset.
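The relationship between the getFlags() bitmask and a single-flag query can be sketched as a bit test. The sketch assumes that each flag occupies the bit whose position equals the flag's integer value; CreationFlagSketch and its variants are hypothetical placeholders, not the real NetworkDefinitionCreationFlag enumerators.

```rust
/// Hypothetical flag enum; the variants are placeholders for illustration.
#[derive(Clone, Copy)]
#[repr(u32)]
enum CreationFlagSketch {
    First = 0,
    Second = 1,
}

/// Test whether `flag` is set in a bitmask.
/// (Assumption: flag value n corresponds to bit n of the mask.)
fn flag_is_set(flags: u32, flag: CreationFlagSketch) -> bool {
    (flags & (1u32 << (flag as u32))) != 0
}
```

With the default mask of 0, no flag is reported as set.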
pub fn markOutputForShapes( self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>, ) -> bool
Enable tensor’s value to be computed by IExecutionContext::getShapeBinding.
True if successful, false if tensor is already marked as an output.
The tensor must be of type DataType::kINT32 and have no more than one dimension.
The tensor must have dimensions that can be determined to be constants at build time.
It is an error to mark a network input as a shape output.
pub fn unmarkOutputForShapes( self: Pin<&mut INetworkDefinition>, tensor: Pin<&mut ITensor>, ) -> bool
Undo markOutputForShapes.
inputs to addShape cannot contain wildcard dimension values.
True if successful, false if tensor is not marked as an output.
pub fn addParametricReLU( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, slope: Pin<&mut ITensor>, ) -> *mut IParametricReLULayer
Add a parametric ReLU layer to the network.
- input: The input tensor to the layer.
- slope: The slope tensor to the layer. This tensor should be unidirectionally broadcastable to the input tensor.
Tensors of type Int32, Int64, Bool, or UInt8 are not allowed as inputs.
The new parametric ReLU layer, or nullptr if it could not be created.
pub fn addConvolutionNd( self: Pin<&mut INetworkDefinition>, input: Pin<&mut ITensor>, nbOutputMaps: i64, kernelSize: &Dims64, kernelWeights: Weights, biasWeights: Weights, ) -> *mut IConvolutionLayer
Add a multi-dimension convolution layer to the network.
- input: The input tensor to the convolution.
- nbOutputMaps: The number of output feature maps for the convolution.
- kernelSize: The multi-dimensions of the convolution kernel.
- kernelWeights: The kernel weights for the convolution.
- biasWeights: The bias weights for the convolution. Weights{} represents no bias.
It is an error to specify a wildcard value for the ‘C’ dimension of the input tensor. Int32 tensors are not valid input tensors. Only 2D or 3D convolution is supported.
The new convolution layer, or nullptr if it could not be created.
pub fn addPoolingNd(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
type_: PoolingType,
windowSize: &Dims64,
) -> *mut IPoolingLayer
Add a multi-dimension pooling layer to the network.
- input: The input tensor to the layer.
- type_: The type of pooling to apply.
- windowSize: The size of the pooling window.
See IPoolingLayer, PoolingType
Int32 tensors are not valid input tensors. Only 2D or 3D pooling is supported.
The new pooling layer, or nullptr if it could not be created.
pub fn addDeconvolutionNd(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
nbOutputMaps: i64,
kernelSize: Dims64,
kernelWeights: Weights,
biasWeights: Weights,
) -> *mut IDeconvolutionLayer
The new deconvolution layer, or nullptr if it could not be created.
pub fn addScaleNd(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
mode: ScaleMode,
shift: Weights,
scale: Weights,
power: Weights,
channelAxis: i32,
) -> *mut IScaleLayer
Add a multi-dimension scale layer to the network.
- input: The input tensor to the layer.
- mode: The scaling mode.
- shift: The shift value.
- scale: The scale value.
- power: The power value.
- channelAxis: The channel axis.
If the weights are available, the number of weights depends on the ScaleMode. For ScaleMode::kUNIFORM, the number of weights equals 1. For ScaleMode::kCHANNEL, the number of weights equals the channel dimension. For ScaleMode::kELEMENTWISE, the number of weights equals the product of all input dimensions at channelAxis and beyond.
For example, if the input dimensions are [A,B,C,D,E,F] and channelAxis=2: for ScaleMode::kUNIFORM, the number of weights is 1; for ScaleMode::kCHANNEL, the number of weights is C; for ScaleMode::kELEMENTWISE, the number of weights is C*D*E*F.
channelAxis can also be set explicitly using setChannelAxis().
See IScaleLayer
See [setChannelAxis()]
Int32 tensors are not valid input tensors. Only 2D or 3D scale is supported.
The new Scale layer, or nullptr if it could not be created.
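The weight-count rule for addScaleNd can be checked before building the Weights arguments. This is a minimal sketch in plain Rust; `ScaleModeSketch` and `expected_weight_count` are hypothetical stand-ins, not types or functions from the bindings, and the channel axis is assumed to be a non-negative index.

```rust
/// Stand-in for ScaleMode (hypothetical; the real enum lives in the bindings).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ScaleModeSketch {
    Uniform,
    Channel,
    Elementwise,
}

/// Number of weights each of shift/scale/power is expected to hold,
/// following the rule stated above. Returns None if `channel_axis`
/// is not a valid index into `dims`.
pub fn expected_weight_count(
    dims: &[i64],
    channel_axis: usize,
    mode: ScaleModeSketch,
) -> Option<i64> {
    if channel_axis >= dims.len() {
        return None;
    }
    Some(match mode {
        ScaleModeSketch::Uniform => 1,
        ScaleModeSketch::Channel => dims[channel_axis],
        // Product of all dimensions at channel_axis and beyond.
        ScaleModeSketch::Elementwise => dims[channel_axis..].iter().product(),
    })
}
```

With dimensions [2,3,4,5,6,7] and channel axis 2, the three modes give 1, 4, and 4*5*6*7 = 840 weights.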
pub fn addResize(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
) -> *mut IResizeLayer
Add a resize layer to the network.
- input: The input tensor to the layer.
See IResizeLayer
Int32 tensors are not valid input tensors.
The new resize layer, or nullptr if it could not be created.
pub fn addLoop(self: Pin<&mut INetworkDefinition>) -> *mut ILoop
Add a loop to the network.
An ILoop provides a way to specify a recurrent subgraph.
Pointer to ILoop that can be used to add loop-boundary layers for the loop.
See ILoop
pub fn addIfConditional(
self: Pin<&mut INetworkDefinition>,
) -> *mut IIfConditional
Add an if-then-else to the network.
An IIfConditional provides a way to conditionally execute parts of the network.
Pointer to the IIfConditional that can be used to add conditional-boundary layers for the if-then-else.
See IIfConditional
pub fn addSelect(
self: Pin<&mut INetworkDefinition>,
condition: Pin<&mut ITensor>,
thenInput: Pin<&mut ITensor>,
elseInput: Pin<&mut ITensor>,
) -> *mut ISelectLayer
Add a select layer to the network.
- condition: The condition tensor to the layer. Must have type DataType::kBOOL.
- thenInput: The “then” input tensor to the layer.
- elseInput: The “else” input tensor to the layer.
All three input tensors must have the same rank, and along each axis must have the same length or a length of one. If the length is one, the tensor is broadcast along that axis. The output tensor has the dimensions of the inputs AFTER the broadcast rule is applied. For example, given:
- dimensions of condition: [1,1,5,9]
- dimensions of thenInput: [1,1,5,9]
- dimensions of elseInput: [1,3,1,9]
the output dimensions are [1,3,5,9], and the output contents are defined by:
output[0,i,j,k] = condition[0,0,j,k] ? thenInput[0,0,j,k] : elseInput[0,i,0,k]
The output dimensions are not necessarily the max of the input dimensions if any input is an empty tensor. For example, if in the preceding example, 5 is changed to 0:
- dimensions of condition: [1,1,0,9]
- dimensions of thenInput: [1,1,0,9]
- dimensions of elseInput: [1,3,1,9]
then the output dimensions are [1,3,0,9].
The inputs are shape tensors if the output is a shape tensor.
See ISelectLayer
The new select layer, or nullptr if it could not be created.
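The broadcast rule described for addSelect (equal rank; per axis, lengths must match or be one) can be sketched as a small shape calculator. The function name `broadcast_dims` is hypothetical and the code only models the shape arithmetic, not the layer itself.

```rust
/// Compute the broadcast output dimensions for tensors of equal rank.
/// Along each axis, every length must be either 1 or the same value.
/// Returns None if ranks differ or lengths are incompatible.
pub fn broadcast_dims(shapes: &[&[i64]]) -> Option<Vec<i64>> {
    let first = shapes.first()?;
    let rank = first.len();
    if shapes.iter().any(|s| s.len() != rank) {
        return None;
    }
    let mut out = Vec::with_capacity(rank);
    for axis in 0..rank {
        let mut len: i64 = 1;
        for s in shapes {
            let d = s[axis];
            if d == 1 {
                continue; // length one is broadcast along this axis
            }
            if len == 1 {
                len = d;
            } else if len != d {
                return None; // two different non-unit lengths
            }
        }
        out.push(len);
    }
    Some(out)
}
```

Because a length of one always yields to the other length, an axis with lengths 0 and 1 broadcasts to 0, which is why the output is not simply the per-axis maximum when an input is empty.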
pub unsafe fn addAssertion(
self: Pin<&mut INetworkDefinition>,
condition: Pin<&mut ITensor>,
message: *const c_char,
) -> *mut IAssertionLayer
Add an assertion layer to the network.
- condition: The input tensor to the layer.
- message: A message to print if the assertion fails.
See IAssertionLayer
The new assertion layer, or nullptr if it could not be created.
The input tensor must be a boolean shape tensor.
pub fn addPaddingNd(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
prePadding: &Dims64,
postPadding: &Dims64,
) -> *mut IPaddingLayer
Add a padding layer to the network. Only 2D padding is currently supported.
- input: The input tensor to the layer.
- prePadding: The padding to apply to the start of the tensor.
- postPadding: The padding to apply to the end of the tensor.
See IPaddingLayer
The new padding layer, or nullptr if it could not be created.
pub unsafe fn setWeightsName(
self: Pin<&mut INetworkDefinition>,
weights: Weights,
name: *const c_char,
) -> bool
Associate a name with all current uses of the given weights.
The name must be set after the Weights are used in the network. Lookup is associative. The name applies to all Weights with matching type, value pointer, and count. If Weights with a matching value pointer but a different type or count exist in the network, an error message is issued, the name is rejected, and the function returns false. If the name has already been used for other weights, the function returns false. A nullptr causes the weights to become unnamed, i.e. clears any previous name.
- weights: The weights to be named.
- name: The name to associate with the weights.
true on success.
The string name must be null-terminated, and be at most 4096 bytes including the terminator.
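The null-termination and 4096-byte limit can be enforced on the Rust side before passing a pointer to setWeightsName. A minimal sketch using `std::ffi::CString`; the helper `weights_name` and the constant `MAX_NAME_BYTES` are hypothetical names chosen for illustration.

```rust
use std::ffi::CString;

/// Limit stated above: at most 4096 bytes including the terminator.
pub const MAX_NAME_BYTES: usize = 4096;

/// Build a null-terminated name that satisfies the documented length limit.
/// (Hypothetical helper; not part of the bindings.)
pub fn weights_name(name: &str) -> Result<CString, String> {
    // CString::new rejects interior NUL bytes and appends the terminator.
    let c = CString::new(name).map_err(|e| format!("invalid name: {}", e))?;
    let len = c.as_bytes_with_nul().len();
    if len > MAX_NAME_BYTES {
        return Err(format!(
            "name is {} bytes with terminator; limit is {}",
            len, MAX_NAME_BYTES
        ));
    }
    Ok(c)
}
```

The resulting `CString` must outlive the call; `c.as_ptr()` yields the `*const c_char` that the method expects.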
pub unsafe fn setErrorRecorder(
self: Pin<&mut INetworkDefinition>,
recorder: *mut IErrorRecorder,
)
See [getErrorRecorder()]
pub fn getErrorRecorder(self: &INetworkDefinition) -> *mut IErrorRecorder
Get the ErrorRecorder assigned to this interface.
Retrieves the assigned error recorder object for the given class. A nullptr will be returned if setErrorRecorder has not been called.
A pointer to the IErrorRecorder object that has been registered.
See [setErrorRecorder()]
pub fn addScatter(
self: Pin<&mut INetworkDefinition>,
data: Pin<&mut ITensor>,
indices: Pin<&mut ITensor>,
updates: Pin<&mut ITensor>,
mode: ScatterMode,
) -> *mut IScatterLayer
Add a Scatter layer to the network with specified mode and axis=0.
- data: The input tensor to be updated with additional values.
- indices: Indices of the elements to be updated.
- updates: Values to be used for updates.
- mode: Scatter mode.
See IScatterLayer
The indices tensor data type must be DataType::kINT32. The updates tensor data type must be the same as that of data.
The new Scatter layer, or nullptr if it could not be created.
pub fn addDynamicQuantize(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
axis: i32,
blockSize: i32,
outputType: DataType,
scaleType: DataType,
) -> *mut IDynamicQuantizeLayer
Add a dynamic quantization layer to the network.
This layer performs dynamic block quantization of its input tensor and outputs the quantized data and the computed block scale-factors. The blocked axis dimension size must be divisible by the block size.
- input: The input tensor to be quantized. Its data type must be one of DataType::kFLOAT, DataType::kHALF, or DataType::kBF16. Currently only 2D and 3D inputs are supported.
- axis: The axis that is sliced into blocks. The axis must be the last or second to last dimension.
- blockSize: The number of elements that are quantized using a shared scale factor. Valid values are 16 (NVFP4 quantization) and 32 (MXFP8 quantization).
- outputType: The data type of the quantized output tensor, must be DataType::kFP4 (NVFP4 quantization) or DataType::kFP8 (MXFP8 quantization). Future calls to set output type using setToType or setOutputType must be consistent.
- scaleType: The data type of the scale factor used for quantizing the input data, must be DataType::kFP8 (NVFP4 quantization) or DataType::kE8M0 (MXFP8 quantization).
The new dynamic quantization layer, or nullptr if it could not be created.
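The argument constraints listed for addDynamicQuantize can be collected into a pre-flight check. This is a sketch under stated assumptions: `DTypeSketch` is a hypothetical stand-in for DataType, `axis` is assumed to be a non-negative index, and all dimensions are assumed static (no wildcard values).

```rust
/// Stand-in for the DataType values named above (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DTypeSketch {
    Float,
    Half,
    Bf16,
    Fp4,
    Fp8,
    E8m0,
}

/// Check the documented constraints before calling addDynamicQuantize.
pub fn check_dynamic_quantize(
    input_dims: &[i64],
    input_type: DTypeSketch,
    axis: usize,
    block_size: i64,
    output_type: DTypeSketch,
    scale_type: DTypeSketch,
) -> Result<(), String> {
    use DTypeSketch::*;
    if !matches!(input_type, Float | Half | Bf16) {
        return Err(format!("unsupported input type {:?}", input_type));
    }
    let rank = input_dims.len();
    if rank != 2 && rank != 3 {
        return Err(format!("input rank {} is not 2 or 3", rank));
    }
    // The blocked axis must be the last or second to last dimension.
    if axis >= rank || axis + 2 < rank {
        return Err(format!("axis {} is not one of the last two dimensions", axis));
    }
    // Block size 16 pairs with NVFP4, block size 32 pairs with MXFP8.
    match (block_size, output_type, scale_type) {
        (16, Fp4, Fp8) | (32, Fp8, E8m0) => {}
        _ => return Err("blockSize, outputType, and scaleType do not match".to_string()),
    }
    if input_dims[axis] % block_size != 0 {
        return Err(format!(
            "dimension {} is not divisible by block size {}",
            input_dims[axis], block_size
        ));
    }
    Ok(())
}
```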
pub fn addDynamicQuantizeV2(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
blockShape: &Dims64,
outputType: DataType,
scaleType: DataType,
) -> *mut IDynamicQuantizeLayer
Add a dynamic quantization layer to the network.
This layer performs dynamic block quantization of its input tensor and outputs the quantized data and the computed block scale factors.
- input: The input tensor to be quantized. Its data type must be one of DataType::kFLOAT, DataType::kHALF, or DataType::kBF16.
- blockShape: Defines the block shape for the quantization. Must match the input tensor rank.
- outputType: The data type of the quantized output tensor, must be DataType::kFP4, DataType::kFP8 or DataType::kINT8. Future calls to set output type using setToType or setOutputType must be consistent.
- scaleType: The data type of the scale factor used for quantizing the input data, must be DataType::kFP8, DataType::kE8M0 or DataType::kFLOAT.
The new dynamic quantization layer, or nullptr if it could not be created.
pub fn addGridSample(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
grid: Pin<&mut ITensor>,
) -> *mut IGridSampleLayer
Add a GridSample layer to the network.
- input: The input tensor to the layer.
- grid: The grid tensor to the layer.
See IGridSampleLayer
Creates a GridSample layer with InterpolationMode::kLINEAR, unaligned corners, and SampleMode::kFILL for 4d-shape input tensors.
The new GridSample layer, or nullptr if it could not be created.
pub fn addNMS(
self: Pin<&mut INetworkDefinition>,
boxes: Pin<&mut ITensor>,
scores: Pin<&mut ITensor>,
maxOutputBoxesPerClass: Pin<&mut ITensor>,
) -> *mut INMSLayer
Add a non-maximum suppression layer to the network.
The default indices tensor (the first output) data type is DataType::kINT32.
- boxes: The input boxes tensor to the layer.
- scores: The input scores tensor to the layer.
- maxOutputBoxesPerClass: The input maxOutputBoxesPerClass tensor to the layer.
See INMSLayer
The new NMS layer, or nullptr if it could not be created.
Deprecated in TensorRT 10.14. Superseded by four-argument addNMS.
pub fn addReverseSequence(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
sequenceLens: Pin<&mut ITensor>,
) -> *mut IReverseSequenceLayer
Add a ReverseSequence layer to the network.
- input: The input tensor to the layer. Must have rank >= 2.
- sequenceLens: 1D tensor specifying lengths of sequences to reverse in a batch. The length of the sequenceLens tensor must be equal to the size of the dimension in input tensor specified by batchAxis.
The new ReverseSequence layer, or nullptr if it could not be created.
pub fn addNormalization(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
scale: Pin<&mut ITensor>,
bias: Pin<&mut ITensor>,
axesMask: u32,
) -> *mut INormalizationLayer
Add a normalization layer to the network.
- input: The input tensor to the layer.
- scale: The scale tensor used to scale the normalized output.
- bias: The bias tensor added to the scaled, normalized output.
- axesMask: The axes on which to perform mean calculations. The bit in position i of bitmask axesMask corresponds to explicit dimension i of the result. E.g., the least significant bit corresponds to the first explicit dimension and the next to least significant bit corresponds to the second explicit dimension.
The normalization layer works by performing normalization of the tensor input on the specified axesMask. The result is then scaled by multiplying with scale and adding bias.
The shapes of scale and bias must be the same, and must have the same rank and be unidirectionally broadcastable to the shape of input. Given a 4D NCHW input tensor, the expected shapes for scale and bias are:
- [1, C, 1, 1] for InstanceNormalization
- [1, G, 1, 1] for GroupNormalization. Use addNormalizationV2() instead if [1, C, 1, 1] shapes for scale and bias are required.
The new normalization layer, or nullptr if it could not be created.
Deprecated in TensorRT 10.15. Superseded by addNormalizationV2().
pub fn addCumulative(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
axis: Pin<&mut ITensor>,
operation: CumulativeOperation,
exclusive: bool,
reverse: bool,
) -> *mut ICumulativeLayer
Add a cumulative layer to the network.
- input: The input tensor to the layer.
- axis: The axis tensor to apply the cumulative operation on. Currently, it must be a build-time constant 0D shape tensor and must be in the range [-rank(input), rank(input)-1]. A negative value means counting dimensions from the back.
- operation: The reduction operation to perform.
- exclusive: The boolean that specifies whether it is an exclusive cumulative or inclusive cumulative.
- reverse: The boolean that specifies whether the cumulative operation should be applied backward.
The cumulative layer works by performing the specified cumulative operation to the tensor input on the axis specified by axis.
See ICumulativeLayer
The new cumulative layer, or nullptr if it could not be created.
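The text above does not spell out how `exclusive` and `reverse` change the result. The sketch below shows one common reading, assumed here to follow the ONNX CumSum convention, for a 1D cumulative sum; `cumulative_sum` is a hypothetical function and makes no claim about the layer's actual kernels.

```rust
/// 1D cumulative sum with the two flags interpreted as in ONNX CumSum
/// (an assumption; the page does not define the semantics precisely).
/// - exclusive: element i excludes input[i] from its own sum.
/// - reverse: accumulate from the last element toward the first.
pub fn cumulative_sum(input: &[i64], exclusive: bool, reverse: bool) -> Vec<i64> {
    let n = input.len();
    let mut out = vec![0; n];
    let mut acc = 0;
    for step in 0..n {
        let i = if reverse { n - 1 - step } else { step };
        if exclusive {
            out[i] = acc; // store the running sum before adding input[i]
            acc += input[i];
        } else {
            acc += input[i];
            out[i] = acc; // store the running sum including input[i]
        }
    }
    out
}
```

Under this reading, the input [1, 2, 3] gives [1, 3, 6] (inclusive), [0, 1, 3] (exclusive), [6, 5, 3] (reverse), and [5, 3, 0] (exclusive and reverse).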
pub fn addAttention(
self: Pin<&mut INetworkDefinition>,
query: Pin<&mut ITensor>,
key: Pin<&mut ITensor>,
value: Pin<&mut ITensor>,
normOp: AttentionNormalizationOp,
causal: bool,
) -> *mut IAttention
Add an attention to the network.
- query: A 3D or 4D input query tensor to the layer.
- key: A 3D or 4D input key tensor to the layer.
- value: A 3D or 4D input value tensor to the layer.
- normOp: The normalization operation to perform.
- causal: Use causal inference or not. When true, uses kUPPER_LEFT causal masking.
For padded (BHND) form, query must have shape [batchSize, numHeadsQuery, sequenceLengthQuery, dimHead]. For packed (NHD) form, query must have shape [totalTokens, numHeadsQuery, dimHead]. key and value follow the same convention based on their form. Use IAttention::setQueryForm() and IAttention::setKeyValueForm() to configure the tensor layout. normOp defaults to kSOFTMAX; causal defaults to false.
By default, IAttention is not decomposable and TensorRT will try to use a single fused kernel, which may be more efficient than if the subgraph is expressed without IAttention. Setting the IAttention to decomposable=True allows IAttention to use multiple kernels if no fused kernel support is found.
See IAttention
Deprecated in TensorRT 10.16. Superseded by addAttentionV2 with CausalMaskKind parameter.
The new attention, or nullptr if it could not be created.
pub fn addAttentionV2(
self: Pin<&mut INetworkDefinition>,
query: Pin<&mut ITensor>,
key: Pin<&mut ITensor>,
value: Pin<&mut ITensor>,
normOp: AttentionNormalizationOp,
causalKind: CausalMaskKind,
) -> *mut IAttention
Add an attention to the network with explicit causal mask kind.
- query: A 4d input query tensor to the layer.
- key: A 4d input key tensor to the layer.
- value: A 4d input value tensor to the layer.
- normOp: The normalization operation to perform.
- causalKind: The causal mask alignment orientation. Use kNONE for no causal masking, kUPPER_LEFT for a diagonal anchored at the upper-left corner (legacy default), or kLOWER_RIGHT for a diagonal anchored at the lower-right corner (for LLM generation with s_q != s_kv).
query must have shape [batchSize, numHeadsQuery, sequenceLengthQuery, dimHead]. key and value must have shape [batchSize, numHeadsKeyValue, sequenceLengthKeyValue, dimHead]. pastKey and pastValue must have shape [batchSize, numHeadsKeyValue, sequenceLengthKeyValue, dimHead]. normOp defaults to kSOFTMAX, causalKind defaults to kNONE.
By default, IAttention is not decomposable and TensorRT will try to use a single fused kernel, which may be more efficient than if the subgraph is expressed without IAttention. Setting the IAttention to decomposable=True allows IAttention to use multiple kernels if no fused kernel support is found.
See IAttention, CausalMaskKind
The new attention, or nullptr if it could not be created.
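The difference between the two causal mask anchors matters only when s_q != s_kv. The sketch below encodes the usual convention for these terms, which is an assumption here since the page does not give the formula: upper-left lets query i see keys j <= i, while lower-right shifts the diagonal by s_kv - s_q. `CausalKindSketch` and `may_attend` are hypothetical names.

```rust
/// Stand-in for CausalMaskKind (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CausalKindSketch {
    NoMask,
    UpperLeft,
    LowerRight,
}

/// Whether query position `q` may attend to key position `k`, given the
/// query length `s_q` and key/value length `s_kv`.
/// The mask formulas are the commonly used convention, assumed here.
pub fn may_attend(kind: CausalKindSketch, q: usize, k: usize, s_q: usize, s_kv: usize) -> bool {
    match kind {
        CausalKindSketch::NoMask => true,
        // Diagonal anchored at the upper-left corner: k <= q.
        CausalKindSketch::UpperLeft => k <= q,
        // Diagonal anchored at the lower-right corner: k <= q + (s_kv - s_q),
        // rearranged to avoid unsigned underflow.
        CausalKindSketch::LowerRight => k + s_q <= q + s_kv,
    }
}
```

With a single new query token (s_q = 1) and four cached keys (s_kv = 4), the lower-right anchor lets the token see all four keys, whereas the upper-left anchor would expose only key 0.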
pub fn addRotaryEmbedding(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
cosCache: Pin<&mut ITensor>,
sinCache: Pin<&mut ITensor>,
interleaved: bool,
rotaryEmbeddingDim: i32,
) -> *mut IRotaryEmbeddingLayer
Add a Rotary Position Embedding (RoPE) layer to the network.
- input: The input activation tensor to the layer. The shape must be (batchSize, numHeads, sequenceLength, headSize).
- cosCache: The cosine cache tensor for use in RoPE computation. See the following explanation for the shape requirement.
- sinCache: The sine cache tensor for use in RoPE computation. See the following explanation for the shape requirement.
- interleaved: Whether the input is in interleaved format, i.e., whether the 2-d vectors rotated are taken from adjacent 2 elements in the hidden dimension.
- rotaryEmbeddingDim: The hidden dimension that participates in RoPE.
The RotaryEmbedding layer applies RoPE to the input, using cosCache and sinCache. An optional input, positionIds, can be provided using setInput with index 3. If provided, it is used to index into cosCache and sinCache.
If positionIds is not provided, cosCache and sinCache must have shape (batchSize, sequenceLength, headSize / 2) if rotaryEmbeddingDim is 0, or (batchSize, sequenceLength, rotaryEmbeddingDim / 2) otherwise. If positionIds is provided, cosCache and sinCache must have shape (maxPositionId+1, headSize / 2) if rotaryEmbeddingDim is 0, or (maxPositionId+1, rotaryEmbeddingDim / 2) otherwise. positionIds, if provided, must have shape (batchSize, sequenceLength).
The new RotaryEmbedding layer, or nullptr if it could not be created.
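The cache shape rules above depend on two switches: whether positionIds is provided and whether rotaryEmbeddingDim is 0. A minimal sketch of that decision table; `expected_cache_shape` is a hypothetical helper, and `max_position_id` being `Some` stands for "positionIds is provided".

```rust
/// Expected shape of cosCache / sinCache according to the rules above.
/// `max_position_id = Some(m)` models the case where positionIds is provided;
/// `None` models the case where it is not. (Hypothetical helper.)
pub fn expected_cache_shape(
    batch_size: i64,
    sequence_length: i64,
    head_size: i64,
    rotary_embedding_dim: i64,
    max_position_id: Option<i64>,
) -> Vec<i64> {
    // A rotaryEmbeddingDim of 0 means the full head size participates.
    let rotary = if rotary_embedding_dim == 0 {
        head_size
    } else {
        rotary_embedding_dim
    };
    match max_position_id {
        Some(max_id) => vec![max_id + 1, rotary / 2],
        None => vec![batch_size, sequence_length, rotary / 2],
    }
}
```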
pub fn addKVCacheUpdate(
self: Pin<&mut INetworkDefinition>,
cache: Pin<&mut ITensor>,
update: Pin<&mut ITensor>,
writeIndices: Pin<&mut ITensor>,
cacheMode: KVCacheMode,
) -> *mut IKVCacheUpdateLayer
Add a KVCacheUpdate layer to the network.
- cache: The key/value cache tensor for the layer. The user is responsible for properly allocating and binding the tensor memory.
- update: The newly updated key/value tensor for the layer.
- writeIndices: The write indices tensor for key/value cache updates.
- cacheMode: The mode of the KVCacheUpdate layer. For TensorRT 10.15, only kLINEAR mode is supported.
The expected tensor shapes are as follows:
- cache: [batchSize, numHeads, maxSequenceLength, headSize]
- update: [batchSize, numHeads, sequenceLength, headSize]
- writeIndices: [batchSize]
The cache and update tensors must have the same data type, which can be DataType::kFLOAT,
DataType::kHALF, or DataType::kBF16. Quantized data types are not supported.
The writeIndices tensor must be DataType::kINT32 or DataType::kINT64.
The layer performs in-place updates on the cache tensor. Therefore, the user must ensure that
the cache tensor and the corresponding output tensor share the same device memory address
before execution.
In kLINEAR mode, each update must satisfy the condition
writeIndices[i] + sequenceLength <= maxSequenceLength. Out-of-bound updates will be ignored silently.
The new KVCacheUpdate layer, or nullptr if it could not be created.
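Because out-of-bound updates are dropped silently, it can help to check the kLINEAR condition on the host before enqueueing. A minimal sketch; `linear_update_in_bounds` is a hypothetical helper, and treating negative write indices as out of bounds is an added assumption not stated above.

```rust
/// For each batch entry, report whether the kLINEAR condition
/// writeIndices[i] + sequenceLength <= maxSequenceLength holds.
/// Negative indices are treated as out of bounds (an assumption).
pub fn linear_update_in_bounds(
    write_indices: &[i64],
    sequence_length: i64,
    max_sequence_length: i64,
) -> Vec<bool> {
    write_indices
        .iter()
        .map(|&w| w >= 0 && w + sequence_length <= max_sequence_length)
        .collect()
}
```

Any `false` entry marks a batch element whose update would be ignored, so a caller might log or reject the request instead of proceeding.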
pub fn addMoE(
self: Pin<&mut INetworkDefinition>,
hiddenStates: Pin<&mut ITensor>,
selectedExpertsForTokens: Pin<&mut ITensor>,
scoresForSelectedExperts: Pin<&mut ITensor>,
) -> *mut IMoELayer
Add a MoE (Mixture of Experts) layer to the network.
- hiddenStates: The hidden states tensor input to the MoE layer. Shape: [batchSize, seqLen, hiddenSize].
- selectedExpertsForTokens: The tensor containing expert indices selected for each token. Shape: [batchSize, seqLen, topK].
- scoresForSelectedExperts: The tensor containing scores computed for the selected experts. Shape: [batchSize, seqLen, topK].
See IMoELayer
MoE requires Blackwell or Thor GPU architecture (SM 10.x or SM 11.x). SM 12.x is not currently supported. Performance is limited when seqLen > 16.
The number of selected experts per token could be inferred from the input selectedExpertsForTokens and should be consistent with the topK in the scoresForSelectedExperts.
The new MoE layer, or nullptr if it could not be created.
pub unsafe fn addDistCollective(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
distCollectiveOp: CollectiveOperation,
reduceOp: ReduceOperation,
root: i64,
groups: *mut i64,
groupSize: i64,
) -> *mut IDistCollectiveLayer
Add a DistCollective layer to the network.
- input: The input tensor to the layer.
- distCollectiveOp: The collective operation to perform. See CollectiveOperation for valid values.
- reduceOp: The reduction operation to perform, in case the collective operation is a reduction type: kREDUCE, kREDUCE_SCATTER or kALL_REDUCE. See ReduceOperation for valid values. Use ReduceOperation::kNONE for a CollectiveOperation which does not need a ReduceOperation.
- root: The root rank of the collective operation. Some CollectiveOperations require specifying a root rank, with the following semantics:
  - kBROADCAST: the root rank sends, all other ranks receive data
  - kREDUCE: the root rank receives reduced data, the other ranks send data
  - kGATHER: the root rank receives data gathered from all ranks
  - kSCATTER: the root rank distributes data to all ranks
  For operations that do not use a root rank (kALL_REDUCE, kALL_GATHER, kREDUCE_SCATTER, kALL_TO_ALL), the root parameter is ignored. Use root = -1 as the recommended sentinel value when constructing the layer to make this explicit.
- groups: Pointer to a flat array of rank IDs in the communicator that defines a single group for this layer. The DistCollective runner treats this array as the ordered list of participating ranks; only those ranks take part in the collective, and the order defines the group-local rank (used to remap the root for root-based ops).
- groupSize: The number of elements in the groups array. If groupSize is 0, all ranks participate and groups can be nullptr.
See IDistCollectiveLayer
The new DistCollective layer, or nullptr if it could not be created.
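The root and groups rules for addDistCollective can be modeled with a few small helpers. This is a sketch only: `CollectiveOpSketch` is a hypothetical stand-in for CollectiveOperation, and `group_local_rank` assumes the group-local rank is simply the position of a rank ID in the groups array, as the ordering rule above suggests.

```rust
/// Stand-in for CollectiveOperation (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CollectiveOpSketch {
    Broadcast,
    Reduce,
    Gather,
    Scatter,
    AllReduce,
    AllGather,
    ReduceScatter,
    AllToAll,
}

/// True for the operations that are listed above as using a root rank.
pub fn uses_root(op: CollectiveOpSketch) -> bool {
    use CollectiveOpSketch::*;
    matches!(op, Broadcast | Reduce | Gather | Scatter)
}

/// The value to pass as `root`: the requested rank for root-based
/// operations, or the recommended sentinel -1 otherwise.
pub fn root_argument(op: CollectiveOpSketch, root: i64) -> i64 {
    if uses_root(op) {
        root
    } else {
        -1
    }
}

/// Group-local rank of `rank`, taken as its index in the ordered
/// `groups` array (assumed interpretation). None if not a participant.
pub fn group_local_rank(groups: &[i64], rank: i64) -> Option<usize> {
    groups.iter().position(|&r| r == rank)
}
```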
pub unsafe fn markWeightsRefittable(
self: Pin<&mut INetworkDefinition>,
name: *const c_char,
) -> bool
Mark weights as refittable when the builder flag kREFIT_INDIVIDUAL is set.
- name: The name of the weights.
True if the weights were successfully marked as refittable, false if the weights do not exist or cannot be refitted.
pub unsafe fn unmarkWeightsRefittable(
self: Pin<&mut INetworkDefinition>,
name: *const c_char,
) -> bool
Unmark weights as refittable when the builder flag kREFIT_INDIVIDUAL is set.
- name: The name of the weights.
True if the weights were successfully marked as unrefittable, false if the weights do not exist.
pub unsafe fn areWeightsMarkedRefittable(
self: &INetworkDefinition,
name: *const c_char,
) -> bool
Whether the weight has been marked as refittable.
- name: The name of the weights to check.
True if the weights are marked as refittable, false if the weights do not exist or are marked as non-refittable.
pub fn addSqueeze(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
axes: Pin<&mut ITensor>,
) -> *mut ISqueezeLayer
Add a squeeze layer to the network.
- input: The input tensor to the layer.
- axes: The axes to remove unit dimensions on.
See ISqueezeLayer
Axes must be resolvable to a constant Int32 or Int64 1D shape tensor. Values in axes must be unique and in the range of [-r, r-1], where r is the rank of the input tensor. For each axis value, the corresponding dimension in the input tensor must be one.
The new Squeeze layer, or nullptr if it could not be created.
pub fn addUnsqueeze(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
axes: Pin<&mut ITensor>,
) -> *mut IUnsqueezeLayer
Add an unsqueeze layer to the network.
- input: The input tensor to the layer.
- axes: The axes to add unit dimensions.
See IUnsqueezeLayer
Axes must be resolvable to a constant Int32 or Int64 shape tensor. Values in axes must be unique and in the range of [-r_final, r_final-1], where r_final is the sum of rank(input) and len(axes).
r_final must be less than Dims::MAX_DIMS.
The new Unsqueeze layer, or nullptr if it could not be created.
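The axis rules for addSqueeze and addUnsqueeze (unique values, negative axes counted from the back, and for unsqueeze a range based on r_final = rank(input) + len(axes)) can be sketched as shape functions. All names here are hypothetical, and the Dims::MAX_DIMS limit is not checked.

```rust
/// Map an axis in [-rank, rank-1] to a non-negative index.
fn normalize_axis(axis: i64, rank: usize) -> Option<usize> {
    let r = rank as i64;
    if axis < -r || axis >= r {
        return None;
    }
    Some(if axis < 0 { (axis + r) as usize } else { axis as usize })
}

/// Output dimensions of a squeeze, or None if an axis is out of range,
/// repeated, or refers to a dimension that is not one.
pub fn squeeze_dims(dims: &[i64], axes: &[i64]) -> Option<Vec<i64>> {
    let mut drop = vec![false; dims.len()];
    for &a in axes {
        let i = normalize_axis(a, dims.len())?;
        if drop[i] || dims[i] != 1 {
            return None;
        }
        drop[i] = true;
    }
    let mut out = Vec::new();
    for (i, &d) in dims.iter().enumerate() {
        if !drop[i] {
            out.push(d);
        }
    }
    Some(out)
}

/// Output dimensions of an unsqueeze. Axes are interpreted relative to
/// r_final = dims.len() + axes.len(), as described above.
pub fn unsqueeze_dims(dims: &[i64], axes: &[i64]) -> Option<Vec<i64>> {
    let r_final = dims.len() + axes.len();
    let mut is_new = vec![false; r_final];
    for &a in axes {
        let i = normalize_axis(a, r_final)?;
        if is_new[i] {
            return None; // axes must be unique
        }
        is_new[i] = true;
    }
    // Fill the remaining positions with the input dimensions in order.
    let mut rest = dims.iter();
    let mut out = Vec::with_capacity(r_final);
    for &n in &is_new {
        out.push(if n { 1 } else { *rest.next()? });
    }
    Some(out)
}
```

For example, unsqueezing [3,5] with axes [0,-1] gives r_final = 4, so -1 refers to position 3 and the result is [1,3,5,1].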
pub fn addNormalizationV2(
self: Pin<&mut INetworkDefinition>,
input: Pin<&mut ITensor>,
scale: Pin<&mut ITensor>,
bias: Pin<&mut ITensor>,
axesMask: u32,
) -> *mut INormalizationLayer
Add a normalization layer to the network.
- input: The input tensor to the layer.
- scale: The scale tensor used to scale the normalized output.
- bias: The bias tensor added to the scaled, normalized output.
- axesMask: The axes on which to perform mean calculations. The bit in position i of bitmask axesMask corresponds to explicit dimension i of the result. E.g., the least significant bit corresponds to the first explicit dimension and the next to least significant bit corresponds to the second explicit dimension.
The normalization layer works by performing normalization of the tensor input on the specified axesMask. The result is then scaled by multiplying with scale and adding bias.
The shapes of scale and bias are expected to be the same, and must have the same rank and be unidirectionally broadcastable to the shape of input. In the case of InstanceNorm or GroupNorm, the shapes of scale and bias are expected to be [1, C, 1, 1] in the case of a 4D NCHW input tensor.
The new normalization layer, or nullptr if it could not be created.
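The axesMask argument is a plain bitmask in which bit i selects dimension i. A minimal sketch of building it from a list of axis indices; `axes_mask` is a hypothetical helper name.

```rust
/// Build an axesMask from axis indices: bit i is set for each axis i.
/// Returns None if an axis does not fit in the 32-bit mask.
pub fn axes_mask(axes: &[u32]) -> Option<u32> {
    let mut mask = 0u32;
    for &a in axes {
        if a >= 32 {
            return None;
        }
        mask |= 1u32 << a;
    }
    Some(mask)
}
```

For a 4D NCHW input, selecting the H and W dimensions (axes 2 and 3) gives the mask 0b1100, i.e. 12; whether that is the right set of axes for a given normalization variant is up to the caller.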