Enum BuilderFlag 

#[repr(i32)]
pub enum BuilderFlag {
    kDEBUG = 2,
    kGPU_FALLBACK = 3,
    kREFIT = 4,
    kDISABLE_TIMING_CACHE = 5,
    kTF32 = 6,
    kSPARSE_WEIGHTS = 7,
    kSAFETY_SCOPE = 8,
    kDIRECT_IO = 11,
    kVERSION_COMPATIBLE = 12,
    kEXCLUDE_LEAN_RUNTIME = 13,
    kERROR_ON_TIMING_CACHE_MISS = 15,
    kDISABLE_COMPILATION_CACHE = 17,
    kSTRIP_PLAN = 18,
    kREFIT_IDENTICAL = 19,
    kWEIGHT_STREAMING = 20,
    kREFIT_INDIVIDUAL = 22,
    kSTRICT_NANS = 23,
    kMONITOR_MEMORY = 24,
    kEDITABLE_TIMING_CACHE = 26,
    kDISTRIBUTIVE_INDEPENDENCE = 27,
    kREQUIRE_USER_ALLOCATION = 28,
}
List of valid modes that the builder can enable when creating an engine from a network definition.

See IBuilderConfig::setFlags(), IBuilderConfig::getFlags()
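
In the TensorRT C++ API, setFlags() and getFlags() work with a 32-bit mask in which each flag occupies the bit at the position given by its enum value. The sketch below mirrors that convention under stated assumptions: the BuilderFlag type here is a local stand-in carrying a few discriminants from this page, and flag_bit, flags_mask, and has_flag are hypothetical helpers, not part of the generated bindings, which may expose flags differently.

```rust
/// Local stand-in with a few discriminants copied from this page.
/// The real enum comes from the generated bindings.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
#[allow(non_camel_case_types, dead_code)]
pub enum BuilderFlag {
    kDEBUG = 2,
    kREFIT = 4,
    kTF32 = 6,
    kSPARSE_WEIGHTS = 7,
}

/// Bit that represents `flag` in a flags mask (hypothetical helper).
pub fn flag_bit(flag: BuilderFlag) -> u32 {
    1u32 << (flag as i32 as u32)
}

/// Combine several flags into one mask by OR-ing their bits.
pub fn flags_mask(flags: &[BuilderFlag]) -> u32 {
    flags.iter().fold(0, |mask, &f| mask | flag_bit(f))
}

/// Check whether `flag` is set in `mask`.
pub fn has_flag(mask: u32, flag: BuilderFlag) -> bool {
    (mask & flag_bit(flag)) != 0
}
```

Because the largest discriminant on this page is 28, every flag fits in a u32 mask.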

Variants§

§

kDEBUG = 2

Enable debugging of layers via synchronizing after every layer.

§

kGPU_FALLBACK = 3

Allow layers that cannot execute on DLA to fall back to executing on the GPU.

§

kREFIT = 4

Enable building a refittable engine.

§

kDISABLE_TIMING_CACHE = 5

Disable reuse of timing information across identical layers.

§

kTF32 = 6

Allow (but not require) computations on tensors of type DataType::kFLOAT to use TF32. TF32 computes inner products by rounding the inputs to 10-bit mantissas before multiplying, but accumulates the sum using 23-bit mantissas. Enabled by default.
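
The arithmetic described above can be emulated on the CPU to get a feel for the precision loss. This is a minimal sketch, not the hardware algorithm: round_to_tf32 and dot_tf32 are hypothetical names, and round-to-nearest-even is an assumption about how inputs are rounded. Inputs are reduced to 10 mantissa bits, while products and sums stay in f32, which has a 23-bit mantissa.

```rust
/// Round an f32 to a 10-bit mantissa by dropping the low 13 mantissa bits.
/// Rounding mode (nearest, ties to even) is an assumption of this sketch.
pub fn round_to_tf32(x: f32) -> f32 {
    if !x.is_finite() {
        return x; // leave NaN and infinities untouched
    }
    const DROP: u32 = 13; // 23 stored mantissa bits - 10 kept bits
    let bits = x.to_bits();
    let lsb = (bits >> DROP) & 1; // lowest bit that survives
    let rounded = bits.wrapping_add((1u32 << (DROP - 1)) - 1 + lsb);
    f32::from_bits(rounded & !((1u32 << DROP) - 1))
}

/// Inner product with inputs rounded to 10-bit mantissas and
/// accumulation carried out in f32.
pub fn dot_tf32(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| round_to_tf32(x) * round_to_tf32(y))
        .sum()
}
```

Values that already fit in 10 mantissa bits, such as small integers, pass through unchanged, so the difference from a plain f32 dot product only shows up for inputs with more significant bits.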

§

kSPARSE_WEIGHTS = 7

Allow the builder to examine weights and use optimized functions when weights have suitable sparsity.

§

kSAFETY_SCOPE = 8

Change the allowed parameters in the EngineCapability::kSTANDARD flow to match the restrictions that EngineCapability::kSAFETY checks against for the DeviceType::kGPU case and that EngineCapability::kDLA_STANDALONE checks against for the DeviceType::kDLA case. This flag is forced to true when EngineCapability::kSAFETY is used at build time, even if it is unset.

This flag is only supported in NVIDIA Drive(R) products.

Deprecated in TensorRT 10.16. In EngineCapability::kSTANDARD flow, safety restrictions are no longer supported. In EngineCapability::kSAFETY and EngineCapability::kDLA_STANDALONE flows, restrictions are enforced natively. This flag is retained for API compatibility but is ignored.

§

kDIRECT_IO = 11

Require that no reformats be inserted between a layer and a network I/O tensor for which ITensor::setAllowedFormats was called. Build fails if a reformat is required for functional correctness. Deprecated in TensorRT 10.7. Unneeded API.

§

kVERSION_COMPATIBLE = 12

Restrict to lean runtime operators to provide version forward compatibility for the plan.

This flag is only supported by NVIDIA Volta and later GPUs. This flag is not supported in NVIDIA Drive(R) products.

§

kEXCLUDE_LEAN_RUNTIME = 13

Exclude the lean runtime from the plan when version forward compatibility is enabled. By default, this flag is unset, so the lean runtime will be included in the plan.

If BuilderFlag::kVERSION_COMPATIBLE is not set then the value of this flag will be ignored.

§

kERROR_ON_TIMING_CACHE_MISS = 15

Emit error when a tactic being timed is not present in the timing cache. This flag has an effect only when IBuilderConfig has an associated ITimingCache.

§

kDISABLE_COMPILATION_CACHE = 17

Disable caching of JIT-compilation results during engine build. By default, JIT-compiled code will be serialized as part of the timing cache, which may significantly increase the cache size. Setting this flag prevents the code from being serialized. This flag has an effect only when BuilderFlag::kDISABLE_TIMING_CACHE is not set.

§

kSTRIP_PLAN = 18

Strip the refittable weights from the engine plan file.

§

kREFIT_IDENTICAL = 19

Create a refittable engine under the assumption that the refit weights will be identical to those provided at build time. The resulting engine will have the same performance as a non-refittable one. All refittable weights can be refitted through the refit API, but if the refit weights are not identical to the build-time weights, behavior is undefined. When used alongside ‘kSTRIP_PLAN’, this flag will result in a small plan file for which weights are later supplied via refitting. This enables use of a single set of weights with different inference backends, or with TensorRT plans for multiple GPU architectures.

§

kWEIGHT_STREAMING = 20

Enable weight streaming for the current engine.

Weight streaming from the host enables execution of models that do not fit in GPU memory by allowing TensorRT to intelligently stream network weights from the CPU DRAM. Please see ICudaEngine::getMinimumWeightStreamingBudget for the default memory budget when this flag is enabled.

Enabling this feature changes the behavior of IRuntime::deserializeCudaEngine to allocate the entire network’s weights on the CPU DRAM instead of GPU memory. Then, ICudaEngine::createExecutionContext will determine the optimal split of weights between the CPU and GPU and place weights accordingly.

Future TensorRT versions may enable this flag by default.

Enabling this flag may marginally increase build time.

Enabling this feature will significantly increase the latency of ICudaEngine::createExecutionContext.

See IRuntime::deserializeCudaEngine, ICudaEngine::getMinimumWeightStreamingBudget, ICudaEngine::setWeightStreamingBudget

§

kREFIT_INDIVIDUAL = 22

Enable building a refittable engine and provide fine-grained control. This allows control over which weights are refittable or not using INetworkDefinition::markWeightsRefittable and INetworkDefinition::unmarkWeightsRefittable. By default, all weights are non-refittable when this flag is enabled. This flag cannot be used together with kREFIT or kREFIT_IDENTICAL.
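
The exclusivity rule above can be checked before a configuration is handed to the builder. The following is a sketch under assumptions: BuilderFlag is a local stand-in carrying only the three refit-related discriminants from this page, and check_refit_flags is a hypothetical helper rather than anything the bindings provide.

```rust
/// Local stand-in with the refit-related discriminants from this page.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[allow(non_camel_case_types, dead_code)]
pub enum BuilderFlag {
    kREFIT = 4,
    kREFIT_IDENTICAL = 19,
    kREFIT_INDIVIDUAL = 22,
}

/// Reject flag sets that combine kREFIT_INDIVIDUAL with kREFIT or
/// kREFIT_IDENTICAL (hypothetical pre-build check).
pub fn check_refit_flags(flags: &[BuilderFlag]) -> Result<(), String> {
    if !flags.contains(&BuilderFlag::kREFIT_INDIVIDUAL) {
        return Ok(());
    }
    for other in [BuilderFlag::kREFIT, BuilderFlag::kREFIT_IDENTICAL] {
        if flags.contains(&other) {
            return Err(format!(
                "kREFIT_INDIVIDUAL cannot be combined with {:?}",
                other
            ));
        }
    }
    Ok(())
}
```

The check only covers the combination this page rules out; it makes no claim about whether kREFIT and kREFIT_IDENTICAL may be set together.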

§

kSTRICT_NANS = 23

Disable floating-point optimizations: 0*x => 0, x-x => 0, or x/x => 1. These identities are not true when x is a NaN or Inf, and thus might hide propagation or generation of NaNs. This flag is typically used in combination with kSPARSE_WEIGHTS. There are three valid sparsity configurations.

  1. Disable all sparsity. Both kSPARSE_WEIGHTS and kSTRICT_NANS are unset
  2. Enable sparsity only where it does not affect propagation/generation of NaNs. Both kSPARSE_WEIGHTS and kSTRICT_NANS are set
  3. Enable all sparsity. kSPARSE_WEIGHTS is set and kSTRICT_NANS is unset
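
The three configurations listed above can be written as a small lookup. This is an illustrative sketch: SparsityConfig and sparsity_config are hypothetical names, and the fourth combination (kSTRICT_NANS set without kSPARSE_WEIGHTS) is returned as None only because this page does not list it among the sparsity configurations.

```rust
/// The three sparsity configurations described on this page.
#[derive(Debug, PartialEq, Eq)]
pub enum SparsityConfig {
    /// Both kSPARSE_WEIGHTS and kSTRICT_NANS are unset.
    Disabled,
    /// Both flags are set: sparsity only where NaN behavior is unaffected.
    NanSafeOnly,
    /// kSPARSE_WEIGHTS is set and kSTRICT_NANS is unset.
    All,
}

/// Map the two flag states to a listed configuration, or None for the
/// combination this page does not describe.
pub fn sparsity_config(sparse_weights: bool, strict_nans: bool) -> Option<SparsityConfig> {
    match (sparse_weights, strict_nans) {
        (false, false) => Some(SparsityConfig::Disabled),
        (true, true) => Some(SparsityConfig::NanSafeOnly),
        (true, false) => Some(SparsityConfig::All),
        (false, true) => None,
    }
}
```
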
§

kMONITOR_MEMORY = 24

Enable memory monitor during build time.

§

kEDITABLE_TIMING_CACHE = 26

Enable editable timing cache.

§

kDISTRIBUTIVE_INDEPENDENCE = 27

Enable distributive independence. When BuilderFlag::kDISTRIBUTIVE_INDEPENDENCE is set and a layer documents axis i of an output as a distributive axis, then the layer behaves exactly as if each evaluation across axis i was done using identical operations. The definition of distributive axis is as follows:

  1. For IMatrixMultiplyLayer: All axes that are not one of the vector or matrix dimensions are distributive axes.
  2. For layers that perform reduction: All non-reduction axes are distributive axes.
  3. For layers that perform einsum: Let n be the leftmost reduction axis. The axes to the left of n are distributive axes.

§

kREQUIRE_USER_ALLOCATION = 28

Build an engine that requires user allocation when creating an execution context. This means that runtime allocation will not be enabled even when the tensor dimensions exceed the limits for static allocation, and ensures that inference will support graph capture unless the network includes operations such as data-dependent dynamic shapes (INonZeroLayer, ITripLimitLayer, etc.) that require runtime allocation. If such operations are present, the engine build will fail with an error message.

Trait Implementations§

Source§

impl Clone for BuilderFlag

Source§

fn clone(&self) -> BuilderFlag

Returns a duplicate of the value.
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Eq for BuilderFlag

Source§

impl ExternType for BuilderFlag

Source§

type Id = (n, v, i, n, f, e, r, _1, (), B, u, i, l, d, e, r, F, l, a, g)

A type-level representation of the type’s C++ namespace and type name.
Source§

type Kind = Trivial

Source§

impl From<BuilderFlag> for BuilderFlag

Source§

fn from(value: BuilderFlag) -> Self

Converts to this type from the input type.
Source§

impl Hash for BuilderFlag

Source§

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.
1.3.0 · Source§

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.
Source§

impl Into<BuilderFlag> for BuilderFlag

Source§

fn into(self) -> BuilderFlag

Converts this type into the (usually inferred) input type.
Source§

impl PartialEq for BuilderFlag

Source§

fn eq(&self, other: &BuilderFlag) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 (const: unstable) · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl SharedPtrTarget for BuilderFlag

Source§

impl StructuralPartialEq for BuilderFlag

Source§

impl UniquePtrTarget for BuilderFlag

Source§

impl VectorElement for BuilderFlag

Source§

impl WeakPtrTarget for BuilderFlag

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithinBoxTrivial for T
where T: ExternType<Kind = Trivial> + Unpin,

Source§

fn within_box(self) -> Pin<Box<T>>

Source§

impl<T> WithinUniquePtrTrivial for T
where T: UniquePtrTarget + ExternType<Kind = Trivial> + Unpin,