

Struct IBuilderConfig 

Source
pub struct IBuilderConfig { /* private fields */ }

IBuilderConfig

Holds properties for configuring a builder to produce an engine.

See BuilderFlags

Implementations§

Source§

impl IBuilderConfig

Source

pub fn getAvgTimingIterations(self: &IBuilderConfig) -> i32

Query the number of averaging iterations.

By default the number of averaging iterations is 1.

See [setAvgTimingIterations()]

Source

pub fn setEngineCapability( self: Pin<&mut IBuilderConfig>, capability: EngineCapability, )

Configure the builder to target specified EngineCapability flow.

The flow means a sequence of API calls that allow an application to set up a runtime, engine, and execution context in order to run inference.

The supported flows are specified in the EngineCapability enum.

Source

pub fn getEngineCapability(self: &IBuilderConfig) -> EngineCapability

Query EngineCapability flow configured for the builder.

By default it returns EngineCapability::kSTANDARD.

See [setEngineCapability()]

Source

pub fn setFlags(self: Pin<&mut IBuilderConfig>, builderFlags: u32)

Set the build mode flags to turn on builder options for this network.

The flags are listed in the BuilderFlags enum. The flags set configuration options to build the network.

  • builderFlags The build option for an engine.

This function overrides the previously set flags rather than bitwise ORing in the new flags.

See [getFlags()]

Source

pub fn getFlags(self: &IBuilderConfig) -> u32

Get the build mode flags for this builder config. Defaults to 0.

The build options as a bitmask.

See [setFlags()]

Source

pub fn clearFlag(self: Pin<&mut IBuilderConfig>, builderFlag: BuilderFlag)

Clear a single build mode flag.

Clears the builder mode flag from the enabled flags.

See [setFlags()]

Source

pub fn setFlag(self: Pin<&mut IBuilderConfig>, builderFlag: BuilderFlag)

Set a single build mode flag.

Add the input builder mode flag to the already enabled flags.

See [setFlags()]

Source

pub fn getFlag(self: &IBuilderConfig, builderFlag: BuilderFlag) -> bool

Returns true if the build mode flag is set.

See [getFlags()]

True if flag is set, false if unset.
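The flag functions above behave like operations on one 32-bit mask. The following is a minimal sketch of those semantics, not code from these bindings: `DemoFlag` and `FlagMask` are hypothetical stand-ins, and mapping each flag to the bit at its discriminant is an assumption borrowed from the tactic-source expression shown elsewhere on this page.

```rust
/// Hypothetical stand-in for `BuilderFlag`; the discriminants are illustrative only.
#[derive(Clone, Copy)]
enum DemoFlag {
    A = 0,
    B = 1,
    C = 2,
}

/// Hypothetical model of the flag state held by a builder config.
#[derive(Default)]
struct FlagMask(u32);

impl FlagMask {
    /// Assumption: a flag occupies the bit at its discriminant.
    fn bit(flag: DemoFlag) -> u32 {
        1u32 << (flag as u32)
    }

    /// Mirrors `setFlags`: replaces the whole mask.
    fn set_flags(&mut self, mask: u32) {
        self.0 = mask;
    }

    /// Mirrors `getFlags`: returns the mask, which defaults to 0.
    fn get_flags(&self) -> u32 {
        self.0
    }

    /// Mirrors `setFlag`: ORs a single flag into the mask.
    fn set_flag(&mut self, flag: DemoFlag) {
        self.0 |= Self::bit(flag);
    }

    /// Mirrors `clearFlag`: removes a single flag from the mask.
    fn clear_flag(&mut self, flag: DemoFlag) {
        self.0 &= !Self::bit(flag);
    }

    /// Mirrors `getFlag`: tests a single flag.
    fn get_flag(&self, flag: DemoFlag) -> bool {
        (self.0 & Self::bit(flag)) != 0
    }
}

fn main() {
    let mut flags = FlagMask::default();
    flags.set_flag(DemoFlag::A);
    flags.set_flag(DemoFlag::C);
    println!("after two set_flag calls: {:#b}", flags.get_flags());

    // set_flags overrides: A and C are dropped, only B remains.
    flags.set_flags(FlagMask::bit(DemoFlag::B));
    println!("A still set? {}", flags.get_flag(DemoFlag::A));

    flags.clear_flag(DemoFlag::B);
    println!("after clear_flag: {:#b}", flags.get_flags());
}
```

The point of the sketch is the difference between `set_flags`, which discards earlier state, and `set_flag`, which accumulates.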

Source

pub unsafe fn setDeviceType( self: Pin<&mut IBuilderConfig>, layer: *const ILayer, deviceType: DeviceType, )

Set the device that this layer must execute on.

  • layer The layer whose device type is being set.
  • deviceType that this layer must execute on. If DeviceType is not set or is reset, TensorRT will use the default DeviceType set in the builder.

The device type for a layer must be compatible with the safety flow (if specified). For example a layer cannot be marked for DLA execution while the builder is configured for kSAFETY.

See [getDeviceType()]

Source

pub unsafe fn getDeviceType( self: &IBuilderConfig, layer: *const ILayer, ) -> DeviceType

Get the device that this layer executes on.

Returns DeviceType of the layer.

Source

pub unsafe fn isDeviceTypeSet( self: &IBuilderConfig, layer: *const ILayer, ) -> bool

Whether the DeviceType has been explicitly set for this layer.

Returns true if the device type is not the default.

See [setDeviceType()], getDeviceType(), resetDeviceType()

Source

pub unsafe fn resetDeviceType( self: Pin<&mut IBuilderConfig>, layer: *const ILayer, )

Reset the DeviceType for this layer.

See [setDeviceType()], getDeviceType(), isDeviceTypeSet()

Source

pub unsafe fn canRunOnDLA(self: &IBuilderConfig, layer: *const ILayer) -> bool

Checks if a layer can run on DLA.

Returns true if the layer can run on DLA, false otherwise.

Source

pub fn setDLACore(self: Pin<&mut IBuilderConfig>, dlaCore: i32)

Sets the DLA core used by the network. Defaults to -1.

  • dlaCore The DLA core to execute the engine on, in the range [0, getNbDLACores()).

This function is used to specify which DLA core to use via indexing, if multiple DLA cores are available.

If getNbDLACores() returns 0, this function does nothing.

See IRuntime::setDLACore(), getDLACore()

Source

pub fn getDLACore(self: &IBuilderConfig) -> i32

Get the DLA core that the engine executes on.

Returns the assigned DLA core, or -1 if DLA is not present or no core has been set.

Source

pub fn setDefaultDeviceType( self: Pin<&mut IBuilderConfig>, deviceType: DeviceType, )

Sets the default DeviceType to be used by the builder. It ensures that all the layers that can run on this device will run on it, unless setDeviceType is used to override the default DeviceType for a layer.

See [getDefaultDeviceType()]

Source

pub fn getDefaultDeviceType(self: &IBuilderConfig) -> DeviceType

Get the default DeviceType which was set by setDefaultDeviceType.

By default it returns DeviceType::kGPU.
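The interplay between the per-layer functions and the default device type can be pictured as a lookup with a fallback. This is a minimal sketch under stated assumptions: `Device`, `DevicePlan`, and keying layers by name are all hypothetical, and the model ignores whether a layer is actually able to run on the chosen device (the role of canRunOnDLA).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `DeviceType`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Device {
    Gpu,
    Dla,
}

/// Hypothetical model: one default plus optional per-layer overrides.
struct DevicePlan {
    default_device: Device,
    overrides: HashMap<String, Device>, // keyed by an illustrative layer name
}

impl DevicePlan {
    /// The default device starts as GPU, matching getDefaultDeviceType.
    fn new() -> Self {
        Self { default_device: Device::Gpu, overrides: HashMap::new() }
    }

    /// Mirrors setDefaultDeviceType.
    fn set_default(&mut self, device: Device) {
        self.default_device = device;
    }

    /// Mirrors setDeviceType: records an explicit per-layer choice.
    fn set(&mut self, layer: &str, device: Device) {
        self.overrides.insert(layer.to_owned(), device);
    }

    /// Mirrors resetDeviceType: the layer falls back to the default again.
    fn reset(&mut self, layer: &str) {
        self.overrides.remove(layer);
    }

    /// Mirrors isDeviceTypeSet.
    fn is_set(&self, layer: &str) -> bool {
        self.overrides.contains_key(layer)
    }

    /// Mirrors getDeviceType: the override if present, otherwise the default.
    fn get(&self, layer: &str) -> Device {
        self.overrides.get(layer).copied().unwrap_or(self.default_device)
    }
}

fn main() {
    let mut plan = DevicePlan::new();
    plan.set_default(Device::Dla);
    plan.set("conv1", Device::Gpu);
    println!("conv1 -> {:?}, conv2 -> {:?}", plan.get("conv1"), plan.get("conv2"));
    plan.reset("conv1");
    println!("conv1 explicitly set? {}", plan.is_set("conv1"));
}
```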

Source

pub fn reset(self: Pin<&mut IBuilderConfig>)

Resets the builder configuration to defaults.

Useful for initializing a builder config object to its original state.

Source

pub unsafe fn setProfileStream( self: Pin<&mut IBuilderConfig>, stream: *mut CUstream_st, )

Set the CUDA stream that is used to profile this network.

  • stream The CUDA stream used for profiling by the builder.

See [getProfileStream()]

Source

pub fn getProfileStream(self: &IBuilderConfig) -> *mut CUstream_st

Get the CUDA stream that is used to profile this network.

The CUDA stream set by setProfileStream, nullptr if setProfileStream has not been called.

See [setProfileStream()]

Source

pub unsafe fn addOptimizationProfile( self: Pin<&mut IBuilderConfig>, profile: *const IOptimizationProfile, ) -> i32

Add an optimization profile.

This function must be called at least once if the network has dynamic or shape input tensors. It may be called at most once when building a refittable engine, because refittable engines do not support more than one optimization profile.

  • profile The new optimization profile, which must satisfy profile->isValid() == true

The index of the optimization profile (starting from 0) if the input is valid, or -1 if the input is not valid.

Source

pub fn getNbOptimizationProfiles(self: &IBuilderConfig) -> i32

Get number of optimization profiles.

This is one higher than the index of the last optimization profile that has been defined (or zero, if none has been defined yet).

The number of the optimization profiles.
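The relationship between the index returned by addOptimizationProfile and the count returned by getNbOptimizationProfiles can be sketched with a plain vector. `DemoProfile` and `ProfileList` are hypothetical stand-ins for illustration; the real functions operate on `IOptimizationProfile` pointers.

```rust
/// Hypothetical stand-in for an optimization profile; `valid` plays the role of isValid().
struct DemoProfile {
    valid: bool,
}

/// Hypothetical model of the profile list kept by a builder config.
#[derive(Default)]
struct ProfileList {
    profiles: Vec<DemoProfile>,
}

impl ProfileList {
    /// Returns the zero-based index of the new profile, or -1 if it is invalid.
    fn add(&mut self, profile: DemoProfile) -> i32 {
        if !profile.valid {
            return -1;
        }
        self.profiles.push(profile);
        (self.profiles.len() - 1) as i32
    }

    /// One higher than the last valid index, or 0 if nothing was added.
    fn count(&self) -> i32 {
        self.profiles.len() as i32
    }
}

fn main() {
    let mut list = ProfileList::default();
    let first = list.add(DemoProfile { valid: true });
    let rejected = list.add(DemoProfile { valid: false });
    let second = list.add(DemoProfile { valid: true });
    println!("indices: {first}, {rejected}, {second}; count: {}", list.count());
}
```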

Source

pub fn setProfilingVerbosity( self: Pin<&mut IBuilderConfig>, verbosity: ProfilingVerbosity, )

Set verbosity level of layer information exposed in NVTX annotations and IEngineInspector.

Control how much layer information will be exposed in NVTX annotations and IEngineInspector.

See ProfilingVerbosity, getProfilingVerbosity(), IEngineInspector

Source

pub fn getProfilingVerbosity(self: &IBuilderConfig) -> ProfilingVerbosity

Get verbosity level of layer information exposed in NVTX annotations and IEngineInspector.

Get the current setting of verbosity level of layer information exposed in NVTX annotations and IEngineInspector. Default value is ProfilingVerbosity::kLAYER_NAMES_ONLY.

See ProfilingVerbosity, setProfilingVerbosity(), IEngineInspector

Source

pub fn setTacticSources( self: Pin<&mut IBuilderConfig>, tacticSources: u32, ) -> bool

Set tactic sources.

This bitset controls which tactic sources TensorRT is allowed to use for tactic selection.

Multiple tactic sources may be combined with a bitwise OR operation. For example, to enable edge mask convolutions and JIT convolutions as tactic sources, use a value of:

1U << static_cast<uint32_t>(TacticSource::kEDGE_MASK_CONVOLUTIONS) | 1U << static_cast<uint32_t>(TacticSource::kJIT_CONVOLUTIONS)

See [getTacticSources]

Returns true if the tactic sources in the build configuration were updated. The tactic sources are not updated if the provided value is invalid.
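The C++ expression above carries over to Rust as shifts on `u32`. The sketch below is illustrative only: the two numeric discriminants are placeholders (assumptions, not values taken from these bindings), so read the real ones from the `TacticSource` enum before building a mask for `setTacticSources`.

```rust
/// Builds a bitmask with one bit set per enum discriminant.
/// Each discriminant must be below 32, otherwise the shift overflows.
fn tactic_mask(discriminants: &[u32]) -> u32 {
    discriminants
        .iter()
        .fold(0u32, |mask, &d| mask | (1u32 << d))
}

fn main() {
    // Placeholder discriminants, assumed for illustration only.
    const EDGE_MASK_CONVOLUTIONS: u32 = 3;
    const JIT_CONVOLUTIONS: u32 = 4;

    let mask = tactic_mask(&[EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]);
    println!("tactic source mask: {mask:#b}");
}
```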

Source

pub fn getTacticSources(self: &IBuilderConfig) -> u32

Get tactic sources.

Get the tactic sources currently set in the engine build configuration.

See [setTacticSources()]

Returns the tactic sources as a bitmask.

Source

pub unsafe fn createTimingCache( self: &IBuilderConfig, blob: *const c_void, size: usize, ) -> *mut ITimingCache

Create a timing cache.

Create an ITimingCache instance from serialized raw data. The created timing cache does not belong to a specific IBuilderConfig and can be shared by multiple builder instances. Call setTimingCache() before launching a builder to attach the cache to a builder instance. The lifetime of the ITimingCache must exceed the lifetime of all builders that use it.

  • blob A pointer to the raw data that contains serialized timing cache
  • size The size in bytes of the serialized timing cache. Size 0 means create a new cache from scratch

See [setTimingCache]

Returns a pointer to the created ITimingCache.

Deprecated in TensorRT-RTX 1.2. Timing cache operations are no-ops in TensorRT-RTX.

Source

pub fn setTimingCache( self: Pin<&mut IBuilderConfig>, cache: &ITimingCache, ignoreMismatch: bool, ) -> bool

Attach a timing cache to the IBuilderConfig.

The timing cache has a verification header to make sure the provided cache can be used in the current environment. A failure is reported if the CUDA device properties in the provided cache differ from the current environment. Passing ignoreMismatch = true skips strict verification and allows loading a cache created on a different device.

The cache must not be destroyed until after the engine is built.

  • cache The timing cache to be used.
  • ignoreMismatch Whether to allow using a cache that contains different CUDA device properties.

Returns true if set successfully, false otherwise.

Using a cache generated on a device with different CUDA device properties may lead to functional or performance bugs.

Deprecated in TensorRT-RTX 1.2. Timing cache operations are no-ops in TensorRT-RTX.

Source

pub fn getTimingCache(self: &IBuilderConfig) -> *const ITimingCache

Get the pointer to the timing cache from the current IBuilderConfig.

Returns a pointer to the timing cache used in the current IBuilderConfig.

Deprecated in TensorRT-RTX 1.2. Timing cache operations are no-ops in TensorRT-RTX.

Source

pub fn setMemoryPoolLimit( self: Pin<&mut IBuilderConfig>, pool: MemoryPoolType, poolSize: usize, )

Set the memory size for the memory pool.

TensorRT layers access different memory pools depending on the operation. This function sets in the IBuilderConfig the size limit, specified by poolSize, for the corresponding memory pool, specified by pool. TensorRT will build a plan file that is constrained by these limits or report which constraint caused the failure.

If the size of the pool, specified by poolSize, fails to meet the size requirements for the pool, this function does nothing and emits the recoverable error, ErrorCode::kINVALID_ARGUMENT, to the registered IErrorRecorder.

If the size of the pool is larger than the maximum possible value for the configuration, this function does nothing and emits ErrorCode::kUNSUPPORTED_STATE.

If the pool does not exist on the requested device type when building the network, a warning is emitted to the logger, and the memory pool value is ignored.

Refer to MemoryPoolType to see the size requirements for each pool.

  • pool The memory pool to limit the available memory for.
  • poolSize The size of the pool in bytes.

See [getMemoryPoolLimit], MemoryPoolType
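The two rejection paths described above can be summarized as a small decision function. This is a rough sketch under stated assumptions: it models the pool's "size requirements" as a simple minimum, and `min_required` and `max_possible` are hypothetical inputs; the actual requirements are listed under MemoryPoolType and may be stricter.

```rust
/// Hypothetical outcome of a limit request; the error variants name the
/// ErrorCode values mentioned in the description.
#[derive(Debug, PartialEq)]
enum PoolLimitOutcome {
    /// The limit is stored in the configuration.
    Applied,
    /// Corresponds to ErrorCode::kINVALID_ARGUMENT; the limit is left unchanged.
    InvalidArgument,
    /// Corresponds to ErrorCode::kUNSUPPORTED_STATE; the limit is left unchanged.
    UnsupportedState,
}

/// Classifies a requested pool size.
/// Assumption: the pool's requirements reduce to `pool_size >= min_required`.
fn check_pool_size(pool_size: usize, min_required: usize, max_possible: usize) -> PoolLimitOutcome {
    if pool_size < min_required {
        PoolLimitOutcome::InvalidArgument
    } else if pool_size > max_possible {
        PoolLimitOutcome::UnsupportedState
    } else {
        PoolLimitOutcome::Applied
    }
}

fn main() {
    let one_gib: usize = 1 << 30;
    // Illustrative bounds only.
    println!("{:?}", check_pool_size(one_gib, 4096, 8 * one_gib));
    println!("{:?}", check_pool_size(16, 4096, 8 * one_gib));
}
```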

Source

pub fn getMemoryPoolLimit(self: &IBuilderConfig, pool: MemoryPoolType) -> usize

Get the memory size limit of the memory pool.

Retrieve the memory size limit of the corresponding pool in bytes. If setMemoryPoolLimit for the pool has not been called, this returns the default value used by TensorRT. This default value is not necessarily the maximum possible value for that configuration.

  • pool The memory pool to get the limit for.

  • Returns The size of the memory limit, in bytes, for the corresponding pool.

See [setMemoryPoolLimit]

Source

pub fn setPreviewFeature( self: Pin<&mut IBuilderConfig>, feature: PreviewFeature, enable: bool, )

Enable or disable a specific preview feature

Allows enabling or disabling experimental features, which are not enabled by default in the current release.

Refer to PreviewFeature for additional information, and a list of the available features.

  • feature the feature to enable / disable
  • enable true for enable, false for disable

See PreviewFeature, getPreviewFeature

Source

pub fn getPreviewFeature(self: &IBuilderConfig, feature: PreviewFeature) -> bool

Get status of preview feature

  • feature the feature to query

  • Returns true if the feature is enabled, false otherwise

See PreviewFeature, setPreviewFeature

Source

pub fn setBuilderOptimizationLevel(self: Pin<&mut IBuilderConfig>, level: i32)

Set builder optimization level

Set the builder optimization level. Setting a higher optimization level allows the optimizer to spend more time searching for optimization opportunities. The resulting engine may have better performance compared to an engine built with a lower optimization level.

The default optimization level is 3. Valid values include integers from 0 to the maximum optimization level, which is currently 5. Setting it to greater than the maximum level results in behavior identical to the maximum level.

Below are the descriptions about each builder optimization level:

  • Level 0: This enables the fastest compilation by disabling dynamic kernel generation and selecting the first tactic that succeeds in execution. This will also not respect a timing cache.

  • Level 1: Available tactics are sorted by heuristics, but only the top are tested to select the best. If a dynamic kernel is generated its compile optimization is low.

  • Level 2: Available tactics are sorted by heuristics, but only the fastest tactics are tested to select the best.

  • Level 3: Apply heuristics to see if a static precompiled kernel is applicable or if a new one has to be compiled dynamically.

  • Level 4: Always compiles a dynamic kernel.

  • Level 5: Always compiles a dynamic kernel and compares it to static kernels.

  • level The optimization level to set to. Must be non-negative.

See [getBuilderOptimizationLevel]
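The valid range and the clamping rule can be written as a small helper. This sketch only restates the rules in the description; `effective_level` is a hypothetical function, and treating a negative level as rejected is an assumption, since the text only says the level must be non-negative.

```rust
/// Maximum level according to the description above ("currently 5").
const MAX_LEVEL: i32 = 5;
/// Default level according to the description above.
const DEFAULT_LEVEL: i32 = 3;

/// Returns the level the builder would effectively use.
/// Values above the maximum behave like the maximum.
/// Assumption: negative input is rejected, modeled here as `None`.
fn effective_level(requested: i32) -> Option<i32> {
    if requested < 0 {
        None
    } else {
        Some(requested.min(MAX_LEVEL))
    }
}

fn main() {
    for requested in [-1, 0, DEFAULT_LEVEL, 9] {
        println!("requested {requested} -> {:?}", effective_level(requested));
    }
}
```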

Source

pub fn getBuilderOptimizationLevel(self: Pin<&mut IBuilderConfig>) -> i32

Get builder optimization level

  • Returns the current builder optimization level

See [setBuilderOptimizationLevel]

Source

pub fn setHardwareCompatibilityLevel( self: Pin<&mut IBuilderConfig>, hardwareCompatibilityLevel: HardwareCompatibilityLevel, )

Set the hardware compatibility level.

Hardware compatibility allows an engine to run on GPU architectures other than that of the GPU where the engine was built.

The default hardware compatibility level is HardwareCompatibilityLevel::kNONE.

  • hardwareCompatibilityLevel The level of hardware compatibility.
Source

pub fn getHardwareCompatibilityLevel( self: &IBuilderConfig, ) -> HardwareCompatibilityLevel

Get the hardware compatibility level.

Returns the level of hardware compatibility.

See [setHardwareCompatibilityLevel()]

Source

pub fn getPluginToSerialize(self: &IBuilderConfig, index: i32) -> *const c_char

Get the plugin library path to be serialized with version-compatible engines.

  • index Index of the plugin library path in the list. Should be in the range [0, getNbPluginsToSerialize()).

The path to the plugin library.

Source

pub fn getNbPluginsToSerialize(self: &IBuilderConfig) -> i32

Get the number of plugin library paths to be serialized with version-compatible engines.

The number of paths.

Source

pub fn setMaxAuxStreams(self: Pin<&mut IBuilderConfig>, nbStreams: i32) -> bool

Set the maximum number of auxiliary streams that TRT is allowed to use.

If the network contains operators that can run in parallel, TRT can execute them using auxiliary streams in addition to the one provided to the IExecutionContext::enqueueV3() call.

The default maximum number of auxiliary streams is determined by the heuristics in TensorRT on whether enabling multi-stream would improve the performance. This behavior can be overridden by calling this API to set the maximum number of auxiliary streams explicitly. Set this to 0 to enforce single-stream inference.

The resulting engine may use fewer auxiliary streams than the maximum if the network does not contain enough parallelism or if TensorRT determines that using more auxiliary streams does not help improve the performance.

Allowing more auxiliary streams does not always give better performance, since synchronization between streams adds overhead. Using CUDA graphs at runtime can help reduce the overhead caused by cross-stream synchronization.

Using more auxiliary streams leads to more memory usage at runtime, since some activation memory blocks cannot be reused.

  • nbStreams The maximum number of auxiliary streams that TRT is allowed to use. Must be non-negative.

true if the value was set successfully, false if nbStreams is negative.

See [getMaxAuxStreams()], ICudaEngine::getNbAuxStreams(), IExecutionContext::setAuxStreams()

Source

pub fn getMaxAuxStreams(self: &IBuilderConfig) -> i32

Get the maximum number of auxiliary streams that TRT is allowed to use.

See [setMaxAuxStreams()]

Source

pub unsafe fn setProgressMonitor( self: Pin<&mut IBuilderConfig>, monitor: *mut IProgressMonitor, )

Sets the progress monitor for building a network.

  • monitor The progress monitor to assign to the IBuilderConfig.

The progress monitor signals to the application when different phases of the compiler are being executed. Setting to nullptr unsets the monitor so that the application is not signaled.

See IBuilderConfig::getProgressMonitor

Source

pub fn getProgressMonitor(self: &IBuilderConfig) -> *mut IProgressMonitor

The progress monitor set by the application or nullptr.

See IBuilderConfig::setProgressMonitor

Source

pub fn setRuntimePlatform( self: Pin<&mut IBuilderConfig>, runtimePlatform: RuntimePlatform, )

Set the target platform for runtime execution.

Cross-platform compatibility allows an engine to be built and executed on different platforms.

The default cross-platform target is RuntimePlatform::kSAME_AS_BUILD.

  • runtimePlatform The target platform for runtime execution.

See IBuilderConfig::getRuntimePlatform()

Source

pub fn getRuntimePlatform(self: &IBuilderConfig) -> RuntimePlatform

Get the target platform for runtime execution.

The target platform for runtime execution.

See IBuilderConfig::setRuntimePlatform()

Source

pub fn setMaxNbTactics(self: Pin<&mut IBuilderConfig>, maxNbTactics: i32)

Set the maximum number of tactics to time when there is a choice of tactics.

This function controls the number of tactics timed when there are multiple tactics to choose from.

See [getMaxNbTactics()]

Source

pub fn getMaxNbTactics(self: &IBuilderConfig) -> i32

Query the maximum number of tactics timed when there is a choice.

By default the value is -1, indicating TensorRT can determine the number of tactics based on its own heuristic.

See [setMaxNbTactics()]

Source

pub fn setTilingOptimizationLevel( self: Pin<&mut IBuilderConfig>, level: TilingOptimizationLevel, ) -> bool

Set the Tiling optimization level.

Tiling allows TensorRT to try an on-chip caching strategy.

The default tiling optimization level is TilingOptimizationLevel::kNONE.

  • level The level of Tiling optimization.

True if successful, false otherwise

Source

pub fn getTilingOptimizationLevel( self: &IBuilderConfig, ) -> TilingOptimizationLevel

Get the Tiling optimization level.

Returns the level of tiling optimization.

See [setTilingOptimizationLevel()]

Source

pub fn setL2LimitForTiling(self: Pin<&mut IBuilderConfig>, size: i64) -> bool

Set the L2 cache usage limit for Tiling optimization.

Parameter for tiling optimization. This API only takes effect when TilingOptimizationLevel is not kNONE. If setL2LimitForTiling() has not been called, TensorRT chooses a default value between 0 and the L2 cache capacity.

  • size The size of the L2 cache usage limit for Tiling optimization.

True if successful, false otherwise

Source

pub fn getL2LimitForTiling(self: &IBuilderConfig) -> i64

Get the L2 cache usage limit for tiling optimization.

L2 cache usage limit for tiling optimization.

See [setL2LimitForTiling()]

Source

pub fn setNbComputeCapabilities( self: Pin<&mut IBuilderConfig>, maxNbComputeCapabilities: i32, ) -> bool

Set the number of compute capabilities

The default number of compute capabilities is 0. If the new number is less than the current number, setNbComputeCapabilities() removes the elements beyond the new number without modifying the remaining elements. If the new number is greater than the current number, new elements are added to the end of the vector and set to ComputeCapability::kNONE.

  • maxNbComputeCapabilities The maximum number of compute capabilities to set.

True if successful, false otherwise

See IBuilderConfig::getNbComputeCapabilities()
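The shrink and grow behavior described above matches what `Vec::resize` does in Rust. A minimal sketch, with `DemoCapability` as a hypothetical stand-in for `ComputeCapability` and `Sm(89)` as an arbitrary illustrative value:

```rust
/// Hypothetical stand-in for `ComputeCapability`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DemoCapability {
    None,
    Sm(u32),
}

/// Models setNbComputeCapabilities: truncates, or pads with `None`.
fn set_nb(caps: &mut Vec<DemoCapability>, count: usize) {
    caps.resize(count, DemoCapability::None);
}

fn main() {
    let mut caps: Vec<DemoCapability> = Vec::new(); // default count is 0

    set_nb(&mut caps, 2);
    caps[0] = DemoCapability::Sm(89); // models setComputeCapability at index 0

    set_nb(&mut caps, 3); // grows: existing entries are kept, new ones are None
    println!("{caps:?}");

    set_nb(&mut caps, 1); // shrinks: entries beyond the new count are removed
    println!("{caps:?}");
}
```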

Source

pub fn getNbComputeCapabilities(self: &IBuilderConfig) -> i32

Get the number of compute capabilities

The number of compute capabilities.

See IBuilderConfig::setNbComputeCapabilities()

Source

pub fn setComputeCapability( self: Pin<&mut IBuilderConfig>, computeCapability: ComputeCapability, index: i32, ) -> bool

Set one compute capability for runtime execution.

If computeCapability is ComputeCapability::kCURRENT, the index must be 0, and the number of compute capabilities must be 1.

  • computeCapability The compute capability to set.
  • index The index at which to set the compute capability.

True if successful, false otherwise

See IBuilderConfig::getComputeCapability()

Source

pub fn getComputeCapability( self: &IBuilderConfig, index: i32, ) -> ComputeCapability

Get one compute capability for runtime execution.

  • index The index of the compute capability to get.

The compute capability at the specified index.

See IBuilderConfig::setComputeCapability()

Source

pub fn setAvgTimingIterations(self: Pin<&mut IBuilderConfig>, avgTiming: i32)

Set the number of averaging iterations used when timing layers.

When timing layers, the builder minimizes over a set of average times for layer execution. This parameter controls the number of iterations used in averaging.

See [getAvgTimingIterations()]
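"Minimizing over a set of average times" can be illustrated with a toy selection routine: average a fixed number of timing samples per candidate, then keep the candidate with the smallest average. This is a sketch of the idea only; `pick_fastest` and the sample data are hypothetical, and how the builder actually collects timings is not described here.

```rust
/// Averages the first `iterations` samples of each candidate and returns the
/// index of the candidate with the smallest average. Candidates without
/// samples are skipped; returns `None` if nothing could be averaged.
fn pick_fastest(samples_ms: &[Vec<f64>], iterations: usize) -> Option<usize> {
    samples_ms
        .iter()
        .enumerate()
        .filter_map(|(index, samples)| {
            let n = iterations.min(samples.len());
            if n == 0 {
                return None;
            }
            let average = samples[..n].iter().sum::<f64>() / n as f64;
            Some((index, average))
        })
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(index, _)| index)
}

fn main() {
    // Candidate 0 has one lucky fast sample; candidate 1 is steadier.
    let samples = vec![vec![1.0, 9.0], vec![4.0, 4.0]];
    println!("1 iteration:  {:?}", pick_fastest(&samples, 1));
    println!("2 iterations: {:?}", pick_fastest(&samples, 2));
}
```

With a single iteration the noisy candidate wins on its one fast sample; averaging over two iterations reverses the choice, which is the kind of stability more averaging iterations are meant to buy at the cost of longer build time.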

Trait Implementations§

Source§

impl Drop for IBuilderConfig

Source§

fn drop(self: &mut IBuilderConfig)

Executes the destructor for this type.
Source§

fn pin_drop(self: Pin<&mut Self>)

🔬This is a nightly-only experimental API. (pin_ergonomics)
Execute the destructor for this type, but different to Drop::drop, it requires self to be pinned.
Source§

impl ExternType for IBuilderConfig

Source§

type Id = (n, v, i, n, f, e, r, _1, (), I, B, u, i, l, d, e, r, C, o, n, f, i, g)

A type-level representation of the type’s C++ namespace and type name.
Source§

type Kind = Opaque

Source§

impl MakeCppStorage for IBuilderConfig

Source§

unsafe fn allocate_uninitialized_cpp_storage() -> *mut IBuilderConfig

Allocates heap space for this type in C++ and returns a pointer to that space, but does not initialize that space (i.e. does not yet call a constructor).
Source§

unsafe fn free_uninitialized_cpp_storage(arg0: *mut IBuilderConfig)

Frees a C++ allocation which has not yet had a constructor called.
Source§

impl SharedPtrTarget for IBuilderConfig

Source§

impl UniquePtrTarget for IBuilderConfig

Source§

impl WeakPtrTarget for IBuilderConfig

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.