#[repr(C)]
pub struct LiteOptions {
pub weight_preprocess: c_int,
pub fuse_preprocess: c_int,
pub fake_next_exec: c_int,
pub var_sanity_check_first_run: c_int,
pub const_shape: c_int,
pub force_dynamic_alloc: c_int,
pub force_output_dynamic_alloc: c_int,
pub force_output_use_user_specified_memory: c_int,
pub no_profiling_on_shape_change: c_int,
pub jit_level: c_int,
pub comp_node_seq_record_level: c_int,
pub graph_opt_level: c_int,
pub async_exec_level: c_int,
    // layout transform options
    pub enable_nchw44: c_int,
pub enable_nchw44_dot: c_int,
pub enable_nchw88: c_int,
pub enable_nhwcd4: c_int,
pub enable_nchw4: c_int,
pub enable_nchw32: c_int,
pub enable_nchw64: c_int,
}
\brief the inference options which will be translated to MegEngine
\param weight_preprocess optimize inference performance by preprocessing the const weights
\param fuse_preprocess fuse the preprocess pattern, like astype + pad_channel + dimshuffle
\param fake_next_exec whether to perform only non-computing tasks (like memory allocation and queue initialization) for the next exec. This is reset to false when the graph is executed.
\param var_sanity_check_first_run disable var sanity check on the first run. Var sanity check is enabled on the first-time execution by default, and can be used to find potential memory access errors in operator implementations.
\param const_shape treat all input shapes as constant. This can be used to reduce memory usage, since some static inference data structures can be omitted.
\param force_dynamic_alloc force dynamic memory allocation for all vars
\param force_output_dynamic_alloc force dynamic memory allocation for output vars that are used as CallbackCaller inputs when calling the compile() function
\param no_profiling_on_shape_change do not re-profile to select the best impl algo when an input shape changes (reuse the previous algo)
\param jit_level execute supported operators with JIT (MLIR and NVRTC backends are supported). Can only be used on NVIDIA GPUs; the value indicates the JIT level: 1 for basic elemwise operators, 2 to also include the reduce operator
\param comp_node_seq_record_level optimize inference performance by recording the kernel tasks of the first run; subsequent inferences then replay the recorded tasks. level = 0: normal inference; level = 1: use record inference; level = 2: record inference and free the extra memory
\param graph_opt_level optimization level: 0: disable; 1: level-1, inplace arithmetic transformations during graph construction; 2: level-2, level-1 plus global optimization before graph compiling; 3: also enable JIT; <0: corresponding level, with result check for debugging
\param async_exec_level dispatch execution on separate threads for different comp nodes. 0: no async dispatch; 1: async dispatch if there is more than one comp node, with a limited queue; mask 0b10: async if there are multiple comp nodes; mask 0b100: always async
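As a sketch of how these options might be populated from Rust (the struct layout mirrors the definition above; the zero-initialized baseline and the specific values chosen here are illustrative assumptions, not the library's documented defaults):

```rust
use std::mem;
use std::os::raw::c_int;

// Local copy of the FFI struct from the definition above, so the
// sketch is self-contained. In real code you would use the crate's
// own `LiteOptions` type instead.
#[repr(C)]
#[derive(Clone, Copy, Default)]
struct LiteOptions {
    weight_preprocess: c_int,
    fuse_preprocess: c_int,
    fake_next_exec: c_int,
    var_sanity_check_first_run: c_int,
    const_shape: c_int,
    force_dynamic_alloc: c_int,
    force_output_dynamic_alloc: c_int,
    force_output_use_user_specified_memory: c_int,
    no_profiling_on_shape_change: c_int,
    jit_level: c_int,
    comp_node_seq_record_level: c_int,
    graph_opt_level: c_int,
    async_exec_level: c_int,
    enable_nchw44: c_int,
    enable_nchw44_dot: c_int,
    enable_nchw88: c_int,
    enable_nhwcd4: c_int,
    enable_nchw4: c_int,
    enable_nchw32: c_int,
    enable_nchw64: c_int,
}

fn main() {
    // Start from an all-zero struct (every option off), then opt in.
    // The fields are C-style booleans/levels, so plain integers are used.
    let mut opts = LiteOptions::default();
    opts.weight_preprocess = 1;          // preprocess const weights once
    opts.graph_opt_level = 2;            // level-1 plus global optimization
    opts.comp_node_seq_record_level = 1; // replay kernel tasks after first run
    opts.async_exec_level = 0b10;        // async when there are multiple comp nodes

    // #[repr(C)] with 20 equally-sized integer fields: no padding expected.
    assert_eq!(mem::size_of::<LiteOptions>(), 20 * mem::size_of::<c_int>());
    println!("graph_opt_level = {}", opts.graph_opt_level);
}
```

Because the struct is `#[repr(C)]`, a value built this way can be passed across the FFI boundary by value or by pointer without further conversion.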
Trait Implementations
impl Clone for LiteOptions

fn clone(&self) -> LiteOptions

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from `source`.