Struct spirv_cross2::compile::msl::CompilerOptions
#[non_exhaustive]
pub struct CompilerOptions {
pub common: CommonOptions,
pub version: MslVersion,
pub texel_buffer_texture_width: u32,
pub swizzle_buffer_index: u32,
pub indirect_params_buffer_index: u32,
pub shader_output_buffer_index: u32,
pub shader_patch_output_buffer_index: u32,
pub shader_tess_factor_output_buffer_index: u32,
pub buffer_size_buffer_index: u32,
pub view_mask_buffer_index: u32,
pub dynamic_offsets_buffer_index: u32,
pub shader_input_buffer_index: u32,
pub shader_index_buffer_index: u32,
pub shader_patch_input_buffer_index: u32,
pub shader_input_workgroup_index: u32,
pub enable_point_size_builtin: bool,
pub enable_frag_depth_builtin: bool,
pub enable_frag_stencil_ref_builtin: bool,
pub disable_rasterization: bool,
pub capture_output_to_buffer: bool,
pub swizzle_texture_samples: bool,
pub pad_fragment_output_components: bool,
pub tess_domain_origin_lower_left: bool,
pub platform: MetalPlatform,
pub argument_buffers: bool,
pub argument_buffers_tier: ArgumentBuffersTier,
pub texture_buffer_native: bool,
pub multiview: bool,
pub multiview_layered_rendering: bool,
pub device_index: u32,
pub view_index_from_device_index: bool,
pub dispatch_base: bool,
pub texture_1d_as_2d: bool,
pub enable_base_index_zero: bool,
pub framebuffer_fetch_subpass: bool,
pub invariant_fp_math: bool,
pub emulate_cubemap_array: bool,
pub enable_decoration_binding: bool,
pub force_active_argument_buffer_resources: bool,
pub force_native_arrays: bool,
pub enable_frag_output_mask: u32,
pub enable_clip_distance_user_varying: bool,
pub multi_patch_workgroup: bool,
pub vertex_for_tessellation: bool,
pub vertex_index_type: IndexType,
pub arrayed_subpass_input: bool,
pub r32ui_linear_texture_alignment: u32,
pub r32ui_alignment_constant_id: u32,
pub ios_use_simdgroup_functions: bool,
pub emulate_subgroups: bool,
pub fixed_subgroup_size: u32,
pub force_sample_rate_shading: bool,
pub ios_support_base_vertex_instance: bool,
pub raw_buffer_tese_input: bool,
pub manual_helper_invocation_updates: bool,
pub check_discarded_frag_stores: bool,
pub sample_dref_lod_array_as_grad: bool,
pub readwrite_texture_fences: bool,
pub replace_recursive_inputs: bool,
pub agx_manual_cube_grad_fixup: bool,
pub force_fragment_with_side_effects_execution: bool,
}
Available on crate feature msl only.
MSL compiler options.
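Fields (Non-exhaustive)
This struct is marked as non-exhaustive. Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
common: CommonOptions
Compile options common to GLSL, HLSL, and MSL.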
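Because the struct is non-exhaustive, code outside the defining crate cannot use a struct literal or struct update syntax. A minimal sketch of the usual workaround follows: start from a default value and assign public fields one at a time. `OptionsSketch` is a hypothetical stand-in with three of the fields, and the sketch assumes the real `CompilerOptions` offers a default value in the same way; the `false` defaults shown are placeholders, not documented values.

```rust
// Hypothetical stand-in for a few fields of `CompilerOptions`.
// The real struct has many more fields.
#[derive(Debug, Clone, PartialEq)]
#[non_exhaustive]
pub struct OptionsSketch {
    pub swizzle_buffer_index: u32,
    pub argument_buffers: bool,
    pub multiview: bool,
}

impl Default for OptionsSketch {
    fn default() -> Self {
        Self {
            swizzle_buffer_index: 30, // default documented for this field
            argument_buffers: false,  // placeholder default, illustration only
            multiview: false,         // placeholder default, illustration only
        }
    }
}

/// Start from the defaults, then overwrite individual public fields.
/// Unlike a struct literal, this pattern keeps compiling when new
/// fields are added to a non-exhaustive struct.
pub fn options_with_argument_buffers() -> OptionsSketch {
    let mut opts = OptionsSketch::default();
    opts.argument_buffers = true;
    opts
}
```

Within a single crate the `#[non_exhaustive]` attribute has no effect, so the sketch only illustrates the shape of the calling code.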
version: MslVersion
The MSL version to compile to.
Defaults to MSL 1.2.
texel_buffer_texture_width: u32
Width of 2D Metal textures used as 1D texel buffers.
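To illustrate what storing a 1D texel buffer in a 2D texture implies, here is a small sketch of the index mapping. It assumes a row-major layout in which a linear texel index wraps to the next row every `width` texels; the function name is hypothetical, and 4096 in the usage note is only an example width.

```rust
/// Map a linear texel index to an (x, y) coordinate in a 2D texture
/// that is `width` texels wide, assuming row-major layout.
/// Returns `None` when `width` is zero.
pub fn texel_buffer_coord(index: u32, width: u32) -> Option<(u32, u32)> {
    if width == 0 {
        return None;
    }
    Some((index % width, index / width))
}
```

With a width of 4096, index 4097 lands on the second texel of the second row, that is `(1, 1)`.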
swizzle_buffer_index: u32
Index of the swizzle buffer.
The default is 30.
indirect_params_buffer_index: u32
Index of the indirect params buffer.
The default is 29.
shader_output_buffer_index: u32
Index of the shader output buffer.
The default is 28.
shader_patch_output_buffer_index: u32
Index of the shader patch output buffer.
The default is 27.
shader_tess_factor_output_buffer_index: u32
Index of the shader tessellation factor output buffer.
The default is 26.
buffer_size_buffer_index: u32
Index of the buffer size buffer.
The default is 25.
view_mask_buffer_index: u32
Index of the view mask buffer.
The default is 24.
dynamic_offsets_buffer_index: u32
Index of the dynamic offsets buffer.
The default is 23.
shader_input_buffer_index: u32
Index of the shader input buffer.
The default is 22.
shader_index_buffer_index: u32
Index of the shader index buffer.
The default is 21.
shader_patch_input_buffer_index: u32
Index of the shader patch input buffer.
The default is 20.
shader_input_workgroup_index: u32
Index of the input workgroup index buffer.
The default is 0.
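The default indices above occupy buffer slots 20 through 30, so application buffers bound at those slots may collide with the auxiliary buffers. The following sketch tabulates the documented defaults and looks up which option, if any, claims a given slot. The constant and function names are hypothetical, and `shader_input_workgroup_index` is left out on the assumption that it indexes threadgroup memory rather than a buffer slot.

```rust
/// Documented default buffer indices, paired with the option that sets them.
pub const DEFAULT_RESERVED_BUFFER_INDICES: [(&'static str, u32); 11] = [
    ("swizzle_buffer_index", 30),
    ("indirect_params_buffer_index", 29),
    ("shader_output_buffer_index", 28),
    ("shader_patch_output_buffer_index", 27),
    ("shader_tess_factor_output_buffer_index", 26),
    ("buffer_size_buffer_index", 25),
    ("view_mask_buffer_index", 24),
    ("dynamic_offsets_buffer_index", 23),
    ("shader_input_buffer_index", 22),
    ("shader_index_buffer_index", 21),
    ("shader_patch_input_buffer_index", 20),
];

/// Return the name of the option whose default claims `index`, if any.
pub fn reserved_by(index: u32) -> Option<&'static str> {
    DEFAULT_RESERVED_BUFFER_INDICES
        .iter()
        .find(|&&(_, reserved)| reserved == index)
        .map(|&(name, _)| name)
}
```

A binding layer could call `reserved_by` before assigning a slot and either pick another slot or override the corresponding option.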
enable_point_size_builtin: bool
Enable the point_size builtin.
enable_frag_depth_builtin: bool
Enable the FragDepth builtin.
Disable if the pipeline does not enable depth, as pipeline creation might otherwise fail.
enable_frag_stencil_ref_builtin: bool
Enable the FragStencilRef output.
Disable if the pipeline does not enable stencil output, as pipeline creation might otherwise fail.
disable_rasterization: bool
capture_output_to_buffer: bool
Writes geometry varyings to a buffer instead of as stage outputs.
swizzle_texture_samples: bool
Works around lack of support for VkImageView component swizzles. Recent Metal versions do not require this workaround. This has a massive impact on performance and bloat.
Do not use this unless you are absolutely forced to.
To use this feature, the API side must pass down swizzle buffers. It should only be used by translation layers as a last resort.
pad_fragment_output_components: bool
Always emit color outputs as 4-component variables.
In Metal, the fragment shader must emit at least as many components as the render target format.
tess_domain_origin_lower_left: bool
Use a lower-left tessellation domain.
platform: MetalPlatform
The platform to output MSL for. Defaults to macOS.
argument_buffers: bool
Enable use of Metal argument buffers.
MSL 2.0 or higher must be used.
argument_buffers_tier: ArgumentBuffersTier
Defines Metal argument buffer tier levels.
Uses the same values as the Metal MTLArgumentBuffersTier enumeration.
texture_buffer_native: bool
Use the native support for texel buffers. Requires MSL 2.1.
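Two of the options above state a minimum MSL version: argument buffers need MSL 2.0 and native texel buffers need MSL 2.1. A caller might check these before compiling; the sketch below does so with plain `(major, minor)` tuples. The function and type alias are hypothetical and do not use the crate's `MslVersion` type.

```rust
/// MSL version as (major, minor). Tuples compare lexicographically,
/// so `(1, 2) < (2, 0)` holds.
pub type Version = (u32, u32);

/// Check the version requirements stated for `argument_buffers`
/// and `texture_buffer_native`.
pub fn check_version_requirements(
    version: Version,
    argument_buffers: bool,
    texture_buffer_native: bool,
) -> Result<(), String> {
    if argument_buffers && version < (2, 0) {
        return Err(format!(
            "argument_buffers needs MSL 2.0 or higher, got {}.{}",
            version.0, version.1
        ));
    }
    if texture_buffer_native && version < (2, 1) {
        return Err(format!(
            "texture_buffer_native needs MSL 2.1 or higher, got {}.{}",
            version.0, version.1
        ));
    }
    Ok(())
}
```

With the documented default of MSL 1.2, enabling either option fails this check until the version is raised.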
multiview: bool
Enable SPV_KHR_multiview emulation.
multiview_layered_rendering: bool
If disabled, don’t set [[render_target_array_index]] in multiview shaders.
Useful for devices which don’t support layered rendering.
Only effective when CompilerOptions::multiview is enabled.
device_index: u32
The index of the device.
view_index_from_device_index: bool
Treat the view index as the device index instead. For multi-GPU rendering.
dispatch_base: bool
Add support for vkCmdDispatchBase() or similar APIs.
Offsets the workgroup ID based on a buffer.
texture_1d_as_2d: bool
Emit Image variables of dimension Dim1D as texture2d.
In Metal, 1D textures do not support all features that 2D textures do.
Use this option if your code relies on these features.
enable_base_index_zero: bool
Ensures vertex and instance indices start at zero.
This reflects the behavior of HLSL with SV_VertexID and SV_InstanceID.
framebuffer_fetch_subpass: bool
Use Metal’s native framebuffer fetch API for subpass inputs.
invariant_fp_math: bool
Enables use of the “fma” intrinsic for invariant float math.
emulate_cubemap_array: bool
Emulate texturecube_array with texture2d_array for iOS, where this type is not available.
enable_decoration_binding: bool
Allow the user to enable decoration binding.
force_active_argument_buffer_resources: bool
Forces all resources which are part of an argument buffer to be considered active.
This ensures ABI compatibility between shaders where some resources might be unused, and would otherwise declare a different ABI.
force_native_arrays: bool
Forces the use of plain arrays, which works around certain driver bugs on certain versions of Intel MacBooks.
See https://github.com/KhronosGroup/SPIRV-Cross/issues/1210. May reduce performance in scenarios where arrays are copied around as value types.
enable_frag_output_mask: u32
Only selectively enable fragment outputs.
Useful if the pipeline does not enable fragment output for certain locations, as pipeline creation might otherwise fail.
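As an illustration of how such a mask could be built, the sketch below assumes that bit N of `enable_frag_output_mask` enables the fragment output at location N. That bit layout is an assumption here, since the text above does not spell it out, and the helper name is hypothetical.

```rust
/// Build a 32-bit mask with one bit set per enabled output location.
/// Assumption: bit N corresponds to fragment output location N.
/// Locations of 32 or more cannot be represented and are skipped.
pub fn frag_output_mask(locations: &[u32]) -> u32 {
    locations
        .iter()
        .filter(|&&loc| loc < 32)
        .fold(0u32, |mask, &loc| mask | (1u32 << loc))
}
```

For a pipeline that writes only color attachments 0 and 2, `frag_output_mask(&[0, 2])` yields `0b101`.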
enable_clip_distance_user_varying: bool
If a shader writes clip distance, also emit user varyings which can be read in subsequent stages.
multi_patch_workgroup: bool
In a tessellation control shader, assume that more than one patch can be processed in a single workgroup. This requires changes to the way the InvocationId and PrimitiveId builtins are processed, but should result in more efficient usage of the GPU.
vertex_for_tessellation: bool
If set, a vertex shader will be compiled as part of a tessellation pipeline. It will be translated as a compute kernel, so it can use the global invocation ID to index the output buffer.
vertex_index_type: IndexType
The type of index in the index buffer, if present. For a compute shader, Metal requires specifying the indexing at pipeline creation, rather than at draw time as with graphics pipelines. This means we must create three different pipelines, for no indexing, 16-bit indices, and 32-bit indices. Each requires different handling for the gl_VertexIndex builtin. We may as well, then, create three different shaders for these three scenarios.
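The three scenarios described for `vertex_index_type` can be enumerated as below. `IndexKind` is a hypothetical local enum and not the crate's `IndexType`, whose variant names are not shown on this page; the label format is likewise only illustrative.

```rust
/// Hypothetical stand-in for the three indexing scenarios.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum IndexKind {
    None,
    U16,
    U32,
}

impl IndexKind {
    /// All scenarios that need their own shader and pipeline.
    pub const ALL: [IndexKind; 3] = [IndexKind::None, IndexKind::U16, IndexKind::U32];

    /// Size of one index in bytes, or `None` for non-indexed draws.
    pub fn index_size_bytes(self) -> Option<u32> {
        match self {
            IndexKind::None => None,
            IndexKind::U16 => Some(2),
            IndexKind::U32 => Some(4),
        }
    }
}

/// One label per shader variant, e.g. for use as a pipeline cache key.
pub fn variant_labels(entry_point: &str) -> Vec<String> {
    IndexKind::ALL
        .iter()
        .map(|kind| format!("{}_{:?}", entry_point, kind))
        .collect()
}
```

A caller would compile the vertex shader once per element of `IndexKind::ALL`, setting the real `vertex_index_type` option accordingly each time.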
arrayed_subpass_input: bool
Assume that SubpassData images have multiple layers. Layered input attachments are addressed relative to the Layer output from the vertex pipeline. This option has no effect with multiview, since all input attachments are assumed to be layered and will be addressed using the current ViewIndex.
r32ui_linear_texture_alignment: u32
The required alignment of linear textures of format MTLPixelFormatR32Uint.
This is used to align the row stride for atomic accesses to such images.
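To make the row-stride alignment concrete, here is a sketch that rounds the byte length of one row up to a multiple of the alignment. It assumes the alignment is expressed in bytes and uses the fact that an R32Uint texel occupies 4 bytes; the function name is hypothetical.

```rust
/// Round the byte size of one row of R32Uint texels up to a multiple
/// of `alignment`. Assumption: `alignment` is given in bytes.
/// Returns `None` on a zero alignment or arithmetic overflow.
pub fn aligned_row_stride(width_texels: u32, alignment: u32) -> Option<u32> {
    if alignment == 0 {
        return None;
    }
    // Each R32Uint texel is 4 bytes wide.
    let row_bytes = width_texels.checked_mul(4)?;
    let remainder = row_bytes % alignment;
    if remainder == 0 {
        Some(row_bytes)
    } else {
        row_bytes.checked_add(alignment - remainder)
    }
}
```

A 10-texel row is 40 bytes, which a 16-byte alignment pads to 48 bytes.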
r32ui_alignment_constant_id: u32
The function constant ID to use for the linear texture alignment.
On MSL 1.2 or later, you can override the alignment by setting this function constant.
ios_use_simdgroup_functions: bool
Whether to use SIMD-group or quadgroup functions to implement group non-uniform operations. Some GPUs on iOS do not support the SIMD-group functions, only the quadgroup functions.
emulate_subgroups: bool
If set, the subgroup size will be assumed to be one, and subgroup-related builtins and operations will be emitted accordingly.
This mode is intended to be used by MoltenVK on hardware/software configurations which do not provide sufficient support for subgroups.
fixed_subgroup_size: u32
If nonzero, a fixed subgroup size to assume. Metal, similarly to VK_EXT_subgroup_size_control, allows the SIMD-group size (aka thread execution width) to vary depending on register usage and requirements.
In certain circumstances (for example, a pipeline in MoltenVK without VK_PIPELINE_SHADER_STAGE_CREATE_ALLOW_VARYING_SUBGROUP_SIZE_BIT_EXT) this is undesirable. This option fixes the value of the SubgroupSize builtin instead of mapping it to the Metal builtin [[thread_execution_width]]. If the thread execution width is reduced, the extra invocations will appear to be inactive.
If zero, the SubgroupSize will be allowed to vary, and the builtin will be mapped to the Metal [[thread_execution_width]] builtin.
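The zero versus nonzero behavior of `fixed_subgroup_size` can be summarized in a small decision function. This is only a restatement of the description above in code; the enum and function are hypothetical and do not model any interaction with `emulate_subgroups`.

```rust
/// Where the value of the SubgroupSize builtin comes from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SubgroupSizeSource {
    /// A compile-time constant chosen by the caller.
    Fixed(u32),
    /// The Metal [[thread_execution_width]] builtin, which may vary.
    ThreadExecutionWidth,
}

/// Interpret the `fixed_subgroup_size` option: zero means "allow the
/// size to vary", any other value pins SubgroupSize to that constant.
pub fn subgroup_size_source(fixed_subgroup_size: u32) -> SubgroupSizeSource {
    if fixed_subgroup_size == 0 {
        SubgroupSizeSource::ThreadExecutionWidth
    } else {
        SubgroupSizeSource::Fixed(fixed_subgroup_size)
    }
}
```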
force_sample_rate_shading: bool
If set, a dummy [[sample_id]] input is added to a fragment shader if none is present.
This will force the shader to run at sample rate, assuming Metal does not optimize the extra threads away.
ios_support_base_vertex_instance: bool
Specifies whether the iOS target version supports the [[base_vertex]] and [[base_instance]] attributes.
raw_buffer_tese_input: bool
Use storage buffers instead of vertex-style attributes for tessellation evaluation input.
This may require conversion of inputs in the generated post-tessellation vertex shader, but allows the use of nested arrays.
manual_helper_invocation_updates: bool
If set, gl_HelperInvocation will be set manually whenever a fragment is discarded.
Some Metal devices have a bug where simd_is_helper_thread() does not return true after a fragment has been discarded.
This is a workaround that is only expected to be needed until the bug is fixed in Metal; it is provided as an option to allow disabling it when that occurs.
check_discarded_frag_stores: bool
If set, extra checks will be emitted in fragment shaders to prevent writes from discarded fragments. Some Metal devices have a bug where writes to storage resources from discarded fragment threads continue to occur, despite the fragment being discarded.
This is a workaround that is only expected to be needed until the bug is fixed in Metal; it is provided as an option so it can be enabled only when the bug is present.
sample_dref_lod_array_as_grad: bool
If set, Lod operands to OpImageSample*DrefExplicitLod for 1D and 2D array images will be implemented using a gradient instead of passing the level operand directly.
Some Metal devices have a bug where the level() argument to depth2d_array<T>::sample_compare() in a fragment shader is biased by some unknown amount, possibly dependent on the partial derivatives of the texture coordinates.
This is a workaround that is only expected to be needed until the bug is fixed in Metal; it is provided as an option so it can be enabled only when the bug is present.
readwrite_texture_fences: bool
MSL doesn’t guarantee coherence between writes and subsequent reads of read_write textures. This inserts fences before each read of a read_write texture to ensure coherency. If you’re sure you never rely on this, you can set this to false for a possible performance improvement. Note: Only Apple’s GPU compiler takes advantage of the lack of coherency, so make sure to test on Apple GPUs if you disable this.
replace_recursive_inputs: bool
Metal 3.1 introduced a regression which causes infinite recursion during Metal’s analysis of an entry point input structure that is itself recursive. Enabling this option will replace the recursive input declaration with an alternate variable of type void*, and then cast to the correct type at the top of the entry point function. The bug has been reported to Apple, and will hopefully be fixed in future releases.
agx_manual_cube_grad_fixup: bool
If set, manual fixups of gradient vectors for cube texture lookups will be performed. All released Apple Silicon GPUs to date behave incorrectly when sampling a cube texture with explicit gradients. They will ignore one of the three partial derivatives based on the selected major axis, and expect the remaining derivatives to be partially transformed.
force_fragment_with_side_effects_execution: bool
Under certain circumstances, Metal will prematurely discard fragments that have side effects.
Example: CTS test dEQP-VK.fragment_operations.early_fragment.discard_no_early_fragment_tests_depth
The test renders a full-screen quad with varying depth [0,1] for each fragment. Each fragment performs an operation with side effects, modifies the depth value, and discards the fragment. The test expects the fragment to be run due to:
https://registry.khronos.org/vulkan/specs/1.0-extensions/html/vkspec.html#fragops-shader-depthreplacement
which states that the fragment shader must be run because it replaces the depth in the shader.
However, Metal may prematurely discard fragments without executing them (I believe this to be due to a greedy optimization on their end), making the test fail.
This option enforces fragment execution in such cases, where the fragment has operations with side effects. It is provided as an option in the hope that Metal will fix this issue in the future.