Struct CompilerOptions

#[non_exhaustive]
pub struct CompilerOptions {
    pub common: CommonOptions,
    pub version: MslVersion,
    pub texel_buffer_texture_width: u32,
    pub swizzle_buffer_index: u32,
    pub indirect_params_buffer_index: u32,
    pub shader_output_buffer_index: u32,
    pub shader_patch_output_buffer_index: u32,
    pub shader_tess_factor_output_buffer_index: u32,
    pub buffer_size_buffer_index: u32,
    pub view_mask_buffer_index: u32,
    pub dynamic_offsets_buffer_index: u32,
    pub shader_input_buffer_index: u32,
    pub shader_index_buffer_index: u32,
    pub shader_patch_input_buffer_index: u32,
    pub shader_input_workgroup_index: u32,
    pub enable_point_size_builtin: bool,
    pub enable_frag_depth_builtin: bool,
    pub enable_frag_stencil_ref_builtin: bool,
    pub disable_rasterization: bool,
    pub capture_output_to_buffer: bool,
    pub swizzle_texture_samples: bool,
    pub pad_fragment_output_components: bool,
    pub tess_domain_origin_lower_left: bool,
    pub platform: MetalPlatform,
    pub argument_buffers: bool,
    pub argument_buffers_tier: ArgumentBuffersTier,
    pub texture_buffer_native: bool,
    pub multiview: bool,
    pub multiview_layered_rendering: bool,
    pub device_index: u32,
    pub view_index_from_device_index: bool,
    pub dispatch_base: bool,
    pub texture_1d_as_2d: bool,
    pub enable_base_index_zero: bool,
    pub framebuffer_fetch_subpass: bool,
    pub invariant_fp_math: bool,
    pub emulate_cubemap_array: bool,
    pub enable_decoration_binding: bool,
    pub force_active_argument_buffer_resources: bool,
    pub force_native_arrays: bool,
    pub enable_frag_output_mask: u32,
    pub enable_clip_distance_user_varying: bool,
    pub multi_patch_workgroup: bool,
    pub vertex_for_tessellation: bool,
    pub vertex_index_type: IndexType,
    pub arrayed_subpass_input: bool,
    pub r32ui_linear_texture_alignment: u32,
    pub r32ui_alignment_constant_id: u32,
    pub ios_use_simdgroup_functions: bool,
    pub emulate_subgroups: bool,
    pub fixed_subgroup_size: u32,
    pub force_sample_rate_shading: bool,
    pub ios_support_base_vertex_instance: bool,
    pub raw_buffer_tese_input: bool,
    pub manual_helper_invocation_updates: bool,
    pub check_discarded_frag_stores: bool,
    pub sample_dref_lod_array_as_grad: bool,
    pub readwrite_texture_fences: bool,
    pub replace_recursive_inputs: bool,
    pub agx_manual_cube_grad_fixup: bool,
    pub force_fragment_with_side_effects_execution: bool,
}

Available on crate feature msl only.

MSL compiler options

Fields (Non-exhaustive)§

This struct is marked as non-exhaustive

Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
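As a rough illustration of what this restriction means in practice, the sketch below uses a hypothetical stand-in struct (`DemoOptions`, which is not part of this crate). It assumes the options value can be obtained from a default and then adjusted field by field; whether `CompilerOptions` itself implements `Default` is not stated on this page. The `#[non_exhaustive]` restrictions only apply across crate boundaries, so the stand-in compiles within a single crate.

```rust
// Hypothetical stand-in for a non-exhaustive options struct. The name and
// the choice of fields are illustrative only.
#[non_exhaustive]
#[derive(Debug, Default, Clone, PartialEq)]
pub struct DemoOptions {
    pub argument_buffers: bool,
    pub swizzle_buffer_index: u32,
}

// Start from a default value and assign the fields of interest. Unlike a
// struct literal or struct update syntax, this pattern keeps compiling in
// downstream crates when new fields are added later.
pub fn configure() -> DemoOptions {
    let mut opts = DemoOptions::default();
    opts.argument_buffers = true;
    opts.swizzle_buffer_index = 30;
    opts
}

fn main() {
    let opts = configure();
    println!("{opts:?}");
}
```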

§common: CommonOptions

Compile options common to GLSL, HLSL, and MSL.

§version: MslVersion

The MSL version to compile to.

Defaults to MSL 1.2.

§texel_buffer_texture_width: u32

Width of 2D Metal textures used as 1D texel buffers.
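To make the role of the width concrete, here is a minimal sketch of how a linear texel index could be mapped onto a 2D texture. The row-major mapping and the helper name `texel_buffer_coord` are assumptions for illustration, and the width of 4096 is only an example value.

```rust
/// Maps a linear texel index to (x, y) coordinates in a 2D texture of the
/// given width, assuming a row-major layout (hypothetical helper).
pub fn texel_buffer_coord(index: u32, width: u32) -> (u32, u32) {
    (index % width, index / width)
}

fn main() {
    // 4096 is used here only as an example width.
    let (x, y) = texel_buffer_coord(5000, 4096);
    println!("texel 5000 -> ({x}, {y})");
}
```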

§swizzle_buffer_index: u32

Index of the swizzle buffer.

The default is 30.

§indirect_params_buffer_index: u32

Index of the indirect params buffer.

The default is 29.

§shader_output_buffer_index: u32

Index of the shader output buffer.

The default is 28.

§shader_patch_output_buffer_index: u32

Index of the shader patch output buffer.

The default is 27.

§shader_tess_factor_output_buffer_index: u32

Index of the shader tessellation factor output buffer.

The default is 26.

§buffer_size_buffer_index: u32

Index of the buffer size buffer.

The default is 25.

§view_mask_buffer_index: u32

Index of the view mask buffer.

The default is 24.

§dynamic_offsets_buffer_index: u32

Index of the dynamic offsets buffer.

The default is 23.

§shader_input_buffer_index: u32

Index of the shader input buffer.

The default is 22.

§shader_index_buffer_index: u32

Index of the shader index buffer.

The default is 21.

§shader_patch_input_buffer_index: u32

Index of the shader patch input buffer.

The default is 20.

§shader_input_workgroup_index: u32

Index of the input workgroup index buffer.

The default is 0.
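The implicit buffer indices documented above default to the range 30 down to 20, so an application that binds its own buffers in that range may collide with them. The following sketch checks for such overlaps. The table copies the defaults listed on this page; the helper `conflicts` is hypothetical and not part of the crate.

```rust
use std::collections::HashSet;

/// Documented default indices for the implicit buffers, keyed by field name.
pub const DEFAULT_IMPLICIT_BUFFERS: [(&str, u32); 11] = [
    ("swizzle_buffer_index", 30),
    ("indirect_params_buffer_index", 29),
    ("shader_output_buffer_index", 28),
    ("shader_patch_output_buffer_index", 27),
    ("shader_tess_factor_output_buffer_index", 26),
    ("buffer_size_buffer_index", 25),
    ("view_mask_buffer_index", 24),
    ("dynamic_offsets_buffer_index", 23),
    ("shader_input_buffer_index", 22),
    ("shader_index_buffer_index", 21),
    ("shader_patch_input_buffer_index", 20),
];

/// Returns the names of implicit buffers whose default index is also used
/// by the application (hypothetical helper).
pub fn conflicts(user_indices: &[u32]) -> Vec<&'static str> {
    let used: HashSet<u32> = user_indices.iter().copied().collect();
    DEFAULT_IMPLICIT_BUFFERS
        .iter()
        .filter(|(_, index)| used.contains(index))
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    for name in conflicts(&[0, 1, 30]) {
        println!("application buffer collides with {name}");
    }
}
```

If a collision is reported, either the application binding or the corresponding index field can be moved.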

§enable_point_size_builtin: bool

Enable the point_size builtin.

§enable_frag_depth_builtin: bool

Enable the FragDepth builtin.

Disable if pipeline does not enable depth, as pipeline creation might otherwise fail.

§enable_frag_stencil_ref_builtin: bool

Enable the FragStencilRef output.

Disable if pipeline does not enable stencil output, as pipeline creation might otherwise fail.

§disable_rasterization: bool

§capture_output_to_buffer: bool

Writes geometry varyings to a buffer instead of as stage-outputs.

§swizzle_texture_samples: bool

Works around lack of support for VkImageView component swizzles. Recent Metal versions do not require this workaround. This has a massive impact on performance and bloat.

Do not use this unless you are absolutely forced to.

To use this feature, the API side must pass down swizzle buffers. Should only be used by translation layers as a last resort.

§pad_fragment_output_components: bool

Always emit color outputs as 4-component variables.

In Metal, the fragment shader must emit at least as many components as the render target format.

§tess_domain_origin_lower_left: bool

Use a lower-left tessellation domain.

§platform: MetalPlatform

The platform to output MSL for. Defaults to macOS.

§argument_buffers: bool

Enable use of Metal argument buffers.

MSL 2.0 or higher must be used.

§argument_buffers_tier: ArgumentBuffersTier

Defines the Metal argument buffer tier level. Uses the same values as the Metal MTLArgumentBuffersTier enumeration.

§texture_buffer_native: bool

Use the native support for texel buffers. Requires MSL 2.1.
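Both argument_buffers and texture_buffer_native carry a minimum MSL version, so it can be useful to validate a configuration before compiling. The sketch below is one way to do that. It models the version as a plain `(major, minor)` tuple, which is a hypothetical stand-in for `MslVersion`, and it reads "Requires MSL 2.1" as "2.1 or higher", which is an assumption.

```rust
/// MSL version as (major, minor). A hypothetical stand-in for `MslVersion`.
pub type DemoVersion = (u32, u32);

/// Checks the documented version requirements of two options
/// (hypothetical helper, not part of the crate).
pub fn check_version(
    version: DemoVersion,
    argument_buffers: bool,
    texture_buffer_native: bool,
) -> Result<(), String> {
    // Tuples compare lexicographically, so (1, 2) < (2, 0) < (2, 1).
    if argument_buffers && version < (2, 0) {
        return Err("argument_buffers requires MSL 2.0 or higher".to_string());
    }
    if texture_buffer_native && version < (2, 1) {
        return Err("texture_buffer_native requires MSL 2.1".to_string());
    }
    Ok(())
}

fn main() {
    match check_version((1, 2), true, false) {
        Ok(()) => println!("options are consistent"),
        Err(e) => println!("invalid options: {e}"),
    }
}
```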

§multiview: bool

Enable SPV_KHR_multiview emulation.

§multiview_layered_rendering: bool

If disabled, don’t set [[render_target_array_index]] in multiview shaders.

Useful for devices which don’t support layered rendering.

Only effective when CompilerOptions::multiview is enabled.

§device_index: u32

The index of the device.

§view_index_from_device_index: bool

Treat the view index as the device index instead. For multi-GPU rendering.

§dispatch_base: bool

Add support for vkCmdDispatchBase() or similar APIs.

Offsets the workgroup ID based on a buffer.

§texture_1d_as_2d: bool

Emit Image variables of dimension Dim1D as texture2d.

In Metal, 1D textures do not support all features that 2D textures do.

Use this option if your code relies on these features.

§enable_base_index_zero: bool

Ensures vertex and instance indices start at zero.

This reflects the behavior of HLSL with SV_VertexID and SV_InstanceID.

§framebuffer_fetch_subpass: bool

Use Metal’s native frame-buffer fetch API for subpass inputs.

§invariant_fp_math: bool

Enables use of the “fma” intrinsic for invariant float math.

§emulate_cubemap_array: bool

Emulate texturecube_array with texture2d_array for iOS, where this type is not available.

§enable_decoration_binding: bool

Allow the user to enable decoration binding.

§force_active_argument_buffer_resources: bool

Forces all resources which are part of an argument buffer to be considered active.

This ensures ABI compatibility between shaders where some resources might be unused, and would otherwise declare a different ABI.

§force_native_arrays: bool

Forces the use of plain arrays, which works around certain driver bugs on certain versions of Intel MacBooks.

See https://github.com/KhronosGroup/SPIRV-Cross/issues/1210. May reduce performance in scenarios where arrays are copied around as value-types.

§enable_frag_output_mask: u32

Only selectively enable fragment outputs.

Useful if pipeline does not enable fragment output for certain locations, as pipeline creation might otherwise fail.
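A minimal sketch of building such a mask follows. It assumes that bit N of the mask corresponds to fragment output location N, which this page does not state explicitly; the helper names are hypothetical.

```rust
/// Builds a mask with one bit set per enabled output location.
/// Assumes bit N corresponds to location N; locations >= 32 are ignored.
pub fn frag_output_mask(locations: &[u32]) -> u32 {
    locations
        .iter()
        .filter(|&&loc| loc < 32)
        .fold(0, |mask, &loc| mask | (1u32 << loc))
}

/// Reports whether a location is enabled in the mask.
pub fn is_enabled(mask: u32, location: u32) -> bool {
    location < 32 && (mask & (1u32 << location)) != 0
}

fn main() {
    // Enable only locations 0 and 2.
    let mask = frag_output_mask(&[0, 2]);
    println!("mask = {mask:#b}, location 1 enabled: {}", is_enabled(mask, 1));
}
```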

§enable_clip_distance_user_varying: bool

If a shader writes clip distance, also emit user varyings which can be read in subsequent stages.

§multi_patch_workgroup: bool

In a tessellation control shader, assume that more than one patch can be processed in a single workgroup. This requires changes to the way the InvocationId and PrimitiveId builtins are processed, but should result in more efficient usage of the GPU.

§vertex_for_tessellation: bool

If set, a vertex shader will be compiled as part of a tessellation pipeline. It will be translated as a compute kernel, so it can use the global invocation ID to index the output buffer.

§vertex_index_type: IndexType

The type of index in the index buffer, if present. For a compute shader, Metal requires specifying the indexing at pipeline creation, rather than at draw time as with graphics pipelines. This means we must create three different pipelines, for no indexing, 16-bit indices, and 32-bit indices. Each requires different handling for the gl_VertexIndex builtin. We may as well, then, create three different shaders for these three scenarios.
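The three scenarios can be enumerated as in the sketch below. `DemoIndexType` is a hypothetical stand-in for `IndexType`, whose actual variant names are not shown on this page; the loop in `main` only marks where one shader per variant would be compiled.

```rust
/// Hypothetical stand-in for `IndexType`; the variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DemoIndexType {
    None,
    U16,
    U32,
}

/// The three indexing scenarios described above.
pub const ALL_INDEX_TYPES: [DemoIndexType; 3] =
    [DemoIndexType::None, DemoIndexType::U16, DemoIndexType::U32];

/// Size of one index in bytes, or `None` when no index buffer is used.
pub fn index_size_bytes(ty: DemoIndexType) -> Option<u32> {
    match ty {
        DemoIndexType::None => None,
        DemoIndexType::U16 => Some(2),
        DemoIndexType::U32 => Some(4),
    }
}

fn main() {
    // One shader (and one pipeline) would be produced per variant.
    for ty in ALL_INDEX_TYPES {
        println!("compile variant {ty:?}, index size {:?}", index_size_bytes(ty));
    }
}
```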

§arrayed_subpass_input: bool

Assume that SubpassData images have multiple layers. Layered input attachments are addressed relative to the Layer output from the vertex pipeline. This option has no effect with multiview, since all input attachments are assumed to be layered and will be addressed using the current ViewIndex.

§r32ui_linear_texture_alignment: u32

The required alignment of linear textures of format MTLPixelFormatR32Uint.

This is used to align the row stride for atomic accesses to such images.
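As an illustration of row stride alignment, the sketch below rounds the byte size of one row up to the next multiple of the alignment. The rounding scheme and the helper name are assumptions for illustration, not necessarily what the generated MSL does; the 4 bytes per texel follow from the 32-bit R32Uint format.

```rust
/// Rounds the byte size of one row of R32Uint texels up to a multiple of
/// `alignment` (hypothetical helper; `alignment` must be nonzero).
pub fn aligned_row_stride(width_texels: u32, alignment: u32) -> u32 {
    let row_bytes = width_texels * 4; // 4 bytes per 32-bit texel
    (row_bytes + alignment - 1) / alignment * alignment
}

fn main() {
    // A 10-texel row occupies 40 bytes, padded here to a 16-byte boundary.
    println!("stride = {}", aligned_row_stride(10, 16));
}
```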

§r32ui_alignment_constant_id: u32

The function constant ID to use for the linear texture alignment.

On MSL 1.2 or later, you can override the alignment by setting this function constant.

§ios_use_simdgroup_functions: bool

Whether to use SIMD-group or quadgroup functions to implement group non-uniform operations. Some GPUs on iOS do not support the SIMD-group functions, only the quadgroup functions.

§emulate_subgroups: bool

If set, the subgroup size will be assumed to be one, and subgroup-related builtins and operations will be emitted accordingly.

This mode is intended to be used by MoltenVK on hardware/software configurations which do not provide sufficient support for subgroups.

§fixed_subgroup_size: u32

If nonzero, a fixed subgroup size to assume. Metal, similarly to VK_EXT_subgroup_size_control, allows the SIMD-group size (aka thread execution width) to vary depending on register usage and requirements.

In certain circumstances (for example, a pipeline in MoltenVK without VK_PIPELINE_SHADER_STAGE_CREATE_ALLOW_VARYING_SUBGROUP_SIZE_BIT_EXT), this is undesirable. This option fixes the value of the SubgroupSize builtin, instead of mapping it to the Metal builtin [[thread_execution_width]]. If the thread execution width is reduced, the extra invocations will appear to be inactive.

If zero, the SubgroupSize will be allowed to vary, and the builtin will be mapped to the Metal [[thread_execution_width]] builtin.
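The zero versus nonzero convention can be summarized in a small sketch. The enum and function names below are hypothetical and only model the behavior described above.

```rust
/// How the SubgroupSize builtin is resolved (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SubgroupSize {
    /// A fixed value is emitted for SubgroupSize.
    Fixed(u32),
    /// SubgroupSize maps to the Metal [[thread_execution_width]] builtin.
    Varying,
}

/// Interprets a `fixed_subgroup_size` value: zero means "allowed to vary".
pub fn resolve_subgroup_size(fixed_subgroup_size: u32) -> SubgroupSize {
    if fixed_subgroup_size == 0 {
        SubgroupSize::Varying
    } else {
        SubgroupSize::Fixed(fixed_subgroup_size)
    }
}

fn main() {
    println!("{:?}", resolve_subgroup_size(0));
    println!("{:?}", resolve_subgroup_size(32));
}
```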

§force_sample_rate_shading: bool

If set, a dummy [[sample_id]] input is added to a fragment shader if none is present.

This will force the shader to run at sample rate, assuming Metal does not optimize the extra threads away.

§ios_support_base_vertex_instance: bool

Specifies whether the iOS target version supports the [[base_vertex]] and [[base_instance]] attributes.

§raw_buffer_tese_input: bool

Use storage buffers instead of vertex-style attributes for tessellation evaluation input.

This may require conversion of inputs in the generated post-tessellation vertex shader, but allows the use of nested arrays.

§manual_helper_invocation_updates: bool

If set, gl_HelperInvocation will be set manually whenever a fragment is discarded. Some Metal devices have a bug where simd_is_helper_thread() does not return true after a fragment has been discarded.

This is a workaround that is only expected to be needed until the bug is fixed in Metal; it is provided as an option to allow disabling it when that occurs.

§check_discarded_frag_stores: bool

If set, extra checks will be emitted in fragment shaders to prevent writes from discarded fragments. Some Metal devices have a bug where writes to storage resources from discarded fragment threads continue to occur, despite the fragment being discarded.

This is a workaround that is only expected to be needed until the bug is fixed in Metal; it is provided as an option so it can be enabled only when the bug is present.

§sample_dref_lod_array_as_grad: bool

If set, Lod operands to OpImageSample*DrefExplicitLod for 1D and 2D array images will be implemented using a gradient instead of passing the level operand directly.

Some Metal devices have a bug where the level() argument to depth2d_array<T>::sample_compare() in a fragment shader is biased by some unknown amount, possibly dependent on the partial derivatives of the texture coordinates.

This is a workaround that is only expected to be needed until the bug is fixed in Metal; it is provided as an option so it can be enabled only when the bug is present.

§readwrite_texture_fences: bool

MSL doesn’t guarantee coherence between writes and subsequent reads of read_write textures. This inserts fences before each read of a read_write texture to ensure coherency. If you’re sure you never rely on this, you can set this to false for a possible performance improvement. Note: Only Apple’s GPU compiler takes advantage of the lack of coherency, so make sure to test on Apple GPUs if you disable this.

§replace_recursive_inputs: bool

Metal 3.1 introduced a regression that causes infinite recursion during Metal’s analysis of an entry point input structure that is itself recursive. Enabling this option replaces the recursive input declaration with an alternate variable of type void*, which is then cast to the correct type at the top of the entry point function. The bug has been reported to Apple and will hopefully be fixed in a future release.

§agx_manual_cube_grad_fixup: bool

If set, manual fixups of gradient vectors for cube texture lookups will be performed. All released Apple Silicon GPUs to date behave incorrectly when sampling a cube texture with explicit gradients. They will ignore one of the three partial derivatives based on the selected major axis, and expect the remaining derivatives to be partially transformed.

§force_fragment_with_side_effects_execution: bool

Metal will prematurely discard fragments with side effects under certain circumstances. Example: the CTS test dEQP-VK.fragment_operations.early_fragment.discard_no_early_fragment_tests_depth renders a full-screen quad with varying depth [0,1] for each fragment. Each fragment performs an operation with side effects, modifies the depth value, and discards the fragment. The test expects the fragment to be run because of https://registry.khronos.org/vulkan/specs/1.0-extensions/html/vkspec.html#fragops-shader-depthreplacement, which states that the fragment shader must be run when it replaces the depth value.

However, Metal may prematurely discard fragments without executing them (I believe this to be due to a greedy optimization on their end), making the test fail.

This option enforces fragment execution for such cases where the fragment has operations with side effects. It is provided as an option in the hope that Metal will fix this issue in the future.