Struct CudaEncodeStageAccelerator 

pub struct CudaEncodeStageAccelerator { /* private fields */ }
CUDA implementation of selected JPEG 2000 encode stages.

Implementations

impl CudaEncodeStageAccelerator

pub fn with_profile_collection(collect_profile: bool) -> Self

Create an encode-stage accelerator with optional CUDA stage timing collection.

pub fn for_auto_host_output() -> Self

Create the measured Auto route for host-output HTJ2K encode.

CUDA keeps the DWT and HT code-block stages, while forward RCT and Tier-2 packetization stay on the CPU for the current host-pixel path.
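The split described above can be read as a small routing table. The sketch below is illustrative only: `Stage`, `Backend`, and `auto_host_output_backend` are hypothetical names, not items from this crate, and the mapping only restates the sentence above for the four stages it mentions.

```rust
/// Hypothetical labels for the four stages named in the description above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Stage {
    ForwardRct,
    Dwt,
    HtCodeBlock,
    Tier2Packetization,
}

/// Hypothetical label for where a stage runs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Cuda,
    Cpu,
}

/// Restates the documented Auto route for host-output HTJ2K encode:
/// DWT and HT code-block coding on CUDA, forward RCT and Tier-2
/// packetization on the CPU.
pub fn auto_host_output_backend(stage: Stage) -> Backend {
    match stage {
        Stage::Dwt | Stage::HtCodeBlock => Backend::Cuda,
        Stage::ForwardRct | Stage::Tier2Packetization => Backend::Cpu,
    }
}
```

The description ties this split to the current host-pixel path, so it may not hold for other input paths.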

pub fn prefer_cpu_forward_rct(self, prefer_cpu_forward_rct: bool) -> Self

Prefer scalar CPU forward RCT while keeping later CUDA stages enabled.

pub fn prefer_cpu_packetization(self, prefer_cpu_packetization: bool) -> Self

Prefer scalar CPU Tier-2 packetization while keeping CUDA Tier-1/HT block coding enabled.

This is useful for batches of many small tiles where launching a CUDA packetization kernel and copying several tiny descriptor buffers per tile costs more than forming the packet body on the host.

pub fn prefer_cpu_ht_subband(self, prefer_cpu_ht_subband: bool) -> Self

Prefer host sub-band quantization while keeping batched CUDA HT code-block encode enabled.

This avoids launching one CUDA quantize/sub-band path for every prepared sub-band in multi-resolution precomputed transcode outputs, where the many tiny launches cost more than CPU quantization.

pub fn prefer_cpu_quantize_subband(self, prefer_cpu_quantize_subband: bool) -> Self

Prefer host sub-band quantization while keeping CUDA HT code-block encode enabled.

Multi-resolution transcode workloads can contain thousands of small sub-bands; for those, CPU quantization plus one batched HT code-block encode per tile is currently faster than launching CUDA quantization for every sub-band.
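The `prefer_cpu_*` methods take `self` and return `Self`, so they can be chained builder-style after a constructor. The following is a minimal sketch of that pattern using a stand-in type, since the real struct has private fields and its crate path is not shown on this page. `AcceleratorSketch` and `small_tile_config` are hypothetical; only the method names and signatures mirror the ones documented above.

```rust
/// Stand-in for illustration only. The real accelerator's fields are private,
/// so these flags are assumptions about what the builder methods record.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct AcceleratorSketch {
    pub collect_profile: bool,
    pub prefer_cpu_forward_rct: bool,
    pub prefer_cpu_packetization: bool,
    pub prefer_cpu_quantize_subband: bool,
}

impl AcceleratorSketch {
    /// Mirrors `with_profile_collection(collect_profile: bool) -> Self`.
    pub fn with_profile_collection(collect_profile: bool) -> Self {
        Self { collect_profile, ..Self::default() }
    }

    /// Mirrors `prefer_cpu_forward_rct(self, bool) -> Self`.
    pub fn prefer_cpu_forward_rct(mut self, prefer: bool) -> Self {
        self.prefer_cpu_forward_rct = prefer;
        self
    }

    /// Mirrors `prefer_cpu_packetization(self, bool) -> Self`.
    pub fn prefer_cpu_packetization(mut self, prefer: bool) -> Self {
        self.prefer_cpu_packetization = prefer;
        self
    }

    /// Mirrors `prefer_cpu_quantize_subband(self, bool) -> Self`.
    pub fn prefer_cpu_quantize_subband(mut self, prefer: bool) -> Self {
        self.prefer_cpu_quantize_subband = prefer;
        self
    }
}

/// Hypothetical configuration for a batch of many small tiles: keep the
/// later CUDA stages, but do packetization and quantization on the host.
pub fn small_tile_config() -> AcceleratorSketch {
    AcceleratorSketch::with_profile_collection(true)
        .prefer_cpu_packetization(true)
        .prefer_cpu_quantize_subband(true)
}
```

Because each method consumes and returns the value, a flag that is never mentioned in the chain keeps whatever the constructor set.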

pub const fn collected_stage_timings(&self) -> CudaEncodeStageTimings

Return cumulative CUDA encode stage timings collected by this accelerator.

pub fn reset_collected_stage_timings(&mut self)

Clear cumulative CUDA encode stage timings without changing dispatch counters.
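The distinction between timings and dispatch counters can be sketched with a stand-in type. Everything named below (`StageTimingsSketch`, its `ht_code_block` field, `ProfiledSketch`, `record_ht_dispatch`) is hypothetical; the fields of `CudaEncodeStageTimings` are not shown on this page. The sketch only illustrates the documented behavior that a reset clears timings and leaves the counters alone.

```rust
use std::time::Duration;

/// Hypothetical stand-in for `CudaEncodeStageTimings`; the real fields are unknown.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct StageTimingsSketch {
    pub ht_code_block: Duration,
}

/// Hypothetical stand-in that tracks one timing and one dispatch counter.
#[derive(Debug, Default)]
pub struct ProfiledSketch {
    timings: StageTimingsSketch,
    ht_code_block_dispatches: usize,
}

impl ProfiledSketch {
    /// Record one dispatch and add its elapsed time to the cumulative total.
    pub fn record_ht_dispatch(&mut self, elapsed: Duration) {
        self.timings.ht_code_block += elapsed;
        self.ht_code_block_dispatches += 1;
    }

    /// Return a copy of the cumulative timings.
    pub fn collected_stage_timings(&self) -> StageTimingsSketch {
        self.timings
    }

    /// Clear the timings only; the dispatch counter is deliberately untouched.
    pub fn reset_collected_stage_timings(&mut self) {
        self.timings = StageTimingsSketch::default();
    }

    pub fn ht_code_block_dispatches(&self) -> usize {
        self.ht_code_block_dispatches
    }
}
```

Under this reading, a caller could reset timings between benchmark phases and still compare dispatch counts across the whole run.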

pub fn deinterleave_attempts(&self) -> usize

Number of deinterleave attempts observed.

pub fn forward_rct_attempts(&self) -> usize

Number of forward RCT attempts observed.

pub fn forward_ict_attempts(&self) -> usize

Number of forward ICT attempts observed.

pub fn forward_dwt53_attempts(&self) -> usize

Number of forward 5/3 DWT attempts observed.

pub fn forward_dwt97_attempts(&self) -> usize

Number of forward 9/7 DWT attempts observed.

pub fn quantize_subband_attempts(&self) -> usize

Number of sub-band quantization attempts observed.

pub fn tier1_code_block_attempts(&self) -> usize

Number of classic Tier-1 code-block attempts observed.

pub fn ht_code_block_attempts(&self) -> usize

Number of HT code-block attempts observed.

pub fn packetization_attempts(&self) -> usize

Number of packetization attempts observed.

pub fn deinterleave_dispatches(&self) -> usize

Number of deinterleave CUDA dispatches.

pub fn forward_rct_dispatches(&self) -> usize

Number of forward RCT CUDA dispatches.

pub fn forward_ict_dispatches(&self) -> usize

Number of forward ICT CUDA dispatches.

pub fn forward_dwt53_dispatches(&self) -> usize

Number of forward 5/3 DWT CUDA dispatches.

pub fn forward_dwt97_dispatches(&self) -> usize

Number of forward 9/7 DWT CUDA dispatches.

pub fn quantize_subband_dispatches(&self) -> usize

Number of sub-band quantization CUDA dispatches.

pub fn tier1_code_block_dispatches(&self) -> usize

Number of classic Tier-1 CUDA dispatches.

pub fn ht_code_block_dispatches(&self) -> usize

Number of HT code-block CUDA dispatches.

pub fn packetization_dispatches(&self) -> usize

Number of packetization CUDA dispatches.
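Each stage exposes an `*_attempts` counter and a matching `*_dispatches` counter, so the two can be compared to see how often a stage actually ran on CUDA. The helper below is a sketch under an assumption this page does not state: that an attempt which did not become a dispatch was handled elsewhere (for example on the CPU). `dispatch_ratio` is a hypothetical name and works on plain `usize` values read from those getters.

```rust
/// Fraction of attempts that resulted in a CUDA dispatch.
/// Returns `None` when there were no attempts, to avoid dividing by zero.
pub fn dispatch_ratio(attempts: usize, dispatches: usize) -> Option<f64> {
    if attempts == 0 {
        None
    } else {
        Some(dispatches as f64 / attempts as f64)
    }
}
```

For example, the values of `quantize_subband_attempts()` and `quantize_subband_dispatches()` could be passed in directly; a low ratio after enabling `prefer_cpu_quantize_subband(true)` would be consistent with quantization staying on the host.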

Trait Implementations

impl Clone for CudaEncodeStageAccelerator

fn clone(&self) -> CudaEncodeStageAccelerator

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for CudaEncodeStageAccelerator

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for CudaEncodeStageAccelerator

fn default() -> CudaEncodeStageAccelerator

Returns the “default value” for a type.

impl J2kEncodeStageAccelerator for CudaEncodeStageAccelerator

fn dispatch_report(&self) -> J2kEncodeDispatchReport

Report cumulative backend dispatches completed by this accelerator.

fn encode_deinterleave(&mut self, job: J2kDeinterleaveToF32Job<'_>) -> Result<Option<Vec<Vec<f32>>>, &'static str>

Optionally deinterleave interleaved pixel bytes into f32 component planes.

fn encode_forward_rct(&mut self, job: J2kForwardRctJob<'_>) -> Result<bool, &'static str>

Optionally apply forward RCT in place.

fn encode_forward_ict(&mut self, job: J2kForwardIctJob<'_>) -> Result<bool, &'static str>

Optionally apply forward ICT in place.

fn encode_forward_dwt53(&mut self, job: J2kForwardDwt53Job<'_>) -> Result<Option<J2kForwardDwt53Output>, &'static str>

Optionally run a forward reversible 5/3 DWT.

fn encode_forward_dwt97(&mut self, job: J2kForwardDwt97Job<'_>) -> Result<Option<J2kForwardDwt97Output>, &'static str>

Optionally run a forward irreversible 9/7 DWT.

fn encode_quantize_subband(&mut self, job: J2kQuantizeSubbandJob<'_>) -> Result<Option<Vec<i32>>, &'static str>

Optionally quantize one sub-band.

fn encode_tier1_code_block(&mut self, _job: J2kTier1CodeBlockEncodeJob<'_>) -> Result<Option<EncodedJ2kCodeBlock>, &'static str>

Optionally encode one classic Tier-1 code-block.

fn encode_ht_code_block(&mut self, job: J2kHtCodeBlockEncodeJob<'_>) -> Result<Option<EncodedHtJ2kCodeBlock>, &'static str>

Optionally encode one HTJ2K code-block.

fn encode_ht_code_blocks(&mut self, jobs: &[J2kHtCodeBlockEncodeJob<'_>]) -> Result<Option<Vec<EncodedHtJ2kCodeBlock>>, &'static str>

Optionally encode multiple HTJ2K code-blocks in one backend dispatch.

fn encode_htj2k_tile(&mut self, job: J2kHtj2kTileEncodeJob<'_>) -> Result<Option<Vec<u8>>, &'static str>

Optionally encode the complete HTJ2K tile packet body.

fn encode_ht_subband(&mut self, job: J2kHtSubbandEncodeJob<'_>) -> Result<Option<Vec<EncodedHtJ2kCodeBlock>>, &'static str>

Optionally quantize and encode one HTJ2K cleanup-only sub-band.

fn encode_packetization(&mut self, job: J2kPacketizationEncodeJob<'_>) -> Result<Option<Vec<u8>>, &'static str>

Optionally packetize prepared packet contributions.

fn encode_tier1_code_blocks(&mut self, _jobs: &[J2kTier1CodeBlockEncodeJob<'_>]) -> Result<Option<Vec<EncodedJ2kCodeBlock>>, &'static str>

Optionally encode multiple classic Tier-1 code-blocks in one backend dispatch.

fn prefer_parallel_cpu_code_block_fallback(&self) -> bool

Return whether native CPU code-block fallback should use internal rayon parallelism.

fn prefer_parallel_cpu_tile_encode(&self) -> bool

Return whether whole-tile CPU-only batch encode may be parallelized by callers.
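Most of the `encode_*` methods above return `Result<Option<T>, &'static str>` and are documented as acting "optionally". One plausible reading, which this page does not spell out and is therefore an assumption, is: `Ok(Some(_))` means the backend produced the result, `Ok(None)` means it declined and the caller should use its own path, and `Err(_)` is a backend failure. The sketch below shows a caller handling those three outcomes. `quantize_with_fallback` and its closure parameters are hypothetical and stand in for a real accelerator call and a real CPU routine.

```rust
/// Try an accelerated quantization first; fall back to a CPU routine when
/// the accelerator declines. Errors from the accelerator are propagated.
///
/// `accelerated` stands in for a call shaped like `encode_quantize_subband`,
/// and `cpu` for whatever host implementation the caller already has.
pub fn quantize_with_fallback<A, C>(accelerated: A, cpu: C) -> Result<Vec<i32>, &'static str>
where
    A: FnOnce() -> Result<Option<Vec<i32>>, &'static str>,
    C: FnOnce() -> Vec<i32>,
{
    match accelerated()? {
        // The backend handled the job.
        Some(coefficients) => Ok(coefficients),
        // The backend declined (assumed meaning of `None`): use the CPU path.
        None => Ok(cpu()),
    }
}
```

The methods that return `Result<bool, &'static str>` (forward RCT and ICT, which work in place) would follow the same shape with `true`/`false` in place of `Some`/`None`, again under the same assumption.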

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, S> SimdFrom<T, S> for T
where S: Simd,

fn simd_from(_simd: S, value: T) -> T

impl<F, T, S> SimdInto<T, S> for F
where T: SimdFrom<F, S>, S: Simd,

fn simd_into(self, simd: S) -> T

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.