pub struct DenoiserConfig {
    pub enabled: bool,
    pub max_digit_ratio: f32,
    pub strip_markdown: bool,
}
Configuration for the OCR denoiser that filters digit-heavy text.
When enabled, text sections that are predominantly numerical (e.g. mangled OCR tables) are stripped down to their alphabetical content on a line-by-line basis, or dropped entirely when no alphabetical content remains.
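The line-by-line behavior described above can be sketched as follows. This is only an illustration, not the crate's implementation: the helper names `strip_to_alphabetic` and `denoise_text` are hypothetical, and the token rule (keep whitespace-separated tokens that contain at least one letter and no digit) is an assumption about what "alphabetical content" means.

```rust
/// Hypothetical helper: keeps the whitespace-separated tokens that contain
/// at least one letter and no ASCII digit (an assumed definition of
/// "alphabetical content"). Returns `None` when nothing remains, which
/// corresponds to dropping the line entirely.
fn strip_to_alphabetic(line: &str) -> Option<String> {
    let kept: Vec<&str> = line
        .split_whitespace()
        .filter(|tok| {
            tok.chars().any(|c| c.is_alphabetic())
                && !tok.chars().any(|c| c.is_ascii_digit())
        })
        .collect();
    if kept.is_empty() {
        None
    } else {
        Some(kept.join(" "))
    }
}

/// Hypothetical driver: applies the stripping line by line. The caller
/// supplies `is_digit_heavy`, the test that decides whether a line counts
/// as predominantly numerical. Other lines pass through unchanged.
fn denoise_text(text: &str, is_digit_heavy: impl Fn(&str) -> bool) -> String {
    text.lines()
        .filter_map(|line| {
            if is_digit_heavy(line) {
                strip_to_alphabetic(line)
            } else {
                Some(line.to_string())
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Under these assumptions, a row such as `Revenue 1,204 3,318 Q3 net` is reduced to `Revenue net`, while a row such as `12 34 56` is dropped.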
Fields
enabled: bool
Whether denoising is active. Defaults to false so existing behavior is unchanged.
max_digit_ratio: f32
Maximum ratio of digit characters to (digit + alphabetical) characters before a
line is considered mangled OCR output. Range: 0.0–1.0.
A value of 0.35 means that if more than 35% of the alphanumeric characters on a
line are digits, the line is treated as a mangled table row and stripped down to
its alphabetical tokens.
Defaults to 0.35.
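As a rough sketch of the threshold check, the ratio could be computed as below. The function names `digit_ratio` and `is_mangled` are hypothetical. Counting only ASCII digits, using a strict "more than" comparison, and never flagging lines without any alphanumeric characters are all assumptions, not documented behavior.

```rust
/// Hypothetical helper: fraction of digits among the digit and letter
/// characters of `line`. Returns `None` when the line contains neither,
/// since the ratio is undefined in that case.
fn digit_ratio(line: &str) -> Option<f32> {
    // Assumption: "digit characters" means ASCII digits 0-9.
    let digits = line.chars().filter(|c| c.is_ascii_digit()).count();
    let letters = line.chars().filter(|c| c.is_alphabetic()).count();
    let total = digits + letters;
    if total == 0 {
        None
    } else {
        Some(digits as f32 / total as f32)
    }
}

/// Hypothetical helper: true when the ratio strictly exceeds
/// `max_digit_ratio`. Assumption: lines with no alphanumeric characters
/// are never flagged.
fn is_mangled(line: &str, max_digit_ratio: f32) -> bool {
    digit_ratio(line).map_or(false, |ratio| ratio > max_digit_ratio)
}
```

With the default of 0.35, `Total 12 34 56` has 6 digits out of 11 alphanumeric characters (about 0.55) and would be flagged, whereas `Chapter 3 overview` has 1 digit out of 16 (about 0.06) and would not.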
strip_markdown: bool
Whether to strip common markdown structural markup: the pipe (|) cell delimiters of
table rows are removed, and layout-only separator rows like |---|---| are dropped.
Currently covers GFM tables; may expand to other structural markers in the future. Semantic text is preserved.
Defaults to true.
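One way the GFM table handling could look is sketched below. The names `is_separator_row` and `strip_table_markup` are hypothetical, and the separator test (a row made only of pipes, dashes, colons and whitespace) is a simplifying assumption, not the full GFM delimiter-row grammar.

```rust
/// Hypothetical helper: true for layout-only rows such as `|---|---|` or
/// `| :--- | ---: |`. Assumption: a separator row contains at least one
/// pipe and one dash, and nothing besides pipes, dashes, colons and
/// whitespace.
fn is_separator_row(line: &str) -> bool {
    let trimmed = line.trim();
    trimmed.contains('|')
        && trimmed.contains('-')
        && trimmed
            .chars()
            .all(|c| matches!(c, '|' | '-' | ':' | ' ' | '\t'))
}

/// Hypothetical helper: returns `None` for separator rows (the row is
/// dropped), and otherwise removes the pipe delimiters while keeping the
/// cell text. Lines without any pipe are returned unchanged.
fn strip_table_markup(line: &str) -> Option<String> {
    if is_separator_row(line) {
        return None;
    }
    if !line.contains('|') {
        return Some(line.to_string());
    }
    let cells: Vec<&str> = line
        .split('|')
        .map(str::trim)
        .filter(|cell| !cell.is_empty())
        .collect();
    Some(cells.join(" "))
}
```

In this sketch the row `| Name | Amount |` becomes `Name Amount`, so the semantic text survives while the layout markers are removed.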
Trait Implementations
impl Clone for DenoiserConfig
fn clone(&self) -> DenoiserConfig
1.0.0 (const: unstable) · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for DenoiserConfig
Auto Trait Implementations
impl Freeze for DenoiserConfig
impl RefUnwindSafe for DenoiserConfig
impl Send for DenoiserConfig
impl Sync for DenoiserConfig
impl Unpin for DenoiserConfig
impl UnsafeUnpin for DenoiserConfig
impl UnwindSafe for DenoiserConfig
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.