Struct ReadabilityOptions 

pub struct ReadabilityOptions {
    pub max_elems_to_parse: Option<usize>,
    pub nb_top_candidates: Option<usize>,
    pub char_threshold: Option<usize>,
    pub classes_to_preserve: Option<Vec<String>>,
    pub keep_classes: Option<bool>,
    pub disable_jsonld: Option<bool>,
    pub link_density_modifier: Option<f32>,
}

Configuration options for content extraction.

Created with ReadabilityOptions::new and used with Readability::parse_with_options.

See also: Readability::parse for basic extraction without options.

§Examples

use readability_js::ReadabilityOptions;

// Fine-tuned for news sites
let opts = ReadabilityOptions::new()
    .char_threshold(500)        // Require more content
    .nb_top_candidates(10)      // Consider more candidates
    .keep_classes(true)         // Preserve CSS classes
    .classes_to_preserve(vec!["highlight".into(), "code".into()]);

Fields§

§max_elems_to_parse: Option<usize>
§nb_top_candidates: Option<usize>
§char_threshold: Option<usize>
§classes_to_preserve: Option<Vec<String>>
§keep_classes: Option<bool>
§disable_jsonld: Option<bool>
§link_density_modifier: Option<f32>

Implementations§

impl ReadabilityOptions

pub fn new() -> Self

Creates a new options builder with default values.
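
The builder shape can be illustrated with a small stand-in type. This is only a sketch: `OptionsSketch` is a hypothetical type that mirrors two of the documented fields, and the assumption that `new()` leaves every field unset (`None`) is not confirmed by this page.

```rust
// Hypothetical stand-in that mirrors two of the documented fields.
// It is NOT the readability_js type; it only illustrates the builder shape.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct OptionsSketch {
    pub char_threshold: Option<usize>,
    pub keep_classes: Option<bool>,
}

impl OptionsSketch {
    /// Assumption: `new()` starts with every field unset (`None`).
    pub fn new() -> Self {
        Self::default()
    }

    /// Each setter takes `self` by value and returns it, so calls can be chained.
    pub fn char_threshold(mut self, val: usize) -> Self {
        self.char_threshold = Some(val);
        self
    }

    /// Same pattern for a boolean option.
    pub fn keep_classes(mut self, val: bool) -> Self {
        self.keep_classes = Some(val);
        self
    }
}
```

Under these assumptions, `OptionsSketch::new().char_threshold(500)` sets only `char_threshold`, and any field that was never set stays `None`.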

pub fn max_elems_to_parse(self, val: usize) -> Self

Set maximum number of DOM elements to parse.

Limits processing to avoid performance issues on very large documents. The default is typically 0 (unlimited).

§Arguments
  • val - Maximum elements to process (0 = unlimited)
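
The "0 = unlimited" convention can be sketched as a simple check. `within_elem_limit` is a hypothetical helper, not part of this crate, and treating a count equal to the cap as acceptable is an assumption.

```rust
/// Hypothetical helper: returns true when a document with `elem_count`
/// elements is within the configured cap. A cap of 0 means "no limit".
/// Assumption: a count exactly equal to the cap is still accepted.
pub fn within_elem_limit(elem_count: usize, max_elems_to_parse: usize) -> bool {
    max_elems_to_parse == 0 || elem_count <= max_elems_to_parse
}
```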

pub fn nb_top_candidates(self, val: usize) -> Self

Set number of top content candidates to consider.

The algorithm identifies potential content containers and ranks them. Higher values may improve accuracy but reduce performance. Default is typically 5.

§Arguments
  • val - Number of candidates to consider (recommended: 5-15)
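
As a rough illustration of "rank candidates and keep the top N", the sketch below sorts scored candidates and truncates the list. `Candidate` and `top_candidates` are hypothetical names; the actual scoring and selection logic of the underlying algorithm is not described on this page.

```rust
/// Hypothetical candidate: a label plus a content score.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub label: String,
    pub score: f64,
}

/// Keep the `n` highest-scoring candidates, best first.
pub fn top_candidates(mut all: Vec<Candidate>, n: usize) -> Vec<Candidate> {
    // `total_cmp` gives a total order over f64, so the comparison is defined even for NaN.
    all.sort_by(|a, b| b.score.total_cmp(&a.score));
    all.truncate(n);
    all
}
```

A larger `n` keeps more containers in play for later comparison, which is where the extra cost mentioned above would come from.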

pub fn char_threshold(self, val: usize) -> Self

Set minimum character threshold for readable content.

Content with fewer characters will fail the readability check. Lower values are more permissive but may include navigation/ads. Default is typically 140 characters.

§Arguments
  • val - Minimum character count (recommended: 50-500)
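
The threshold check can be pictured as a character count comparison. This is a sketch only: `meets_char_threshold` is a hypothetical helper, and trimming surrounding whitespace before counting is an assumption.

```rust
/// Hypothetical helper: true when the extracted text has at least
/// `char_threshold` characters.
/// Assumption: surrounding whitespace is trimmed before counting.
pub fn meets_char_threshold(text: &str, char_threshold: usize) -> bool {
    // Count Unicode scalar values rather than bytes.
    text.trim().chars().count() >= char_threshold
}
```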

pub fn classes_to_preserve(self, val: Vec<String>) -> Self

Specify CSS classes to preserve in the output.

By default, most CSS classes are stripped from the cleaned HTML. Use this to preserve important styling classes.

§Arguments
  • val - Vector of class names to preserve (e.g., vec!["highlight".into()])

pub fn keep_classes(self, val: bool) -> Self

Whether to preserve CSS classes in the output.

When true, CSS classes are preserved in the cleaned HTML. When false (default), most classes are stripped.

§Arguments
  • val - true to preserve classes, false to strip them
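
One way to picture how keep_classes and classes_to_preserve could interact is the sketch below. `filter_classes` is a hypothetical helper, and the rule that keep_classes = true leaves the attribute untouched while false keeps only the listed names is an assumption, not something this page confirms.

```rust
/// Hypothetical helper: decide what remains of a `class` attribute value.
/// Assumption: `keep_classes == true` keeps everything; otherwise only
/// names listed in `classes_to_preserve` survive.
pub fn filter_classes(
    class_attr: &str,
    keep_classes: bool,
    classes_to_preserve: &[String],
) -> String {
    if keep_classes {
        return class_attr.to_string();
    }
    class_attr
        .split_whitespace()
        .filter(|c| classes_to_preserve.iter().any(|p| p.as_str() == *c))
        .collect::<Vec<_>>()
        .join(" ")
}
```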

pub fn disable_jsonld(self, val: bool) -> Self

Disable JSON-LD metadata extraction.

JSON-LD structured data can provide additional article metadata (author, publish date, etc.). Disable this if you don’t need metadata or if it causes issues.

§Arguments
  • val - true to disable JSON-LD parsing, false to enable it

pub fn link_density_modifier(self, val: f32) -> Self

Modify the link density calculation.

Content with high link density is often navigation rather than article content. This modifier adjusts how strictly link density is evaluated. Values > 1.0 are more permissive, < 1.0 are stricter.

§Arguments
  • val - Link density modifier (recommended: 0.5-2.0, default: 1.0)
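
Link density is commonly understood as the share of an element's text that sits inside links. The sketch below computes that ratio and applies a modifier. Both helpers and the `base_threshold` parameter are hypothetical, and scaling the threshold by the modifier is an assumption chosen only to match the "> 1.0 is more permissive" description above; this page does not state how the modifier is actually applied.

```rust
/// Hypothetical helper: fraction of text characters that belong to links.
pub fn link_density(link_text_chars: usize, total_text_chars: usize) -> f32 {
    if total_text_chars == 0 {
        return 0.0; // No text at all: treat as zero density.
    }
    link_text_chars as f32 / total_text_chars as f32
}

/// Hypothetical check. Assumption: the modifier scales a base threshold,
/// so values above 1.0 tolerate more links and values below 1.0 tolerate fewer.
pub fn passes_link_density(density: f32, base_threshold: f32, modifier: f32) -> bool {
    density <= base_threshold * modifier
}
```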

Trait Implementations§

impl Clone for ReadabilityOptions

fn clone(&self) -> ReadabilityOptions

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ReadabilityOptions

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for ReadabilityOptions

fn default() -> ReadabilityOptions

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> ParallelSend for T