Module utils

General utility functions and helper traits for the spider-lib framework.

This module collects miscellaneous functions and extension traits used across the components of spider-lib. They simplify common tasks such as URL manipulation, file system operations, and HTML selector parsing.

Key functionalities include:

  • Normalizing URL origins and checking same-site policies.
  • Ensuring the existence of directories for output files.
  • Conveniently converting strings into scraper::Selector instances.

Traits

ToSelector

Functions

create_dir
Creates a directory and all of its parent components if they are missing.
is_same_site
Checks if two URLs belong to the same site.
normalize_origin
Normalizes the origin of a request’s URL.
validate_output_dir
Validates that the parent directory of a given file path exists, creating it if necessary.
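
The two directory helpers described above can be approximated with the standard library alone. The sketch below is not the spider-lib source: the names `ensure_dir` and `ensure_parent_dir`, their signatures, and the `io::Result<()>` return type are assumptions made for illustration, and the real `create_dir` and `validate_output_dir` may differ in arguments and error handling.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for `create_dir`: creates `dir` and any missing
/// parent components.
fn ensure_dir(dir: &Path) -> io::Result<()> {
    // `create_dir_all` also returns Ok if the directory already exists,
    // so calling this repeatedly is safe.
    fs::create_dir_all(dir)
}

/// Hypothetical stand-in for `validate_output_dir`: makes sure the parent
/// directory of `file_path` exists, creating it if necessary.
fn ensure_parent_dir(file_path: &Path) -> io::Result<()> {
    match file_path.parent() {
        // A bare file name such as "items.json" has an empty parent,
        // meaning the current directory, so there is nothing to create.
        Some(parent) if !parent.as_os_str().is_empty() => ensure_dir(parent),
        _ => Ok(()),
    }
}

fn main() -> io::Result<()> {
    // Illustrative output path under the system temp directory.
    let output = std::env::temp_dir()
        .join("spider_sketch")
        .join("nested")
        .join("items.json");

    ensure_parent_dir(&output)?;
    fs::write(&output, b"[]")?;
    assert!(output.exists());
    Ok(())
}
```

Checking the parent directory before opening the output file turns a late "No such file or directory" write failure into an early, explicit setup step.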