Feed and sitemap detection for halldyll-parser
This module handles detection and extraction of:
- RSS feeds
- Atom feeds
- Sitemap XML
- Sitemap index
- JSON Feed
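As a rough illustration of how the feed formats listed above are usually told apart in HTML, the sketch below maps the `type` attribute of a `<link rel="alternate">` tag to a feed kind. The `FeedKind` enum and `feed_kind_from_mime` function are hypothetical names invented for this example; they are not taken from halldyll-parser, whose own `FeedType` enum and detection logic may differ.

```rust
/// Hypothetical feed kinds for this sketch; the crate's own `FeedType` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FeedKind {
    Rss,
    Atom,
    Json,
}

/// Map the `type` attribute of a `<link rel="alternate">` tag to a feed kind.
/// Returns `None` for MIME types that do not identify a feed.
fn feed_kind_from_mime(mime: &str) -> Option<FeedKind> {
    // Drop parameters such as "; charset=utf-8" and normalise case.
    let essence = mime
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();

    match essence.as_str() {
        "application/rss+xml" => Some(FeedKind::Rss),
        "application/atom+xml" => Some(FeedKind::Atom),
        "application/feed+json" => Some(FeedKind::Json),
        _ => None,
    }
}
```

Matching on the MIME type alone is an assumption of this sketch; a real detector may also inspect the `rel` attribute or the link's URL.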
Structs§
- Feed - A web feed (RSS, Atom, or JSON)
- FeedInfo - All feeds and sitemaps found on a page
- Sitemap - A sitemap reference
Enums§
- FeedType - Type of web feed
- SitemapSource - How the sitemap was discovered
- SitemapType - Type of sitemap
Constants§
- COMMON_FEED_PATHS - Common feed paths to check
- COMMON_SITEMAP_PATHS - Common sitemap paths to check
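The idea behind a list of common paths can be sketched as follows: each path is joined onto a site's base URL to produce candidate URLs worth probing. The path values in `EXAMPLE_FEED_PATHS` and the `candidate_urls` helper are assumptions made for this example; the actual contents of COMMON_FEED_PATHS and COMMON_SITEMAP_PATHS are not shown on this page.

```rust
/// Hypothetical path list for illustration only; the real constant's values
/// are not documented here.
const EXAMPLE_FEED_PATHS: &[&str] = &["/feed", "/rss.xml", "/atom.xml"];

/// Join each path onto `base`, avoiding doubled or missing slashes.
fn candidate_urls(base: &str, paths: &[&str]) -> Vec<String> {
    // Strip trailing slashes from the base so exactly one separator is added.
    let base = base.trim_end_matches('/');
    paths
        .iter()
        .map(|path| format!("{}/{}", base, path.trim_start_matches('/')))
        .collect()
}
```

Plain string joining keeps the sketch dependency-free; code that must handle query strings or relative bases would more likely use a URL parsing library.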
Functions§
- extract_feed_info - Extract all feed and sitemap information from an HTML document
- generate_feed_urls - Generate potential feed URLs for a domain
- generate_sitemap_urls - Generate potential sitemap URLs for a domain
- get_atom_feed - Get the Atom feed URL, if one exists
- get_feed - Get any feed URL (prefers Atom over RSS)
- get_rss_feed - Get the RSS feed URL, if one exists
- get_sitemap - Get the sitemap URL, if one is found in the document
- has_feeds - Check whether the document has any feeds
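The "prefers Atom over RSS" behaviour described for get_feed can be illustrated with a small selection routine. This is only a sketch of the stated preference: `Kind`, `DiscoveredFeed`, and `preferred_feed` are hypothetical names, and the signatures of the crate's own functions are not shown on this page.

```rust
/// Hypothetical feed kind for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Kind {
    Rss,
    Atom,
}

/// Hypothetical record of a feed discovered on a page.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DiscoveredFeed {
    url: String,
    kind: Kind,
}

/// Return the first Atom feed if there is one, otherwise the first RSS feed.
fn preferred_feed(feeds: &[DiscoveredFeed]) -> Option<&DiscoveredFeed> {
    feeds
        .iter()
        .find(|feed| feed.kind == Kind::Atom)
        // Fall back to RSS only when no Atom feed was found.
        .or_else(|| feeds.iter().find(|feed| feed.kind == Kind::Rss))
}
```

Given a page that advertises both an RSS and an Atom feed, this routine returns the Atom entry regardless of the order in which the feeds were discovered, and returns `None` when the list is empty.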