§feedparser-rs: High-performance RSS/Atom/JSON Feed parser
A pure Rust implementation of feed parsing with API compatibility for Python’s feedparser library. It is designed to parse feeds 10-100x faster than the Python library while keeping the same behavior.
§Quick Start
use feedparser_rs::parse;
let xml = r#"
<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Example Feed</title>
<link>https://example.com</link>
<item>
<title>First Post</title>
<link>https://example.com/post/1</link>
</item>
</channel>
</rss>
"#;
let feed = parse(xml.as_bytes()).unwrap();
assert!(!feed.bozo);
assert_eq!(feed.feed.title.as_deref(), Some("Example Feed"));
assert_eq!(feed.entries.len(), 1);
§Supported Formats
| Format | Versions | Detection |
|---|---|---|
| RSS | 0.90, 0.91, 0.92, 2.0 | <rss> element |
| RSS 1.0 | RDF-based | <rdf:RDF> with RSS namespace |
| Atom | 0.3, 1.0 | <feed> with Atom namespace |
| JSON Feed | 1.0, 1.1 | version field starting with https://jsonfeed.org |
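The Detection column can be pictured as a small sniffing routine. What follows is a rough sketch only: it uses plain substring checks rather than a real XML or JSON parser, and `SniffedFormat` and `sniff_format` are hypothetical names invented for illustration. They are not part of feedparser-rs, whose `detect_format` function may work quite differently.

```rust
/// Hypothetical result type for this sketch (not the crate's `FeedVersion`).
#[derive(Debug, PartialEq, Eq)]
enum SniffedFormat {
    Rss,
    Rss10,
    Atom,
    JsonFeed,
    Unknown,
}

/// Guess the feed family from raw text using the markers listed in the table.
/// Assumption: naive substring checks are good enough for illustration.
fn sniff_format(data: &str) -> SniffedFormat {
    let text = data.trim_start();

    // JSON Feed: a JSON object whose version field points at https://jsonfeed.org
    if text.starts_with('{') {
        return if text.contains("\"version\"") && text.contains("https://jsonfeed.org") {
            SniffedFormat::JsonFeed
        } else {
            SniffedFormat::Unknown
        };
    }

    // RSS 1.0: <rdf:RDF> root plus the RSS 1.0 namespace
    if text.contains("<rdf:RDF") && text.contains("http://purl.org/rss/1.0/") {
        return SniffedFormat::Rss10;
    }

    // Other RSS versions: <rss> root element
    if text.contains("<rss") {
        return SniffedFormat::Rss;
    }

    // Atom: <feed> root plus an Atom namespace (1.0 or 0.3)
    let has_atom_ns = text.contains("http://www.w3.org/2005/Atom")
        || text.contains("http://purl.org/atom/ns#");
    if text.contains("<feed") && has_atom_ns {
        return SniffedFormat::Atom;
    }

    SniffedFormat::Unknown
}
```

Checking the RDF marker before the `<rss>` marker keeps the two RSS families apart; anything that matches no marker falls through to `Unknown` instead of being guessed.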
§Namespace Extensions
The parser supports common feed extensions:
- iTunes/Podcast (itunes:) - Podcast metadata, categories, explicit flags
- Podcast 2.0 (podcast:) - Transcripts, chapters, funding, persons
- Dublin Core (dc:) - Creator, date, rights, subject
- Media RSS (media:) - Thumbnails, content, descriptions
- Content (content:encoded) - Full HTML content
- Syndication (sy:) - Update frequency hints
- GeoRSS (georss:) - Geographic coordinates
- Creative Commons (cc:, creativeCommons:) - License information
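To make the prefix notation concrete, here is a minimal sketch of how a qualified element name could be split and matched to one of the extension families listed above. Both functions are hypothetical helpers written for this illustration and are not part of feedparser-rs. The sketch matches prefixes literally, which is a simplifying assumption: a namespace-aware parser would normally resolve the namespace URI bound to the prefix instead.

```rust
/// Split a qualified element name such as "itunes:author" into (prefix, local name).
fn split_qname(name: &str) -> (Option<&str>, &str) {
    match name.split_once(':') {
        Some((prefix, local)) => (Some(prefix), local),
        None => (None, name),
    }
}

/// Map a conventional prefix to the extension family it usually denotes.
/// Assumption: prefixes are matched literally, without namespace URI lookup.
fn extension_for_prefix(prefix: &str) -> Option<&'static str> {
    match prefix {
        "itunes" => Some("iTunes/Podcast"),
        "podcast" => Some("Podcast 2.0"),
        "dc" => Some("Dublin Core"),
        "media" => Some("Media RSS"),
        "content" => Some("Content"),
        "sy" => Some("Syndication"),
        "georss" => Some("GeoRSS"),
        "cc" | "creativeCommons" => Some("Creative Commons"),
        _ => None,
    }
}
```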
§Type-Safe URL and MIME Handling
The library uses semantic newtypes for improved type safety:
use feedparser_rs::{Url, MimeType, Email};
// Url - wraps URL strings without validation (bozo-compatible)
let url = Url::new("https://example.com/feed.xml");
assert_eq!(url.as_str(), "https://example.com/feed.xml");
assert!(url.starts_with("https://")); // Deref to str
// MimeType - uses Arc<str> for efficient cloning
let mime = MimeType::new("application/rss+xml");
let clone = mime.clone(); // Cheap: just increments refcount
// Email - wraps email addresses
let email = Email::new("author@example.com");
These types implement Deref<Target=str>, so string methods work directly:
use feedparser_rs::Url;
let url = Url::new("https://example.com/path?query=1");
assert!(url.contains("example.com"));
assert_eq!(url.len(), 32);
§The Bozo Pattern
Following Python feedparser’s philosophy, this library never panics on
malformed input. Instead, it sets the bozo flag and continues parsing:
use feedparser_rs::parse;
// XML with undefined entity - triggers bozo
let xml_with_entity = b"<rss version='2.0'><channel><title>Test </title></channel></rss>";
let feed = parse(xml_with_entity).unwrap();
// Parser handles invalid characters gracefully
assert!(feed.feed.title.is_some());
The bozo flag indicates the feed had issues but was still parseable.
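The idea behind the flag can be shown without the library. The following is a minimal, self-contained sketch of the general pattern: record the problem, keep whatever was recovered, and return normally. `LenientResult`, `bozo_reason`, and `parse_titles_leniently` are hypothetical names for this illustration, and the naive string scanning stands in for a real XML parser; none of this reflects how feedparser-rs is implemented internally.

```rust
/// Hypothetical result type: data recovered so far plus a record of any problem.
#[derive(Debug, Default)]
struct LenientResult {
    bozo: bool,
    bozo_reason: Option<String>,
    titles: Vec<String>,
}

/// Collect the text of every <title>...</title> pair, tolerating damage.
/// Assumption: naive string scanning stands in for a real XML parser.
fn parse_titles_leniently(input: &str) -> LenientResult {
    let mut result = LenientResult::default();
    let mut rest = input;

    while let Some(start) = rest.find("<title>") {
        let after_open = &rest[start + "<title>".len()..];
        match after_open.find("</title>") {
            Some(end) => {
                result.titles.push(after_open[..end].trim().to_string());
                rest = &after_open[end + "</title>".len()..];
            }
            None => {
                // Malformed input: flag it and stop, but keep what was parsed so far.
                result.bozo = true;
                result.bozo_reason = Some("unclosed <title> element".to_string());
                break;
            }
        }
    }

    result
}
```

A caller can then use the recovered titles and decide separately whether a set `bozo` flag is worth logging or rejecting.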
§Resource Limits
Protect against malicious feeds with ParserLimits:
use feedparser_rs::{parse_with_limits, ParserLimits};
// Customize limits for untrusted input
let limits = ParserLimits {
max_entries: 100,
max_text_length: 50_000,
..Default::default()
};
let xml = b"<rss version='2.0'><channel><title>Safe</title></channel></rss>";
let feed = parse_with_limits(xml, limits).unwrap();
§HTTP Fetching
With the http feature (enabled by default), fetch feeds from URLs:
use feedparser_rs::parse_url;
// Simple fetch
let feed = parse_url("https://example.com/feed.xml", None, None, None)?;
// With conditional GET for caching
let feed2 = parse_url(
"https://example.com/feed.xml",
feed.etag.as_deref(), // ETag from previous fetch
feed.modified.as_deref(), // Last-Modified from previous fetch
Some("MyApp/1.0"), // Custom User-Agent
)?;
if feed2.status == Some(304) {
println!("Feed not modified since last fetch");
}
§Core Types
- ParsedFeed - Complete parsed feed with metadata and entries
- FeedMeta - Feed-level metadata (title, link, author, etc.)
- Entry - Individual feed entry/item
- Link, Person, Tag - Common feed elements
- Url, MimeType, Email - Type-safe string wrappers
§Module Structure
Re-exports§
pub use types::Content;
pub use types::Email;
pub use types::Enclosure;
pub use types::Entry;
pub use types::FeedMeta;
pub use types::FeedVersion;
pub use types::Generator;
pub use types::Image;
pub use types::ItunesCategory;
pub use types::ItunesEntryMeta;
pub use types::ItunesFeedMeta;
pub use types::ItunesOwner;
pub use types::LimitedCollectionExt;
pub use types::Link;
pub use types::MediaContent;
pub use types::MediaThumbnail;
pub use types::MimeType;
pub use types::ParsedFeed;
pub use types::Person;
pub use types::PodcastChapters;
pub use types::PodcastEntryMeta;
pub use types::PodcastFunding;
pub use types::PodcastMeta;
pub use types::PodcastPerson;
pub use types::PodcastSoundbite;
pub use types::PodcastTranscript;
pub use types::PodcastValue;
pub use types::PodcastValueRecipient;
pub use types::Source;
pub use types::Tag;
pub use types::TextConstruct;
pub use types::TextType;
pub use types::Url;
pub use types::parse_duration;
pub use types::parse_explicit;
pub use namespace::syndication::SyndicationMeta;
pub use namespace::syndication::UpdatePeriod;
pub use http::FeedHttpClient;
pub use http::FeedHttpResponse;
Modules§
- compat - Compatibility utilities for Python feedparser API
- http - HTTP client module for fetching feeds from URLs
- namespace - Namespace handlers for extended feed formats
- types - Type definitions for feed data structures
- util - Utility functions for feed parsing
Structs§
- ParseOptions - Parser configuration options
- ParserLimits - Parser limits for protecting against denial-of-service attacks
Enums§
- FeedError - Feed parsing errors
- LimitError - Errors that occur when parser limits are exceeded
Functions§
- detect_format - Auto-detect feed format from raw data
- parse - Parse feed from raw bytes
- parse_url - Parse feed from HTTP/HTTPS URL
- parse_url_with_limits - Parse feed from URL with custom parser limits
- parse_with_limits - Parse feed with custom parser limits
Type Aliases§
- Result - Result type for feed parsing operations
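A crate-level result alias is a common Rust convention: it fixes the error type once so that function signatures only name the success type. The sketch below shows the convention in isolation. `SketchError`, `SketchResult`, `check_size`, and `check_all` are hypothetical names for this illustration; the variants of the crate's own `FeedError` and `LimitError` are not listed on this page, so none are assumed here.

```rust
use std::fmt;

/// Hypothetical error enum for this sketch (not the crate's `FeedError`).
#[derive(Debug, PartialEq)]
enum SketchError {
    Empty,
    TooLarge { len: usize, max: usize },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Empty => write!(f, "input is empty"),
            SketchError::TooLarge { len, max } => {
                write!(f, "input is {len} bytes, limit is {max}")
            }
        }
    }
}

impl std::error::Error for SketchError {}

/// Alias that fixes the error type, so signatures only spell out the success type.
type SketchResult<T> = std::result::Result<T, SketchError>;

/// Validate one input against a size limit and return its length.
fn check_size(data: &[u8], max: usize) -> SketchResult<usize> {
    if data.is_empty() {
        return Err(SketchError::Empty);
    }
    if data.len() > max {
        return Err(SketchError::TooLarge { len: data.len(), max });
    }
    Ok(data.len())
}

/// Sum the sizes of several inputs; `?` returns early on the first error.
fn check_all(inputs: &[&[u8]], max: usize) -> SketchResult<usize> {
    let mut total = 0;
    for input in inputs {
        total += check_size(input, max)?;
    }
    Ok(total)
}
```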