RDF parsing utilities for various formats with high-performance streaming
Stability: ✅ Stable - Core parser APIs are production-ready.
This module provides parsers for all major RDF serialization formats:
- Turtle (.ttl) - A compact, human-readable format
- N-Triples (.nt) - Line-based triple format
- TriG (.trig) - Turtle with named graphs
- N-Quads (.nq) - Line-based quad format
- RDF/XML (.rdf, .xml) - XML-based format
- JSON-LD (.jsonld) - JSON-based linked data format
§Features
- Streaming parsers - Process large files without loading them entirely into memory
- Error recovery - Continue parsing after encountering errors (optional)
- Base IRI resolution - Resolve relative IRIs against a base
- Format detection - Automatic format detection from file extensions or content
- Async support - Non-blocking I/O for high-throughput applications
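As a rough illustration of extension-based detection, the sketch below maps a file path to a format using only the extensions listed above. The `Format` enum and `format_from_path` are hypothetical stand-ins written for this example, not part of this module; the module's own entry point for this is `RdfFormat::from_extension`.

```rust
use std::path::Path;

/// Hypothetical stand-in for the module's format enum (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Turtle,
    NTriples,
    TriG,
    NQuads,
    RdfXml,
    JsonLd,
}

/// Maps a path to a format by its final extension, case-insensitively.
/// Returns `None` when there is no extension or it is not in the list above.
fn format_from_path(path: &Path) -> Option<Format> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "ttl" => Some(Format::Turtle),
        "nt" => Some(Format::NTriples),
        "trig" => Some(Format::TriG),
        "nq" => Some(Format::NQuads),
        "rdf" | "xml" => Some(Format::RdfXml),
        "jsonld" => Some(Format::JsonLd),
        _ => None,
    }
}
```

Only the last extension is considered, so a compressed file such as `archive.tar.gz` yields `None` in this sketch.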
§Examples
§Basic Parsing
use oxirs_core::parser::{Parser, RdfFormat};
let turtle_data = r#"
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/alice> foaf:name "Alice" ;
foaf:knows <http://example.org/bob> .
"#;
let parser = Parser::new(RdfFormat::Turtle);
let quads = parser.parse_str_to_quads(turtle_data)?;
println!("Parsed {} quads", quads.len());
§Parsing with Configuration
use oxirs_core::parser::{Parser, RdfFormat, ParserConfig};
let config = ParserConfig {
    base_iri: Some("http://example.org/base/".to_string()),
    ignore_errors: true,
    max_errors: Some(10),
};
let parser = Parser::new(RdfFormat::Turtle).with_config(config);
let quads = parser.parse_str_to_quads("<relative> <p> <o> .")?;
§Format Detection
use oxirs_core::parser::RdfFormat;
// Detect from file extension
let format = RdfFormat::from_extension("ttl");
assert_eq!(format, Some(RdfFormat::Turtle));
// Check format capabilities
assert!(!RdfFormat::Turtle.supports_quads());
assert!(RdfFormat::TriG.supports_quads());
§Streaming Large Files
use oxirs_core::parser::{Parser, RdfFormat};
use std::fs::File;
use std::io::BufReader;
let file = File::open("large_dataset.nt")?;
let reader = BufReader::new(file);
let parser = Parser::new(RdfFormat::NTriples);
for quad in parser.for_reader(reader) {
    let quad = quad?;
    // Process quad without loading entire file into memory
}
§Async Parsing (with async feature)
use oxirs_core::parser::{AsyncStreamingParser, RdfFormat};
let parser = AsyncStreamingParser::new(RdfFormat::Turtle);
let mut sink = parser.parse_stream(tokio::io::stdin()).await?;
while let Some(quad) = sink.next_quad().await? {
    // Process quad asynchronously
}
§Performance Tips
- Use streaming - For large files, use for_reader() to avoid loading everything into memory
- Choose the right format - N-Triples/N-Quads are fastest to parse (line-based)
- Enable async - For I/O-bound workloads, async parsing provides better throughput
- Batch processing - Process multiple files in parallel using rayon
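The batch-processing tip can be sketched with the standard library alone. This example uses `std::thread::scope` instead of rayon so that it has no external dependencies, and `count_statements` is a hypothetical stand-in for a real parse call: it only counts non-empty, non-comment lines while streaming from a reader. Inputs are in-memory strings here; in practice each would be an opened file.

```rust
use std::io::{BufRead, BufReader, Read};
use std::thread;

/// Hypothetical stand-in for parsing: streams line by line and counts lines
/// that are neither empty nor comments (starting with '#').
fn count_statements<R: Read>(reader: R) -> std::io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(reader).lines() {
        let line = line?;
        let trimmed = line.trim();
        if !trimmed.is_empty() && !trimmed.starts_with('#') {
            count += 1;
        }
    }
    Ok(count)
}

/// Processes every input on its own scoped thread and returns the
/// per-input results in the same order as `inputs`.
fn count_all_parallel(inputs: &[&str]) -> Vec<std::io::Result<usize>> {
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = inputs
            .iter()
            .map(|data| scope.spawn(move || count_statements(data.as_bytes())))
            .collect();
        // Then join them in order to preserve the input ordering.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<_>>()
    })
}
```

One thread per input is fine for a handful of files; for large batches a bounded pool (which is what rayon provides) is likely the better choice.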
§Error Handling
Parsers can be configured to handle errors in different ways:
- Strict mode (default) - Stop on first error
- Error recovery - Collect errors and continue parsing
- Max errors - Stop after a threshold of errors
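To make the three modes concrete, here is a minimal standalone sketch of how such a policy might drive a line-based parsing loop. `ErrorPolicy`, `Outcome`, `looks_like_statement`, and `run` are all hypothetical names invented for this example, and the validity check is a toy, not a real N-Triples grammar; the module's actual behaviour is controlled through `ParserConfig`.

```rust
/// Hypothetical policy mirroring the three modes listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrorPolicy {
    /// Stop on the first error.
    Strict,
    /// Collect every error and keep going.
    Recover,
    /// Collect errors and stop once this many have been seen.
    MaxErrors(usize),
}

/// Result of a run: accepted statements, collected errors, early-stop flag.
#[derive(Debug, PartialEq)]
struct Outcome {
    accepted: usize,
    errors: Vec<String>,
    aborted: bool,
}

/// Toy check (an assumption, not a real grammar): the line must end with '.'
/// and have at least three whitespace-separated terms before it.
fn looks_like_statement(line: &str) -> bool {
    match line.trim().strip_suffix('.') {
        Some(body) => body.split_whitespace().count() >= 3,
        None => false,
    }
}

/// Walks the input line by line and applies the chosen policy on each error.
fn run(input: &str, policy: ErrorPolicy) -> Outcome {
    let mut outcome = Outcome { accepted: 0, errors: Vec::new(), aborted: false };
    for (idx, line) in input.lines().enumerate() {
        let trimmed = line.trim();
        if trimmed.is_empty() || trimmed.starts_with('#') {
            continue; // skip blank lines and comments
        }
        if looks_like_statement(trimmed) {
            outcome.accepted += 1;
            continue;
        }
        outcome.errors.push(format!("line {}: malformed statement", idx + 1));
        let stop = match policy {
            ErrorPolicy::Strict => true,
            ErrorPolicy::Recover => false,
            ErrorPolicy::MaxErrors(limit) => outcome.errors.len() >= limit,
        };
        if stop {
            outcome.aborted = true;
            break;
        }
    }
    outcome
}
```

In this sketch, strict mode still records the single error it stopped on, so callers can report it; recovery mode never sets `aborted`.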
§Format Support Matrix
| Format | Triples | Quads | Prefixes | Comments | Streaming |
|---|---|---|---|---|---|
| Turtle | ✅ | ❌ | ✅ | ✅ | ✅ |
| N-Triples | ✅ | ❌ | ❌ | ✅ | ✅ |
| TriG | ✅ | ✅ | ✅ | ✅ | ✅ |
| N-Quads | ✅ | ✅ | ❌ | ✅ | ✅ |
| RDF/XML | ✅ | ❌ | ✅ | ✅ | ✅ |
| JSON-LD | ✅ | ✅ | ✅ | ❌ | ✅ |
§Related Modules
- crate::serializer - Serialize RDF to various formats
- crate::model - RDF data model types
- crate::rdf_store - Store parsed RDF data
Structs§
- AsyncStreamingParser - Async RDF streaming parser for high-performance large file processing
- MemoryAsyncSink - Memory-based async sink that collects quads
- ParseProgress - Progress information for async parsing
- Parser - RDF parser interface
- ParserConfig - Configuration for RDF parsing
Enums§
- RdfFormat - RDF format enumeration
Traits§
- AsyncRdfSink - Async streaming sink for writing parsed RDF data
Functions§
- detect_format_from_content - Convenience function to detect RDF format from content
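The rules that detect_format_from_content applies are not described on this page. Purely as an illustration of what content-based detection can look like, the sketch below uses a few rough heuristics; `GuessedFormat` and `guess_format` are hypothetical names, and the checks are assumptions that will misclassify some valid documents (for example, a TriG file that opens with `{`).

```rust
/// Hypothetical result type for this sketch (not the module's `RdfFormat`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GuessedFormat {
    Turtle,
    NTriples,
    RdfXml,
    JsonLd,
}

/// Rough, assumption-laden heuristics for guessing a format from content.
fn guess_format(content: &str) -> Option<GuessedFormat> {
    let text = content.trim_start();

    // An XML declaration or an rdf:RDF root element suggests RDF/XML.
    if text.starts_with("<?xml") || text.starts_with("<rdf:RDF") {
        return Some(GuessedFormat::RdfXml);
    }
    // A leading '{' is treated as a JSON object, hence JSON-LD.
    if text.starts_with('{') {
        return Some(GuessedFormat::JsonLd);
    }
    // Turtle-style directives at the start of any line suggest Turtle.
    let has_directive = text
        .lines()
        .map(str::trim_start)
        .any(|line| line.starts_with("@prefix") || line.starts_with("@base"));
    if has_directive {
        return Some(GuessedFormat::Turtle);
    }
    // If every statement line starts with an IRI or blank node and ends
    // with '.', treat the content as N-Triples.
    let mut statements = text
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .peekable();
    if statements.peek().is_some()
        && statements.all(|line| {
            (line.starts_with('<') || line.starts_with("_:")) && line.ends_with('.')
        })
    {
        return Some(GuessedFormat::NTriples);
    }
    None
}
```

Because such guesses can be wrong, an extension-based hint or an explicitly chosen format is generally the safer input when one is available.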