Document Parser Extension Point
DocumentParser is a core extension point that lets users add custom file
format support to agentic tools (agentic_search, agentic_parse, etc.),
covering binary and structured formats such as PDF, Excel, and Word.
§Architecture
- Core: DocumentParser trait + DocumentParserRegistry live here
- Default: PlainTextParser covers all common text/code formats
- Plugins: agentic-search and agentic-parse use this registry via ToolContext
- Custom: Users register additional parsers via SessionOptions
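The registry idea above can be illustrated with a minimal, std-only sketch of extension-based dispatch. Everything in it is an assumption made for illustration: the `Parser` trait and `Registry` struct are hypothetical stand-ins rather than the crate's actual `DocumentParser` and `DocumentParserRegistry`, the error type is a plain `String` instead of `anyhow::Result`, and the case-insensitive lookup and "later registration wins" behavior are guesses, not documented guarantees.

```rust
use std::collections::HashMap;
use std::path::Path;
use std::sync::Arc;

// Hypothetical stand-in for the DocumentParser trait (String errors
// replace anyhow::Result so the sketch needs no external crates).
trait Parser {
    fn name(&self) -> &str;
    fn supported_extensions(&self) -> &[&str];
    fn parse(&self, path: &Path) -> Result<String, String>;
}

// Hypothetical stand-in for the registry: maps a lowercased file
// extension to the parser that handles it.
struct Registry {
    by_extension: HashMap<String, Arc<dyn Parser>>,
}

impl Registry {
    fn new() -> Self {
        Registry { by_extension: HashMap::new() }
    }

    // Registers the parser under every extension it declares.
    // Assumption: a later registration overrides an earlier one.
    fn register(&mut self, parser: Arc<dyn Parser>) {
        for ext in parser.supported_extensions() {
            self.by_extension
                .insert(ext.to_ascii_lowercase(), Arc::clone(&parser));
        }
    }

    // Looks up a parser by the path's extension; None if the path has
    // no extension or no parser claims it.
    fn find(&self, path: &Path) -> Option<Arc<dyn Parser>> {
        let ext = path.extension()?.to_str()?.to_ascii_lowercase();
        self.by_extension.get(&ext).cloned()
    }
}

// Plays the role of the built-in text parser in this sketch.
struct PlainText;

impl Parser for PlainText {
    fn name(&self) -> &str { "plain-text" }
    fn supported_extensions(&self) -> &[&str] { &["txt", "md", "rs"] }
    fn parse(&self, path: &Path) -> Result<String, String> {
        std::fs::read_to_string(path).map_err(|e| e.to_string())
    }
}

// Plays the role of a user-supplied parser; extraction is left out.
struct PdfStub;

impl Parser for PdfStub {
    fn name(&self) -> &str { "pdf" }
    fn supported_extensions(&self) -> &[&str] { &["pdf"] }
    fn parse(&self, _path: &Path) -> Result<String, String> {
        Err("PDF extraction is not implemented in this sketch".to_string())
    }
}
```

With both parsers registered, `find` on `report.PDF` would return the `pdf` parser, `notes.md` the plain-text one, and `archive.zip` or an extensionless `Makefile` would return `None`. How the real registry treats unknown extensions is not stated here, so that last behavior is only this sketch's choice.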
§Example
use a3s_code_core::document_parser::{DocumentParser, DocumentParserRegistry};
use std::path::Path;
use anyhow::Result;

struct PdfParser;

impl DocumentParser for PdfParser {
    fn name(&self) -> &str { "pdf" }
    fn supported_extensions(&self) -> &[&str] { &["pdf"] }
    fn parse(&self, path: &Path) -> Result<String> {
        // e.g. pdf_extract::extract_text(path)
        todo!()
    }
}

let mut registry = DocumentParserRegistry::new();
registry.register(std::sync::Arc::new(PdfParser));

Structs§
- DocumentParserRegistry: Registry that maps file extensions to DocumentParser implementations.
- PlainTextParser: Built-in parser for all common text, code, and config formats.
Traits§
- DocumentParser: Extension point for custom file format parsing.
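To make the extension point more concrete, here is a hedged sketch of a complete parser for a simple structured format. It assumes a trait with the same three methods as the example above; the `Parser` trait below is a hypothetical std-only stand-in (with `String` errors instead of `anyhow::Result`), and the TSV format, `TsvParser`, and `flatten_tsv` are invented purely for illustration.

```rust
use std::fs;
use std::path::Path;

// Hypothetical stand-in mirroring the three methods of DocumentParser.
trait Parser {
    fn name(&self) -> &str;
    fn supported_extensions(&self) -> &[&str];
    fn parse(&self, path: &Path) -> Result<String, String>;
}

// Flattens tab-separated rows into "a | b | c" lines so a text-oriented
// tool can read them. Blank lines are skipped and cells are trimmed.
fn flatten_tsv(raw: &str) -> String {
    raw.lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            line.split('\t')
                .map(str::trim)
                .collect::<Vec<_>>()
                .join(" | ")
        })
        .collect::<Vec<_>>()
        .join("\n")
}

struct TsvParser;

impl Parser for TsvParser {
    fn name(&self) -> &str { "tsv" }
    fn supported_extensions(&self) -> &[&str] { &["tsv"] }
    fn parse(&self, path: &Path) -> Result<String, String> {
        // Report I/O failures with the offending path instead of panicking.
        let raw = fs::read_to_string(path)
            .map_err(|e| format!("failed to read {}: {}", path.display(), e))?;
        Ok(flatten_tsv(&raw))
    }
}
```

Keeping the text transformation in a pure function (`flatten_tsv`) separate from file I/O makes the parser easy to unit-test without touching the filesystem; the same split would likely suit a PDF or spreadsheet parser that wraps a third-party extraction crate.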