Module document_parser

Document parser types used by A3S Code’s context acquisition pipeline.

These types exist so agentic_search, agentic_parse, and session wiring can register a small set of document parsers when better context extraction is needed.

They are not intended to turn a3s-code-core into a general-purpose document processing framework.

§Architecture

Contracts: parser trait and registry live in crate::doc
Core defaults: PlainTextParser plus the internal composite parser factory live here
Built-in tools: agentic_search and agentic_parse consume this registry via ToolContext
Goal: recover better model context from non-plaintext project files

§Example

use a3s_code_core::document_parser::{DocumentParser, DocumentParserRegistry};
use anyhow::Result;
use std::path::Path;
use std::sync::Arc;

struct PdfParser;

impl DocumentParser for PdfParser {
    fn name(&self) -> &str { "pdf" }
    fn supported_extensions(&self) -> &[&str] { &["pdf"] }
    fn parse(&self, _path: &Path) -> Result<String> {
        // Placeholder: extract the document's text here and return it.
        todo!()
    }
}

// Start from an empty registry and register the custom parser.
let mut registry = DocumentParserRegistry::empty();
registry.register(Arc::new(PdfParser));
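
To make the registration step more concrete, the following is a minimal, self-contained sketch of how a registry of this shape could dispatch on file extension. It does not reproduce the crate's implementation: the trait and the `Registry` type below are hypothetical stand-ins that mirror only the method names shown in the example above, the `find` method and the case-insensitive extension matching are assumptions, and `Result<String, String>` replaces `anyhow::Result` to avoid external dependencies.

```rust
use std::collections::HashMap;
use std::path::Path;
use std::sync::Arc;

// Hypothetical stand-in for the crate's DocumentParser trait. The real trait
// returns anyhow::Result<String> and may declare additional methods.
trait DocumentParser: Send + Sync {
    fn name(&self) -> &str;
    fn supported_extensions(&self) -> &[&str];
    fn parse(&self, path: &Path) -> Result<String, String>;
}

// Hypothetical stand-in for DocumentParserRegistry: maps lowercase file
// extensions to the parser registered for them.
#[derive(Default)]
struct Registry {
    by_extension: HashMap<String, Arc<dyn DocumentParser>>,
}

impl Registry {
    fn empty() -> Self {
        Self::default()
    }

    // Index the parser under every extension it claims to support.
    // A later registration for the same extension replaces the earlier one.
    fn register(&mut self, parser: Arc<dyn DocumentParser>) {
        for ext in parser.supported_extensions() {
            self.by_extension
                .insert(ext.to_ascii_lowercase(), Arc::clone(&parser));
        }
    }

    // Assumed lookup helper: pick a parser from the path's extension,
    // ignoring ASCII case. Returns None for unknown or missing extensions.
    fn find(&self, path: &Path) -> Option<Arc<dyn DocumentParser>> {
        let ext = path.extension()?.to_str()?.to_ascii_lowercase();
        self.by_extension.get(&ext).cloned()
    }
}

// Stub parser that only echoes the path; a real parser would read the file.
struct StubPdfParser;

impl DocumentParser for StubPdfParser {
    fn name(&self) -> &str { "pdf" }
    fn supported_extensions(&self) -> &[&str] { &["pdf"] }
    fn parse(&self, path: &Path) -> Result<String, String> {
        Ok(format!("[stub text extracted from {}]", path.display()))
    }
}

// Build a registry containing only the stub PDF parser.
fn build_registry() -> Registry {
    let mut registry = Registry::empty();
    registry.register(Arc::new(StubPdfParser));
    registry
}
```

In this sketch, `build_registry().find(Path::new("report.pdf"))` yields the stub parser, while a path with an unregistered extension yields `None`, leaving the caller to decide on a fallback. Storing parsers behind `Arc<dyn DocumentParser>` lets one parser instance serve several extensions without cloning the parser itself.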

Structs§

DocumentBlock
DocumentBlockLocation
DocumentConfidence
DocumentMetadata
DocumentParserRegistry
DocumentProvenance
ParsedDocument
PlainTextParser: Built-in parser for all common text, code, and config formats.

Enums§

DocumentBlockKind

Traits§

DocumentParser

Functions§

default_document_parser_registry: Build the default document parser registry using the default parser config.
document_parser_registry_with_config: Build the default document parser registry using an explicit parser config.
document_parser_registry_with_config_and_ocr: Build the default document parser registry using an explicit parser config and OCR provider.
