Module loader

Document loading abstraction for pluggable file format support.

The DocumentLoader trait decouples file format handling from the RAG pipeline. Built-in loaders handle text (.txt, .md) and subtitle (.srt, .vtt) formats. Third parties can implement DocumentLoader for any format.
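As a rough illustration of what a third-party loader could look like, the sketch below defines local stand-ins for `Document` and `DocumentLoader` and implements a naive CSV loader against them. The trait methods shown here (`extensions`, `load_str`), the `Document` fields, and `CsvLoader` itself are all assumptions made for this example; the real trait in this crate may have different method names and signatures, so check the `DocumentLoader` page before implementing it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `Document` type (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub content: String,
    pub metadata: HashMap<String, String>,
}

/// Hypothetical stand-in for the `DocumentLoader` trait.
/// The real trait's methods and signatures may differ.
pub trait DocumentLoader {
    /// File extensions (without the dot) that this loader handles.
    fn extensions(&self) -> &[&'static str];
    /// Parse already-read file contents into documents.
    fn load_str(&self, source: &str, input: &str) -> Result<Vec<Document>, String>;
}

/// Example third-party loader: one document per non-empty CSV row.
/// Splits naively on commas and does not handle quoted fields.
pub struct CsvLoader;

impl DocumentLoader for CsvLoader {
    fn extensions(&self) -> &[&'static str] {
        &["csv"]
    }

    fn load_str(&self, source: &str, input: &str) -> Result<Vec<Document>, String> {
        // Skip blank lines; the first remaining line is the header.
        let mut lines = input.lines().filter(|l| !l.trim().is_empty());
        let header: Vec<&str> = lines
            .next()
            .ok_or_else(|| format!("{source}: empty CSV"))?
            .split(',')
            .map(str::trim)
            .collect();

        let mut docs = Vec::new();
        for (idx, line) in lines.enumerate() {
            let row = idx + 1; // 1-based data row number
            let cells: Vec<&str> = line.split(',').map(str::trim).collect();
            if cells.len() != header.len() {
                return Err(format!(
                    "{source}: row {row} has {} cells, expected {}",
                    cells.len(),
                    header.len()
                ));
            }
            // Render the row as "column: value" lines so it reads as text.
            let content = header
                .iter()
                .zip(&cells)
                .map(|(h, c)| format!("{h}: {c}"))
                .collect::<Vec<_>>()
                .join("\n");
            let mut metadata = HashMap::new();
            metadata.insert("source".to_string(), source.to_string());
            metadata.insert("row".to_string(), row.to_string());
            docs.push(Document { content, metadata });
        }
        Ok(docs)
    }
}
```

Keeping the format-specific parsing behind a small trait like this is what lets the rest of the pipeline treat every file as a list of documents plus metadata, whatever the format.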

The LoaderRegistry dispatches loading to the appropriate loader based on file extension, with support for sidecar subtitle files adjacent to media files.
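To make the dispatch idea concrete, here is a minimal, self-contained sketch of extension-based lookup with a sidecar fallback. It is not the crate's `LoaderRegistry`: `MiniRegistry`, `LoaderKind`, `resolve`, the list of media extensions, and the rule that a sidecar shares the media file's stem are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical loader kinds, standing in for boxed `DocumentLoader` objects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoaderKind {
    Text,
    Subtitle,
}

/// Minimal extension-keyed registry (illustrative only).
pub struct MiniRegistry {
    by_ext: HashMap<&'static str, LoaderKind>,
    /// Assumed set of media extensions that may have a subtitle sidecar.
    media_exts: Vec<&'static str>,
}

impl MiniRegistry {
    pub fn new() -> Self {
        let mut by_ext = HashMap::new();
        for ext in ["txt", "md"] {
            by_ext.insert(ext, LoaderKind::Text);
        }
        for ext in ["srt", "vtt"] {
            by_ext.insert(ext, LoaderKind::Subtitle);
        }
        Self {
            by_ext,
            media_exts: vec!["mp4", "mkv", "mp3", "wav"],
        }
    }

    /// Resolve `path` to a loader kind and the file that should be read.
    /// `exists` is injected so the lookup can be tested without touching disk.
    pub fn resolve(
        &self,
        path: &Path,
        exists: impl Fn(&Path) -> bool,
    ) -> Option<(LoaderKind, PathBuf)> {
        // Compare extensions case-insensitively.
        let ext = path.extension()?.to_str()?.to_ascii_lowercase();

        // Direct hit: the file's own extension has a registered loader.
        if let Some(kind) = self.by_ext.get(ext.as_str()) {
            return Some((*kind, path.to_path_buf()));
        }

        // Sidecar fallback: for a media file, look for a subtitle file
        // with the same stem in the same directory.
        if self.media_exts.iter().any(|m| *m == ext.as_str()) {
            for sub_ext in ["srt", "vtt"] {
                let candidate = path.with_extension(sub_ext);
                if exists(&candidate) {
                    return Some((LoaderKind::Subtitle, candidate));
                }
            }
        }
        None
    }
}
```

In real use the predicate would typically be `|p| p.exists()`. With that, `talks/intro.mp4` would resolve to `talks/intro.srt` when that file is present, then to `talks/intro.vtt`, and to nothing if neither sidecar exists.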

Example

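use aprender_rag::loader::LoaderRegistry;

// A new registry comes with the built-in loaders registered.
let registry = LoaderRegistry::new();

// The built-in text and subtitle extensions are reported as supported.
let extensions = registry.supported_extensions();
assert!(extensions.contains(&"txt"));
assert!(extensions.contains(&"srt"));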

Re-exports

pub use transcription::TranscriptionLoader;

Modules

transcription
Feature-gated transcription loader using whisper-apr for speech-to-text.

Structs

ImageLoader
Loads image files by extracting text via Tesseract OCR.
LoaderRegistry
Registry that dispatches file loading to the appropriate DocumentLoader.
SubtitleLoader
Loads subtitle files and produces Documents with timestamp metadata.
TextLoader
Loads plain text and Markdown files.

Traits

DocumentLoader
Abstraction for loading files of any format into Documents.