orbok-extract 0.16.0

orbok document extraction pipeline: extractor trait, normalization, text/markdown/pdf/docx/html extractors (RFC-005, RFC-044)

orbok-extract

Text extraction (RFC-005): pluggable extractors turn boundary-validated source files into normalized, location-tagged segments. Extraction output is derived data: cacheable, rebuildable, never authoritative.
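As a rough illustration of the pluggable-extractor idea, the sketch below defines a small trait, one plain-text implementation, and a dispatcher that picks the first extractor supporting a path. Every name here (`Extractor`, `Segment`, `Location`, `PlainText`, `extract_with`) is hypothetical and chosen for the example; none of it is taken from the actual orbok-extract API, whose trait and segment types may look quite different.

```rust
// Hypothetical sketch: these types are NOT the orbok-extract API.
use std::path::Path;

/// Where a segment came from in the source (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum Location {
    /// 1-based line number in the source file.
    Line(usize),
}

/// A normalized piece of text tagged with its origin.
#[derive(Debug, Clone, PartialEq)]
struct Segment {
    text: String,
    location: Location,
}

/// A pluggable extractor: decides whether it handles a file, then
/// turns raw bytes into segments.
trait Extractor {
    fn supports(&self, path: &Path) -> bool;
    fn extract(&self, bytes: &[u8]) -> Vec<Segment>;
}

/// Plain-text extractor: one segment per non-empty line.
struct PlainText;

impl Extractor for PlainText {
    fn supports(&self, path: &Path) -> bool {
        path.extension().and_then(|e| e.to_str()) == Some("txt")
    }

    fn extract(&self, bytes: &[u8]) -> Vec<Segment> {
        // Invalid UTF-8 is replaced rather than rejected in this sketch.
        let text = String::from_utf8_lossy(bytes);
        text.lines()
            .enumerate()
            .filter_map(|(i, line)| {
                // Normalization here just collapses runs of whitespace.
                let normalized = line.split_whitespace().collect::<Vec<_>>().join(" ");
                if normalized.is_empty() {
                    None
                } else {
                    Some(Segment {
                        text: normalized,
                        location: Location::Line(i + 1),
                    })
                }
            })
            .collect()
    }
}

/// Runs the first extractor that claims the path, if any.
fn extract_with(
    extractors: &[Box<dyn Extractor>],
    path: &Path,
    bytes: &[u8],
) -> Option<Vec<Segment>> {
    extractors
        .iter()
        .find(|e| e.supports(path))
        .map(|e| e.extract(bytes))
}

fn main() {
    let extractors: Vec<Box<dyn Extractor>> = vec![Box::new(PlainText)];
    let input = b"hello   world\n\nsecond line\n";
    if let Some(segments) = extract_with(&extractors, Path::new("notes.txt"), input) {
        for s in &segments {
            println!("{:?}: {}", s.location, s.text);
        }
    }
}
```

Because the segments are computed purely from the input bytes, they can be discarded and rebuilt at any time, which is what makes treating them as cacheable derived data reasonable.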

RFC-044 hardening adds: resource limits (ExtractLimits), structured warnings (ExtractWarning), panic isolation (extract_safely), explicit location semantics (LocationKind), and removal of the orbok-db production dependency (chunker now produces ExtractedChunk; the pipeline layer maps to ChunkSpec).
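To make the hardening ideas concrete, here is a minimal sketch of how resource limits, structured warnings, and panic isolation can fit together, using `std::panic::catch_unwind` from the standard library. The names `SketchLimits`, `SketchWarning`, `SketchError`, and `run_isolated` are invented for this example; the fields and signatures of the crate's own `ExtractLimits`, `ExtractWarning`, and `extract_safely` are not shown in this document and may differ.

```rust
// Hypothetical sketch: names and fields are assumptions, not the crate's API.
use std::panic::{self, AssertUnwindSafe};

/// Illustrative resource limits.
#[derive(Debug, Clone, Copy)]
struct SketchLimits {
    max_input_bytes: usize,
    max_segments: usize,
}

/// Non-fatal conditions reported alongside a successful result.
#[derive(Debug, PartialEq)]
enum SketchWarning {
    SegmentsTruncated { kept: usize, dropped: usize },
}

/// Fatal conditions.
#[derive(Debug, PartialEq)]
enum SketchError {
    InputTooLarge { size: usize, limit: usize },
    Panicked(String),
}

/// Runs `extractor` with a size check up front and a panic boundary around it.
fn run_isolated<F>(
    bytes: &[u8],
    limits: SketchLimits,
    extractor: F,
) -> Result<(Vec<String>, Vec<SketchWarning>), SketchError>
where
    F: FnOnce(&[u8]) -> Vec<String>,
{
    // Reject oversized input before doing any work.
    if bytes.len() > limits.max_input_bytes {
        return Err(SketchError::InputTooLarge {
            size: bytes.len(),
            limit: limits.max_input_bytes,
        });
    }

    // Convert a panic inside the extractor into an ordinary error value.
    let mut segments = panic::catch_unwind(AssertUnwindSafe(|| extractor(bytes)))
        .map_err(|payload| {
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            SketchError::Panicked(msg)
        })?;

    // Exceeding the segment cap is a warning, not a failure.
    let mut warnings = Vec::new();
    if segments.len() > limits.max_segments {
        let dropped = segments.len() - limits.max_segments;
        segments.truncate(limits.max_segments);
        warnings.push(SketchWarning::SegmentsTruncated {
            kept: limits.max_segments,
            dropped,
        });
    }
    Ok((segments, warnings))
}

fn main() {
    let limits = SketchLimits { max_input_bytes: 16, max_segments: 2 };

    let result = run_isolated(b"a b c", limits, |b| {
        String::from_utf8_lossy(b).split(' ').map(str::to_owned).collect()
    });
    println!("{:?}", result);

    // A panicking extractor surfaces as an error instead of unwinding further.
    let failed = run_isolated(b"x", limits, |_| -> Vec<String> { panic!("bad input") });
    println!("{:?}", failed);
}
```

Note that `catch_unwind` only catches unwinding panics; in a binary built with `panic = "abort"` the process still terminates, so this kind of isolation depends on the build profile.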