§oximedia-caption-gen
Advanced caption and subtitle generation for the OxiMedia Sovereign Media Framework.
This crate provides speech-to-caption alignment with frame-accurate timing, greedy and optimal (Knuth-Plass-inspired DP) line-breaking algorithms, WCAG 2.1 accessibility compliance checking, and speaker diarization metadata with crosstalk detection, all in pure Rust.
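To make "frame-accurate timing" concrete, here is a minimal sketch of snapping a millisecond timestamp to the nearest frame boundary. The function name `snap_ms_to_frame` and its signature are hypothetical and chosen for illustration only; this is not the crate's `align_to_frames` API, whose actual signature is not shown on this page. The frame rate is passed as a rational number so that rates such as 30000/1001 can be represented exactly.

```rust
/// Hypothetical helper (not part of oximedia-caption-gen): snap a timestamp
/// in milliseconds to the nearest frame boundary of a video running at
/// `fps_num / fps_den` frames per second.
///
/// Returns `(frame_index, snapped_ms)`.
fn snap_ms_to_frame(ms: u64, fps_num: u64, fps_den: u64) -> (u64, u64) {
    // One frame lasts (1000 * fps_den) / fps_num milliseconds, so the
    // frame index is ms * fps_num / (1000 * fps_den), rounded to nearest.
    let denom = 1000 * fps_den;
    let frame = (ms * fps_num + denom / 2) / denom;

    // Convert the frame index back to milliseconds, again rounding to nearest.
    let snapped_ms = (frame * denom + fps_num / 2) / fps_num;
    (frame, snapped_ms)
}
```

Under these assumptions, `snap_ms_to_frame(1010, 25, 1)` yields frame 25 at 1000 ms, because 1010 ms is closer to the boundary at 1000 ms than to the next one at 1040 ms. Integer arithmetic is used throughout to avoid floating-point drift over long timelines.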
§Modules
- alignment: Word timestamps, transcript segments, segment merging/splitting, frame alignment, and caption block construction.
- autopunct: Deterministic auto-punctuation and sentence capitalisation.
- burn_in: Burned-in subtitle rendering onto raw RGBA video frames using a built-in 8×12 bitmap font.
- caption_diff: Compare two caption tracks and report differences.
- caption_format_adapter: Serialize caption tracks to SRT/VTT/TTML.
- caption_style_guide: Style guide rule enforcement over caption tracks.
- caption_timing_adjuster: Shift, stretch, snap, and EDL-remap caption timestamps.
- diarization: Speaker metadata, turn merging, per-speaker statistics, crosstalk detection, voice activity ratio, and speaker-to-caption assignment.
- forced_narrative: Forced narrative (FN) and SDH subtitle detection and classification.
- language_detect: Byte-trigram language detection for locale-aware line-breaking.
- line_breaking: Greedy and optimal line-breaking, reading-speed helpers (CPS), and line-balance optimisation.
- multi_language: Bilingual caption layout (primary + secondary language).
- multi_language_sync: Anchor-point synchronisation of multi-language caption tracks.
- multilang: Multi-language subtitle support with ISO 639-1 validated language codes, SRT export, and cross-language timing merge.
- phoneme_timing: Phoneme-level timing estimation from word timestamps.
- profanity: Configurable profanity filter for caption text.
- punctuation_restoration: Rule-based punctuation restoration for raw ASR output.
- reading_speed: Caption reading-speed validation (WPS-based).
- style_generator: Font size, position, and colour suggestions based on video frame analysis.
- style_presets: Ready-made caption style configs (Netflix, BBC, WCAG).
- translate: Stub subtitle translation pipeline.
- wcag: WCAG 2.1 compliance checks (1.2.2, 1.2.4, 1.2.6), reading speed validation, minimum display duration, gap detection, and compliance scoring.
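The line_breaking entry above mentions greedy line-breaking and reading-speed helpers based on characters per second (CPS). The sketch below illustrates both ideas in isolation. The names `greedy_wrap` and `chars_per_second` are hypothetical and are not the crate's `greedy_break` or `compute_cps`, whose signatures are not documented on this page; the per-line character limit and millisecond timestamps are likewise assumptions.

```rust
/// Hypothetical greedy line-breaker (not the crate's `greedy_break`):
/// pack as many whitespace-separated words as fit into each line of at
/// most `max_chars` characters. A single word longer than `max_chars`
/// is placed on its own line without being split.
fn greedy_wrap(text: &str, max_chars: usize) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut current_len = 0usize; // length of `current` in chars, not bytes

    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        if current.is_empty() {
            current.push_str(word);
            current_len = word_len;
        } else if current_len + 1 + word_len <= max_chars {
            // The word fits on the current line after a separating space.
            current.push(' ');
            current.push_str(word);
            current_len += 1 + word_len;
        } else {
            // The word does not fit: close the current line, start a new one.
            lines.push(std::mem::take(&mut current));
            current.push_str(word);
            current_len = word_len;
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Hypothetical CPS helper (not the crate's `compute_cps`): characters
/// per second for a caption shown from `start_ms` to `end_ms`.
/// Returns `None` when the display interval is empty or negative.
fn chars_per_second(text: &str, start_ms: u64, end_ms: u64) -> Option<f64> {
    if end_ms <= start_ms {
        return None;
    }
    let chars = text.chars().count() as f64;
    Some(chars * 1000.0 / (end_ms - start_ms) as f64)
}
```

With a 16-character limit, `greedy_wrap("the quick brown fox jumps over the lazy dog", 16)` produces the three lines "the quick brown", "fox jumps over", and "the lazy dog". A greedy pass never revisits earlier lines, which is why an optimal (dynamic-programming) breaker can produce better-balanced lines at higher cost.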
Re-exports§
- pub use alignment::align_to_frames;
- pub use alignment::build_caption_blocks;
- pub use alignment::merge_short_segments;
- pub use alignment::split_long_segments;
- pub use alignment::AlignmentError;
- pub use alignment::CaptionBlock;
- pub use alignment::CaptionPosition;
- pub use alignment::TranscriptSegment;
- pub use alignment::WordTimestamp;
- pub use diarization::assign_speakers_to_blocks;
- pub use diarization::dominant_speaker;
- pub use diarization::format_speaker_label;
- pub use diarization::merge_consecutive_turns;
- pub use diarization::speaker_stats;
- pub use diarization::voice_activity_ratio;
- pub use diarization::CrosstalkDetector;
- pub use diarization::DiarizationResult;
- pub use diarization::Speaker;
- pub use diarization::SpeakerGender;
- pub use diarization::SpeakerStats;
- pub use diarization::SpeakerTurn;
- pub use line_breaking::compute_cps;
- pub use line_breaking::greedy_break;
- pub use line_breaking::optimal_break;
- pub use line_breaking::reading_speed_ok;
- pub use line_breaking::rebalance_lines;
- pub use line_breaking::LineBalance;
- pub use line_breaking::LineBreakAlgorithm;
- pub use line_breaking::LineBreakConfig;
- pub use wcag::check_caption_coverage;
- pub use wcag::check_cps;
- pub use wcag::check_live_latency;
- pub use wcag::check_min_duration;
- pub use wcag::check_sign_language;
- pub use wcag::compliance_score;
- pub use wcag::run_all_checks;
- pub use wcag::WcagChecker;
- pub use wcag::WcagLevel;
- pub use wcag::WcagViolation;
- pub use burn_in::BurnInConfig;
- pub use burn_in::SubtitleBurnIn;
- pub use burn_in::SubtitlePosition;
- pub use multilang::CaptionEntry;
- pub use multilang::LanguageCode;
- pub use multilang::MultiLangCaption;
- pub use multilang::MultiLangCaptionBuilder;
Modules§
- alignment
- Speech-to-caption alignment: word timestamps, segment merging/splitting, frame-accurate caption block construction.
- autopunct
- Auto-punctuation for caption text.
- burn_in
- Burned-in subtitle rendering onto raw RGBA video frames.
- caption_diff
- Caption diff: compare two caption tracks and report differences.
- caption_format_adapter
- Caption format adapter: serialize CaptionBlock tracks to SRT, WebVTT, and TTML output strings.
- caption_style_guide
- Caption style guide rule enforcement.
- caption_timing_adjuster
- Caption timing adjustment: shift and stretch caption timecodes to match edited or re-timed video content.
- diarization
- Speaker diarization metadata: speaker turns, statistics, crosstalk detection, and assigning speakers to caption blocks.
- forced_narrative
- Forced narrative (FN) and SDH (Subtitles for the Deaf and Hard-of-hearing) subtitle detection and classification.
- language_detect
- Language detection for caption transcript text.
- line_breaking
- Caption line-breaking algorithms: greedy, optimal (Knuth-Plass-inspired DP), reading-speed helpers, and line-balance optimisation.
- multi_language
- Multi-language / bilingual caption layout.
- multi_language_sync
- Multi-language caption synchronization.
- multilang
- Multi-language subtitle support with ISO 639-1 validated language codes.
- phoneme_timing
- Phoneme-level timing alignment for caption display.
- profanity
- Profanity filter for caption text.
- punctuation_restoration
- Punctuation restoration for raw ASR transcript output.
- reading_speed
- Caption reading-speed validation.
- style_generator
- Style suggestion engine for caption rendering.
- style_presets
- Caption style presets for popular broadcast standards.
- translate
- Subtitle translation pipeline.
- wcag
- WCAG 2.1 accessibility compliance checks for caption blocks.
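Several of the modules above serialize or export caption tracks to SRT. As an illustration of what that output looks like, the following sketch formats a single SRT cue from millisecond timestamps. The functions `srt_timestamp` and `srt_cue` are hypothetical names introduced here and are not taken from caption_format_adapter or multilang; only the SRT layout itself (an index line, an `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the text, and a blank line) is standard.

```rust
/// Hypothetical helper (not the crate's API): format milliseconds as an
/// SRT timestamp, `HH:MM:SS,mmm`. Note the comma before the milliseconds;
/// WebVTT uses a period in the same position.
fn srt_timestamp(ms: u64) -> String {
    let hours = ms / 3_600_000;
    let minutes = (ms / 60_000) % 60;
    let seconds = (ms / 1_000) % 60;
    let millis = ms % 1_000;
    format!("{:02}:{:02}:{:02},{:03}", hours, minutes, seconds, millis)
}

/// Hypothetical helper: render one SRT cue. `index` is the 1-based
/// sequence number; the cue ends with a blank line, as SRT requires
/// between consecutive cues.
fn srt_cue(index: usize, start_ms: u64, end_ms: u64, text: &str) -> String {
    format!(
        "{}\n{} --> {}\n{}\n\n",
        index,
        srt_timestamp(start_ms),
        srt_timestamp(end_ms),
        text
    )
}
```

For example, `srt_timestamp(3_723_004)` gives `01:02:03,004`, and concatenating the results of `srt_cue` for each caption in order yields a complete SRT document.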
Enums§
- CaptionGenError
- Errors produced by caption generation operations.
Type Aliases§
- CaptionGenResult
- Result alias used by caption-generation APIs.