Crate oximedia_caption_gen

§oximedia-caption-gen

Advanced caption and subtitle generation for the OxiMedia Sovereign Media Framework.

This crate provides speech-to-caption alignment with frame-accurate timing, greedy and optimal (Knuth-Plass-inspired dynamic programming) line-breaking algorithms, WCAG 2.1 accessibility compliance checking, and speaker diarization metadata with crosstalk detection, all in pure Rust.
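As an illustration of what frame-accurate timing involves, the sketch below snaps a millisecond timestamp to the nearest frame boundary for a rational frame rate. It is a standalone example, not the crate's `align_to_frames`: the function name `snap_ms_to_frame`, its parameters, and the round-to-nearest policy are all assumptions made for this sketch.

```rust
/// Hypothetical helper (not part of oximedia_caption_gen): snap a timestamp
/// in milliseconds to the nearest frame boundary.
///
/// The frame rate is given as a rational `fps_num / fps_den`, for example
/// 25/1 for PAL or 30000/1001 for NTSC. Returns `(frame_index, snapped_ms)`,
/// or `None` if either part of the frame rate is zero.
pub fn snap_ms_to_frame(ts_ms: u64, fps_num: u64, fps_den: u64) -> Option<(u64, u64)> {
    if fps_num == 0 || fps_den == 0 {
        return None;
    }
    // One frame lasts (1000 * fps_den / fps_num) milliseconds, so
    // frame = round(ts_ms * fps_num / (1000 * fps_den)).
    let denom = 1000 * fps_den;
    let frame = (ts_ms * fps_num + denom / 2) / denom;
    // Convert the frame index back to milliseconds, rounding to nearest.
    let snapped_ms = (frame * denom + fps_num / 2) / fps_num;
    Some((frame, snapped_ms))
}
```

Integer arithmetic avoids floating-point drift on long timelines. For extremely large timestamps the intermediate products could overflow `u64`; a production version would likely use checked or 128-bit arithmetic.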

§Modules

  • alignment — Word timestamps, transcript segments, segment merging/splitting, frame alignment, and caption block construction.
  • autopunct — Deterministic auto-punctuation and sentence capitalisation.
  • burn_in — Burned-in subtitle rendering onto raw RGBA video frames using a built-in 8×12 bitmap font.
  • caption_diff — Compare two caption tracks and report differences.
  • caption_format_adapter — Serialize caption tracks to SRT/VTT/TTML.
  • caption_style_guide — Style guide rule enforcement over caption tracks.
  • caption_timing_adjuster — Shift, stretch, snap, and EDL-remap caption timestamps.
  • diarization — Speaker metadata, turn merging, per-speaker statistics, crosstalk detection, voice activity ratio, and speaker-to-caption assignment.
  • forced_narrative — Forced narrative (FN) and SDH subtitle detection and classification.
  • language_detect — Byte-trigram language detection for locale-aware line-breaking.
  • line_breaking — Greedy and optimal line-breaking, reading-speed helpers (CPS), and line-balance optimisation.
  • multi_language — Bilingual caption layout (primary + secondary language).
  • multi_language_sync — Anchor-point synchronisation of multi-language caption tracks.
  • multilang — Multi-language subtitle support with ISO 639-1 validated language codes, SRT export, and cross-language timing merge.
  • phoneme_timing — Phoneme-level timing estimation from word timestamps.
  • profanity — Configurable profanity filter for caption text.
  • punctuation_restoration — Rule-based punctuation restoration for raw ASR output.
  • reading_speed — Caption reading-speed validation (WPS-based).
  • style_generator — Font size, position, and colour suggestions based on video frame analysis.
  • style_presets — Ready-made caption style configs (Netflix, BBC, WCAG).
  • translate — Stub subtitle translation pipeline.
  • wcag — WCAG 2.1 compliance checks (1.2.2, 1.2.4, 1.2.6), reading speed validation, minimum display duration, gap detection, and compliance scoring.
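To make the `line_breaking` entry above more concrete, here is a minimal, self-contained sketch of greedy line-breaking together with a characters-per-second (CPS) calculation. The names `greedy_break_sketch` and `cps_sketch` are hypothetical and do not mirror the signatures of the crate's `greedy_break` or `compute_cps`. Counting spaces towards CPS is also an assumption, since conventions differ between style guides.

```rust
/// Hypothetical greedy line breaker: fill each line with as many words as
/// fit within `max_chars`, then start a new line. A single word longer than
/// `max_chars` is kept whole on its own line.
pub fn greedy_break_sketch(text: &str, max_chars: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Length of the current line if this word were appended to it.
        let needed = if current.is_empty() {
            word_len
        } else {
            current.chars().count() + 1 + word_len
        };
        if needed > max_chars && !current.is_empty() {
            // The word does not fit: close the current line.
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Hypothetical CPS helper: characters (spaces included) divided by the
/// display duration in seconds. Returns `None` for an empty or negative
/// duration.
pub fn cps_sketch(text: &str, start_ms: u64, end_ms: u64) -> Option<f64> {
    if end_ms <= start_ms {
        return None;
    }
    let chars = text.chars().count() as f64;
    Some(chars * 1000.0 / (end_ms - start_ms) as f64)
}
```

Greedy breaking is fast but can leave a short final line. An optimal, dynamic-programming breaker instead minimises a penalty over all lines, which is the trade-off between the two algorithms the crate lists.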

Re-exports§

pub use alignment::align_to_frames;
pub use alignment::build_caption_blocks;
pub use alignment::merge_short_segments;
pub use alignment::split_long_segments;
pub use alignment::AlignmentError;
pub use alignment::CaptionBlock;
pub use alignment::CaptionPosition;
pub use alignment::TranscriptSegment;
pub use alignment::WordTimestamp;
pub use diarization::assign_speakers_to_blocks;
pub use diarization::dominant_speaker;
pub use diarization::format_speaker_label;
pub use diarization::merge_consecutive_turns;
pub use diarization::speaker_stats;
pub use diarization::voice_activity_ratio;
pub use diarization::CrosstalkDetector;
pub use diarization::DiarizationResult;
pub use diarization::Speaker;
pub use diarization::SpeakerGender;
pub use diarization::SpeakerStats;
pub use diarization::SpeakerTurn;
pub use line_breaking::compute_cps;
pub use line_breaking::greedy_break;
pub use line_breaking::optimal_break;
pub use line_breaking::reading_speed_ok;
pub use line_breaking::rebalance_lines;
pub use line_breaking::LineBalance;
pub use line_breaking::LineBreakAlgorithm;
pub use line_breaking::LineBreakConfig;
pub use wcag::check_caption_coverage;
pub use wcag::check_cps;
pub use wcag::check_live_latency;
pub use wcag::check_min_duration;
pub use wcag::check_sign_language;
pub use wcag::compliance_score;
pub use wcag::run_all_checks;
pub use wcag::WcagChecker;
pub use wcag::WcagLevel;
pub use wcag::WcagViolation;
pub use burn_in::BurnInConfig;
pub use burn_in::SubtitleBurnIn;
pub use burn_in::SubtitlePosition;
pub use multilang::CaptionEntry;
pub use multilang::LanguageCode;
pub use multilang::MultiLangCaption;
pub use multilang::MultiLangCaptionBuilder;
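
The re-exports above include `CrosstalkDetector`. As a rough illustration of the idea behind crosstalk detection, the sketch below reports time ranges where turns from two different speakers overlap. The `Turn` and `Overlap` types and the `find_crosstalk` function are hypothetical stand-ins for this example only. They are not the crate's `SpeakerTurn` or `CrosstalkDetector` API.

```rust
/// Hypothetical speaker turn: who spoke, and when (milliseconds).
#[derive(Debug, Clone, PartialEq)]
pub struct Turn {
    pub speaker: u32,
    pub start_ms: u64,
    pub end_ms: u64,
}

/// Hypothetical overlap record between two different speakers.
#[derive(Debug, Clone, PartialEq)]
pub struct Overlap {
    pub a: u32,
    pub b: u32,
    pub start_ms: u64,
    pub end_ms: u64,
}

/// Report every pair of turns from different speakers whose time ranges
/// intersect for at least `min_overlap_ms`. Zero-length intersections are
/// never reported. This is a simple O(n^2) pairwise scan.
pub fn find_crosstalk(turns: &[Turn], min_overlap_ms: u64) -> Vec<Overlap> {
    let mut out = Vec::new();
    for (i, x) in turns.iter().enumerate() {
        for y in &turns[i + 1..] {
            if x.speaker == y.speaker {
                continue;
            }
            // Intersection of [x.start, x.end) and [y.start, y.end).
            let start = x.start_ms.max(y.start_ms);
            let end = x.end_ms.min(y.end_ms);
            if end > start && end - start >= min_overlap_ms {
                out.push(Overlap {
                    a: x.speaker,
                    b: y.speaker,
                    start_ms: start,
                    end_ms: end,
                });
            }
        }
    }
    out
}
```

For long recordings, sorting turns by start time and sweeping once would avoid the quadratic scan.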

Modules§

alignment
Speech-to-caption alignment: word timestamps, segment merging/splitting, and frame-accurate caption block construction.
autopunct
Auto-punctuation for caption text.
burn_in
Burned-in subtitle rendering onto raw RGBA video frames.
caption_diff
Caption diff: compare two caption tracks and report differences.
caption_format_adapter
Caption format adapter: serialize CaptionBlock tracks to SRT, WebVTT, and TTML output strings.
caption_style_guide
Caption style guide rule enforcement.
caption_timing_adjuster
Caption timing adjustment: shift and stretch caption timecodes to match edited or re-timed video content.
diarization
Speaker diarization metadata: speaker turns, statistics, crosstalk detection, and assigning speakers to caption blocks.
forced_narrative
Forced narrative (FN) and SDH (Subtitles for the Deaf and Hard of Hearing) subtitle detection and classification.
language_detect
Language detection for caption transcript text.
line_breaking
Caption line-breaking algorithms: greedy, optimal (Knuth-Plass-inspired DP), reading-speed helpers, and line-balance optimisation.
multi_language
Multi-language / bilingual caption layout.
multi_language_sync
Multi-language caption synchronisation.
multilang
Multi-language subtitle support with ISO 639-1 validated language codes.
phoneme_timing
Phoneme-level timing alignment for caption display.
profanity
Profanity filter for caption text.
punctuation_restoration
Punctuation restoration for raw ASR transcript output.
reading_speed
Caption reading-speed validation.
style_generator
Style suggestion engine for caption rendering.
style_presets
Caption style presets for popular broadcast standards.
translate
Subtitle translation pipeline.
wcag
WCAG 2.1 accessibility compliance checks for caption blocks.
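
Of the modules listed above, `caption_format_adapter` serializes tracks to SRT, among other formats. The sketch below shows what a single SRT cue looks like when built by hand: a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the text, and a blank line. The function names `srt_timestamp` and `srt_cue` are hypothetical and are not the crate's API. The crate's adapter operates on `CaptionBlock` values, whose fields are not shown on this page.

```rust
/// Hypothetical helper: format milliseconds as an SRT timestamp,
/// `HH:MM:SS,mmm` (SRT uses a comma before the milliseconds).
pub fn srt_timestamp(ms: u64) -> String {
    let hours = ms / 3_600_000;
    let minutes = (ms / 60_000) % 60;
    let seconds = (ms / 1_000) % 60;
    let millis = ms % 1_000;
    format!("{:02}:{:02}:{:02},{:03}", hours, minutes, seconds, millis)
}

/// Hypothetical helper: render one SRT cue, terminated by a blank line.
/// `index` is the 1-based sequence number of the cue.
pub fn srt_cue(index: usize, start_ms: u64, end_ms: u64, text: &str) -> String {
    format!(
        "{}\n{} --> {}\n{}\n\n",
        index,
        srt_timestamp(start_ms),
        srt_timestamp(end_ms),
        text
    )
}
```

Concatenating the output of `srt_cue` for each caption, in order, yields a complete SRT document. WebVTT differs mainly in its `WEBVTT` header and its use of a period instead of a comma in timestamps.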

Enums§

CaptionGenError
Errors produced by caption generation operations.

Type Aliases§

CaptionGenResult
Result alias used by caption-generation APIs.