srv3-ttml 0.1.0

Youtube-flavored TTML (SRV3) captioning format parser
Documentation
# YTT (YouTube Timed Text) Format, version 3 (SRV3)

SRV3 is a proprietary captioning format used by
YouTube for storing closed captions. It is based on the
[TTML](https://www.w3.org/TR/ttml1/) standard, but with
major modifications. The format is XML based.

The structure of a YTT file is similar to that of HTML,
but with a few key differences. As the format is designed
specifically for closed captions and not general web
content, it has a few unique elements and attributes.

The structure of a YTT file is as follows:

- Header tag (`<head>`)
  - Pens (`<pen>`)
    A pen is a variable to store a style for text. It can
    be referenced by a span to apply the style to the text, similar to
    the HTML `class` attribute. Pens are defined in the header section
    of the file, and referenced in the body inside a span as `p="<pen_id>"`.
  - Window styles (`<ws>`)
    A window style is a variable to store a style for the
    text "window" (the box that contains the text).
  - Window positions (`<wp>`)
    A window position is a variable to store the position
    of the text window on the screen, anchored to a specific corner of the
    screen, with X and Y offsets defined as `(ah, av)`.
- Body tag (`<body>`)
  - Lines (`<p>`)
    A line is a block of text that appears on the screen
    at a specific time, with a specific duration, at a specific position
    and with a specific style. The text content of the line is stored
    as the inner text of the `<p>` tag.
    - Spans (`<s>`)
      A span is a block of text with a specific style
      applied to it. Spans can be used to apply different styles to
      different parts of the text within a line. You can reference a pen
      style by using the `p="<pen_id>"` attribute on the span tag.
    - Breaks (`<br>`)
      A break is a line break within a line of text. It
      is used to split a line of text into multiple lines.
        
All the above data is then wrapped inside a `<timedtext>` tag.

## Differences from TTML

- `<tt>` is replaced by `<timedtext>`.
- Pen styles are styles for text. They are referenced by spans to be applied.
- Window styles can be used to style the text window.
- Custom window positioning using window position variables.
- Ruby text support.


## TTML Tags

> [!NOTE]
> Type definitions are in the format `name: type = description`.
> Boolean types are represented as integers 0 and 1, 0 being false and 1 being true.

### timedtext

The root element of the TTML document

### head

The header section of the YTT file. Contains definitions for pens, window styles, and window positions.

### wp

Window position.

A variable declared to store a position for a window onscreen.

#### Fields

```
id: enum = Position ID
ap: enum = Anchor point
ah: int = X location
av: int  = Y location
```

### ws

A variable to store a style for the text window.

#### Fields

```
id: enum = Style ID
ju: enum = Justification ID
pd: enum = Pitch
sd: enum = Yaw/Skew
```

### pen

A pen is a variable to store styles, it is defined by an ID
and can be referenced by span to be applied to,
similar to CSS classes.

#### Fields

```
id: int = Pen ID
fs: enum = Font style (0-7)
sz: int = Font scale
of: int = offset
b: bool = bold
i: bool = italic
u: bool = underline
fc: hex = Foreground color
fo: int = Foreground opacity
bc: hex = Background color
bo: int = Background opacity
rb: enum = Ruby text (0-5)
hg: bool = Packed text
```

### body

The body section of the YTT file. Contains the lines of text to be displayed.

### p

A line of text to be displayed on the screen.


#### Fields

```
t: int = line start (in ms at video time)
d: int = line duration (in ms)
wp: enum = position ID
ws: enum = window style ID
```

## Enum values

### AnchorPoint (`ap`)

0 - Top Left
1 - Top Center
2 - Top Right
3 - Middle Left
4 - Center
5 - Middle Right
6 - Bottom Left
7 - Bottom Center
8 - Bottom Right

### Justification (`ju`)

0 - Top Left, Middle Left, Bottom Left
1 - Top Right, Middle Right, Bottom Right
2 - Top Center, Center, Bottom Center

### Ruby text (`rb`)

0 - No ruby text
1 - Base
2 - Parentheses
4 - Before text
5 - After text 

### Font style (`fs`)

1 - Courier New, Courier, Nimbus Mono 1, Cutive Mono
2 - Times New Roman, Georgia, Cambria, PT Serif Caption
3 - Lucida Console, DejaVu Sans Mono, Monaco, Consolas, PT Mono
5 - Comic Sans MS, Impact, Handlee
6 - Monotype Corsiva, URW Chancery 1, Apple Chancery, Dancing Script
7 - Carrois Gothic SC
0 - Default system font

### Pitch (`pd`) & Yaw/Skew (`sd`)

These are used to set the pitch and yaw of the text window.

2,0 - Characters above each other, columns right to left
2,1 - Characters above each other, columns left to right
3,0 - Subtitle rotated 90° CCW, columns left to right
3,1 - Subtitle rotated 90° CCW, columns right to left