The pitch contour is defined as a set of whitespace-separated targets at specified time positions in the speech output.
The algorithm for interpolating between the targets is processor-specific. In each pair of the form (time position,target),
the first value is a percentage of the period of the contained text (a number followed by “%”) and the second value is
the value of the pitch attribute (a number followed by “Hz”, a relative change, or a label value). Time position values
outside the range 0% to 100% are ignored. If no pitch value is defined for 0% or 100%, the nearest pitch target is copied.
All relative values for the pitch are relative to the pitch value just before the contained text.
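
The rules above for reading a contour value can be illustrated with a minimal sketch. It is not taken from any particular speech processor: the function name `parse_contour`, the regular expression, and the sample contour string are assumptions made for illustration. The sketch only extracts the targets, drops time positions outside 0% to 100%, and copies the nearest target to a missing 0% or 100% endpoint. It leaves the pitch values as unparsed strings and does not interpolate, since interpolation is processor-specific.

```python
import re

# Hypothetical pattern for one "(time position,target)" pair, e.g. "(10%,+30%)".
PAIR = re.compile(r"\(\s*([+-]?\d+(?:\.\d+)?)%\s*,\s*([^()\s,]+)\s*\)")


def parse_contour(contour):
    """Return (percent, target) pairs sorted by time, covering 0% to 100%."""
    targets = []
    for position, target in PAIR.findall(contour):
        percent = float(position)
        # Time positions outside 0% to 100% are ignored.
        if 0.0 <= percent <= 100.0:
            targets.append((percent, target))
    targets.sort(key=lambda pair: pair[0])
    if not targets:
        return targets
    # If 0% or 100% has no target, copy the nearest one to that endpoint.
    if targets[0][0] > 0.0:
        targets.insert(0, (0.0, targets[0][1]))
    if targets[-1][0] < 100.0:
        targets.append((100.0, targets[-1][1]))
    return targets


points = parse_contour("(0%,+20Hz) (10%,+30%) (40%,+10Hz) (120%,high)")
print(points)
```

In this sample the 120% pair is discarded, and because no target is given for 100%, the target at 40% is copied to the end of the contained text.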