//============================================================================
//
// SSSS tt lll lll
// SS SS tt ll ll
// SS tttttt eeee ll ll aaaa
// SSSS tt ee ee ll ll aa
// SS tt eeeeee ll ll aaaaa -- "An Atari 2600 VCS Emulator"
// SS SS tt ee ll ll aa aa
// SSSS ttt eeeee llll llll aaaaa
//
// Copyright (c) 1995-2007 by Bradford W. Mott and the Stella team
//
// See the file "license" for information on usage and redistribution of
// this file, and for a DISCLAIMER OF ALL WARRANTIES.
//
// $Id: SpeakJet.hxx,v 1.7 2007/01/01 18:04:50 stephena Exp $
//============================================================================
/**
Emulation of the Magnevation SpeakJet.
This is the speech synthesizer chip used in the AtariVox.
See AtariVox.hxx and .cxx for AtariVox specifics.

This class doesn't attempt 100% accurate emulation of the SpeakJet,
as the chip contains a proprietary algorithm that does some complex
modelling (in other words, it doesn't just string samples together).
For this emulation, I use a library called rsynth, which does something
similar (models the human vocal/nasal tract), but is implemented
in a totally different way. You might say I'm emulating the spirit
of the SpeakJet, not the letter :)

Implementation details:

Both rsynth and the SpeakJet take a stream of phoneme codes and produce
audio output.

My SpeakJet class accepts the SpeakJet phonemes, one at a time, and
translates them to rsynth phonemes (which are not quite one-to-one
equivalent). As each phoneme is translated, it's added to a phoneme
buffer.

Because of the way rsynth is implemented, it needs a full word's worth
of phonemes in its buffer before its speech function is called. This
means I'll only call rsynth_phones() when I receive a SpeakJet code that
indicates a pause, or end-of-word, or a control code (set parameters
or such). This adds a slight delay before a word is heard, because games
typically send one SJ code per frame and nothing is spoken until the
word's final code arrives.
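
A minimal sketch of the word-at-a-time buffering described above. It is
not the code this class uses: WordBuffer, isBoundary() and translate() are
hypothetical names, the "codes below 128 end a word" rule is an assumption
made for illustration, and the translation is a placeholder rather than a
real SpeakJet-to-rsynth table.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: collects translated phonemes until a boundary code
// arrives, then hands the whole word back for synthesis.
class WordBuffer {
public:
  // Assumption for illustration: codes below 128 are pauses or control
  // codes, so they end the current word.
  static bool isBoundary(std::uint8_t code) { return code < 128; }

  // Placeholder translation. A real table would map each SpeakJet code
  // to one or more rsynth phoneme characters.
  static std::string translate(std::uint8_t code) {
    return std::string(1, static_cast<char>('a' + (code - 128) % 26));
  }

  // Feeds one code. Returns true and fills 'word' once a complete,
  // non-empty word is ready to be passed to the synthesizer.
  bool feed(std::uint8_t code, std::string& word) {
    if (!isBoundary(code)) {
      myPending += translate(code);
      return false;
    }
    if (myPending.empty())
      return false;
    word.swap(myPending);
    myPending.clear();
    return true;
  }

private:
  std::string myPending;
};
```

With one code fed per frame, feed() returns false until the boundary code
arrives, which is where the delay mentioned above comes from.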
Also due to rsynth's implementation, I have to run it in a thread. This
is because rsynth_phones() is a monolithic function that needs a string
of phonemes, and takes a while to run (for the word "testing", it takes
1/4 second on an Athlon 64 @ 1800MHz). We can't have the emulator pause
for a quarter second while this happens, so I'll call rsynth_phones()
in a separate thread, and have it fill a buffer from which our main
thread will pull as much data as it needs. A typical word will be
30-40 thousand samples, and we only need fragsize/2 samples at a time.
As always when using threads, there will be locking in play...
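
A rough sketch of the hand-off between the synthesis thread and the main
thread, assuming a simple locked queue. SampleQueue is a hypothetical name,
and std::mutex is swapped in here to keep the example self-contained; the
class itself declares an SDL semaphore for its locking.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical queue shared by the synthesis thread (producer) and the
// audio callback (consumer).
class SampleQueue {
public:
  // Producer side: append a whole rendered word in one go.
  void push(const std::vector<std::int16_t>& samples) {
    std::lock_guard<std::mutex> lock(myMutex);
    myQueue.insert(myQueue.end(), samples.begin(), samples.end());
  }

  // Consumer side: copy up to 'count' samples into 'out' and return how
  // many were copied. The caller pads the remainder with silence.
  std::size_t pull(std::int16_t* out, std::size_t count) {
    std::lock_guard<std::mutex> lock(myMutex);
    const std::size_t n = std::min(count, myQueue.size());
    const std::ptrdiff_t len = static_cast<std::ptrdiff_t>(n);
    std::copy(myQueue.begin(), myQueue.begin() + len, out);
    myQueue.erase(myQueue.begin(), myQueue.begin() + len);
    return n;
  }

private:
  std::mutex myMutex;
  std::deque<std::int16_t> myQueue;
};
```

The lock is held only while samples are copied, so the audio side never
waits on the slow synthesis call itself.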
rsynth's output is always 16-bit samples. This class will have to
convert them to 8-bit samples before feeding them to the SDL audio
buffer.
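
One way the 16-bit to 8-bit conversion could look, assuming rsynth hands
back signed 16-bit samples and the audio buffer wants unsigned 8-bit
samples centred on 128. Both the sample formats and the name toUnsigned8()
are assumptions for this sketch.

```cpp
#include <cstdint>

// Shifts the signed range [-32768, 32767] up to [0, 65535], then keeps
// the top eight bits, giving an unsigned sample in [0, 255].
inline std::uint8_t toUnsigned8(std::int16_t sample) {
  return static_cast<std::uint8_t>((sample + 32768) >> 8);
}
```

Silence (0) maps to 128, and the low eight bits of each sample are simply
dropped, with no dithering.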
When using the AtariVox, we'll use SDL stereo sound. The regular TIA
sound will come out the left channel, and the speech will come out
the right. This isn't ideal, but it's the easiest way to mix the two
(I don't want to add an SDL_mixer dependency). The real SpeakJet uses a
separate speaker from the 2600 (the 2600 TIA sound comes from the TV,
the SJ sound comes from a set of PC speakers), so splitting them to
the left and right channels isn't unreasonable... However, it means
no game can simultaneously use stereo sound and the AtariVox (for now,
anyway).
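
A small sketch of the left/right split described above, assuming 8-bit
mono input streams and an interleaved stereo output buffer (left sample
first). interleaveStereo() is a hypothetical helper, not part of this
class.

```cpp
#include <cstddef>
#include <cstdint>

// Writes 'frames' stereo frames into 'out', which must hold 2 * frames
// bytes: TIA sound goes to the left channel, speech to the right.
inline void interleaveStereo(const std::uint8_t* tia,
                             const std::uint8_t* speech,
                             std::uint8_t* out, std::size_t frames) {
  for (std::size_t i = 0; i < frames; ++i) {
    out[2 * i]     = tia[i];     // left channel
    out[2 * i + 1] = speech[i];  // right channel
  }
}
```

Because each source owns a whole channel, no mixing arithmetic is needed,
which is why a game that wants real stereo can't share the output.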
@author B. Watson
@version $Id: SpeakJet.hxx,v 1.7 2007/01/01 18:04:50 stephena Exp $
*/
static SDL_sem *ourInputSemaphore;
static rsynth_t *rsynth;
static darray_t rsynthSamples;
// phonemeBuffer holds *translated* phonemes (i.e. rsynth phonemes,
// not SpeakJet phonemes).
static char phonemeBuffer;
// How many bytes are in the input buffer?
static uInt16 ourInputCount;
// Where our output samples go.
// For now, just a static array of them
static SpeechBuffer outputBuffers;
static SpeechBuffer *ourCurrentWriteBuffer;
static uInt8 ourCurrentWritePosition;