Voice synthesizers for the Atari 400 & 800
Confidently, I slipped into the Commander's chair. I punched "Start" and a vision of deep space, scattered with stars, flashed on the viewscreen. My superior's deep voice washed through the room, "Welcome aboard, Commander. Your mission . . ." When he finished, I typed "G" for the Galaxy Map. Lt. Longri's tenor explained that a Zylon full battle patrol had entered sector A4. That fit my strategy! I punched the controls, and the ship leaped into hyperspace. Upon reentry, Captain Sumtra's dusky voice warned, "Zylon sector, sir." I punched for shields. "Shields," she replied.
The screen became a blur of ships, photon torpedoes, explosions. Lt. Longri calmly tracked our kills, while Captain Sumtra repeated every order smoothly. Suddenly, Damage Control's clipped, high-pitched voice screamed through the flight deck, "Shields lost!!" A Zylon fired at us. I punched hyperspace. The screen dissolved in a flash of white. Against a dark screen, the Federation's emblem appeared, and the commander spoke quietly, "Posthumous... rank awarded... Garbage Scow Captain."
Now, two machines make it easy to add voices to your Atari programs. The "TYPE 'N TALK" (TNT) from Votrax and the ECHO-GP from Street Electronics synthesize, or create, speech from written English almost as easily as characters are printed on your screen.
Applications far beyond obvious game enhancements abound. Imagine pronouncing dictionaries or spelling programs more flexible than Speak-N-Spell. Either system could be set up easily to speak for a speech-impaired person, or to voice, letter-for-letter, or word-for-word, all data entered by, or sent to, a blind operator. My most successful program, so far, displays a four-color chart and explains it orally, with no text distracting from the visual. At least half the fun is watching a new user's face as the computer says, "Hello Mary!"
Both TNT and ECHO are efficient and small, speak an unlimited vocabulary (anything you can print), take almost no memory, and cost less than $500. Both speak with a distinct "computer voice" which the uninitiated can understand with some concentration, and which quickly comes to sound "natural." It's a bit like getting used to that uncle with the funny accent.
Both units require an ATARI 850 Interface and a cable. The cables are available from the manufacturers for an extra $30, or can be made as follows. Order the 9-pin DB connector from Apex (APX-90006 $5.50), and a 25-pin DB male connector from Radio Shack ($3.50) or any electronics house. Buy a few feet of any cable with six or more conductors (Belden #9421 is often used). Connect these according to the chart (Fig. 1), and you've saved $20. The TNT requires an 8-ohm speaker ($5-$10) and a mini-phone jack. The ECHO has a built-in speaker, but you can add an external speaker for fidelity and volume. With an external speaker, the ECHO puts out considerably more sound than the TNT.
Getting started is simple. Set the switch to 300 baud, plug the cable into serial port 1 or 2 of the 850, boot the system, and type the following statement [The "n"s represent the IOCB (see Basic manual pg 26); the "x"s are the port number, 1 or 2]:
OPEN #n,8,0,"Rx":XIO 34, #n,48,0,"Rx":XIO 36, #n,12,0,"Rx"
After that, merely issue PRINT #n commands to make the units speak what you wish. A program to input a string from the keyboard and speak it takes no more than three lines. Both units include clear, usable manuals with lots of examples.
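As a rough sketch of such a three-line program, the listing below assumes IOCB 2 and serial port 1 and reuses the OPEN and XIO values from the statement above. It writes the device as "R1:" with a trailing colon, the form customary for Atari BASIC device names; that spelling is an assumption, as are the line numbers, the string name A$, and its 120-character length. None of these come from either unit's manual.

```basic
1 REM SKETCH ONLY: IOCB 2 AND 850 PORT 1 ARE ILLUSTRATIVE CHOICES
2 REM "R1:" WITH COLON IS AN ASSUMED SPELLING OF THE ARTICLE'S "Rx"
10 DIM A$(120)
20 OPEN #2,8,0,"R1:":XIO 34,#2,48,0,"R1:":XIO 36,#2,12,0,"R1:"
30 INPUT A$:PRINT #2;A$:GOTO 30
```

Type a sentence at the ? prompt and press RETURN; the unit should speak it, and the program then waits for the next line. Press BREAK to stop.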
Although the units are similar, there are clear differences. The most important criterion to me was intelligibility. No speech synthesis device is worthwhile if you can't understand it. A frequent user will get accustomed to either of these. To check for immediate clarity, I took both units to the Lawrence Livermore Lab Science Fair, and asked visitors to listen to a list of 20 words, spelled as recommended by both companies, spoken alternately on one unit, then the other. Since I have used the Votrax for 6 months and find it quite clear, I expected it to win this test. However, nearly all people listening to the two for the first time found the ECHO clearly superior. The ECHO seemed to excel with words beginning with "hard sounds" such as T, P, B.
Intelligibility aside, I examined reactions to the ECHO's many unique features. Both units sound like computers, not people. But as one girl said, the ECHO sounds like a "he," the TNT like an "it." The ECHO's software-switchable pitches (at normal speed) were a popular feature. The lower voices were easier for most people to understand, and several listeners suggested creating dialogues between different personalities, each with a different voice.
The ECHO's "inflection" feature raises the tone of the last syllable before a question mark and lowers it before a period. Although only about half of the new listeners could describe this effect, it may have contributed to the ECHO's superior intelligibility.
Spoken punctuation is another ECHO plus. Normally, it speaks only the punctuation commonly spoken ($, #, =). But, at the drop of a software instruction, most punctuation (comma, period, semicolon, parentheses, etc.) or all of it (including spaces, returns, etc.) is spoken. This could be a real boon to the sight-impaired. Both units will spell capitalized acronyms. The ECHO, however, has a letter mode which will spell out all words, which is very useful for a spelling program or a blind operator faced with an unintelligible word.
Both systems allow the user to create phoneme strings. This results in phrases with exceptional clarity. Frankly, since I get acceptable results with English, phoneme-coding words seems like too much work. For instance, "catalogue" is coded "KA3DIL*1G"! If you decide to phoneme code, a TNT software option will send you a phoneme string as it translates from the English. You then polish it up for final phoneme codes.
The TNT's enclosure has some problems. The on/off switch is on the back panel, and worse, the unit has no "on" light. Many's the time the kids have left the TNT on all night! ECHO has a light, and the switch is right up front.
So there's the balance. Both do a good job. Intelligibility, features and price make the ECHO distinctly superior.
by Ken Harms
1140 Mark Ave.
Carpinteria, CA 93013
TYPE 'N TALK
500 Stephenson Highway
Troy, MI 48084
List Price--$249 + speaker
Fig. 1. Cable connections: Atari 9-pin DB to 25-pin DB male (chart not reproduced).
Since this article was written, Votrax has released another voice synthesizer, the Personal Speech System. The new product lists for $395 and offers several improvements over Votrax's earlier Type 'N Talk. The Personal Speech System has a 16K algorithm (versus a 4K algorithm in the TNT), which leads to 95 percent accuracy in pronunciation. It can produce music and sound effects, and it has a real-time clock. In addition, the speech rate, amplitude, and inflection are user-programmable. And this time, the speaker is inside the unit.