A Singing/Talking Voice For VIC And 64
Arthur B. Hunkins
The Alien Group of New York City has come up with a significant advance in microcomputer voice synthesis: Voice Box, a peripheral for the VIC and 64 that can sing as well as speak. With Voice Box you can also program vocal inflection to create voices that are expressive and lifelike, with virtually unlimited nuance.
Voice Box consists of the hardware peripheral, speech synthesis software on tape or disk, and Music System software (available only for the 64, on disk), which drives both the singing voice and three-voice music from the Commodore SID chip.
Plugs Into The User Port
The Voice Box itself is a sturdy, secure, 1.5 × 3 × 4-inch black box that plugs into the User Port. It consists of a 3 × 4-inch circuit board with seven chips and assorted components, an internal 2 × 3-inch speaker (0.8 watt), and two external dials. One dial regulates the volume, the other the pitch range (for spoken material, the higher the pitch, the faster the speech).
Voice Box produces only the vocal sound; sounds coming from the 64 SID chip require an external amplifier and speaker.
Volume is adequate for personal or small group use, but there is no provision for external amplification or headphones.
Voice Box software is different for VIC and 64, though the documentation—which is thorough and clear—differs only in detail. Software is offered on cassette for VIC and on disk for 64.
All Phonemes Are Used
Voice Box synthesizes phonemes, and is capable of producing 64 phonemes, which cover the sounds of spoken English. The software permits programming in English, in phonemes, or in BASIC using number codes for phonemes.
You can incorporate the SPEAK subroutine into your BASIC programs (2K free memory required) to permit English or phoneme speech coding. If your program leaves only about 700 bytes free, you can use the PSPEAK subroutine, which allows phoneme coding only.
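The memory trade-off between the two subroutines can be sketched as a small decision function. This is only an illustration in Python, not part of the Voice Box software: the function name is hypothetical, and the thresholds simply restate the figures above ("2K" read as 2048 bytes, "about 700 bytes" read as 700).

```python
# Hypothetical helper: the thresholds restate the figures quoted in this
# review; they are not taken from the Voice Box documentation.
SPEAK_MIN_FREE = 2 * 1024   # "2K free memory required", read as 2048 bytes
PSPEAK_MIN_FREE = 700       # "about 700 bytes free"


def choose_routine(free_bytes):
    """Return the name of the speech subroutine that fits, or None."""
    if free_bytes >= SPEAK_MIN_FREE:
        return "SPEAK"      # English or phoneme coding
    if free_bytes >= PSPEAK_MIN_FREE:
        return "PSPEAK"     # phoneme coding only
    return None             # not enough room for either routine


# Try a few free-memory figures.
for free in (3583, 1200, 500):
    print(free, choose_routine(free))
```

In practice you would take the free-memory figure from BASIC's FRE function after your own program is loaded.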
The Talking Head
There are three other programs in the driving software. One is the SPEAK routine with an alien face added in character graphics; its mouth moves as the voice speaks. A second program allows the user to type in words to be spoken by the face.
Most elaborate and perhaps most fascinating is a SPELL program, in which an alien professor asks you to spell words, and either congratulates or chastises you, depending on your answers.
There is also a provision for adding your own words. All you need to do is furnish the phonetic spellings in DATA statements.
Changing The Pronunciation
Many of the spoken words provided by Voice Box are difficult to understand, even though the professor will repeat them as often as you like. But you can experiment with inflection, vowel length, and timing to have Voice Box speak the way you want. The documentation provides a number of hints on improving pronunciation.
The software normally permits speech in four pitches, to give you vocal inflections through a simple system of notated slashes. But in combination with the Music System, Voice Box has the potential for continuously variable inflection.
The Music System
Unfortunately, the Music System software is available only for the 64, because it uses the SID chip. I recommend it even if you don't have Voice Box, since it provides an outstanding method for programming your own SID sound arrangements.
Music System is menu-driven. From a main menu, select SYNTHESIZER SETTINGS, and a densely packed screen displays SID sound options. You use the cursor controls, and the + and - keys, to select options. After you choose the new instrumentation, press the f7 key to hear the results.
By pressing other function keys, you can record a melody. Pitch is entered on the upper two rows of the keyboard, which are laid out like a piano. After you record your melody, you can go back and edit the pitch and rhythm.
Three-Voice Digital Recorder
What Music System gives you is a three-voice digital recorder with synchronizing click track (metronome), changeable tempo independent of pitch, and the ability to vary the sound of any line. You can try out and rerecord arrangements at will. And all this uses about 90 percent of the SID chip's potential. You have the ability to program pitch, waveform (including pulse width), filter type, filter resonance, filter routing select, filter cutoff point, overall amplitude, and all ADSR parameters.
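For readers curious about what the ADSR parameters look like at the chip level, here is a minimal sketch in Python. It does not show how Music System works internally, which the documentation does not describe. It shows only the SID's own register layout: each of the four envelope values runs from 0 to 15, and each voice packs them into two bytes. The function names are illustrative.

```python
SID_BASE = 0xD400       # start of the SID register block on the 64 (54272)
VOICE_STRIDE = 7        # each voice occupies seven consecutive registers


def pack_adsr(attack, decay, sustain, release):
    """Pack four 0-15 envelope values into the SID's two envelope bytes."""
    for name, value in (("attack", attack), ("decay", decay),
                        ("sustain", sustain), ("release", release)):
        if not 0 <= value <= 15:
            raise ValueError(name + " must be between 0 and 15")
    attack_decay = (attack << 4) | decay        # high nibble = attack
    sustain_release = (sustain << 4) | release  # high nibble = sustain
    return attack_decay, sustain_release


def adsr_pokes(voice, attack, decay, sustain, release):
    """Return (address, value) pairs for one voice, numbered 1 to 3."""
    if voice not in (1, 2, 3):
        raise ValueError("voice must be 1, 2, or 3")
    base = SID_BASE + VOICE_STRIDE * (voice - 1)
    attack_decay, sustain_release = pack_adsr(attack, decay, sustain, release)
    return [(base + 5, attack_decay), (base + 6, sustain_release)]


# Fast attack, medium decay, no sustain, for voice 1.
print(adsr_pokes(1, 0, 9, 0, 0))
```

Each pair corresponds to one BASIC POKE of a value into an address.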
You can get single-speed phasing by internally cycling the pulse width, and you can set the rate of sweep of the filter cutoff point during a note. This switchable sweep effect requires specifying a beginning and ending cutoff point. (The sweep can be triggered by any selected oscillator, as it begins a new note.)
A third option, applied on playback (like the rhythmic editing mentioned earlier), adds accents to selected notes in each voice. The programming techniques behind these three effects bode well for the future of SID sound synthesis.
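The filter sweep described above, from a beginning cutoff point to an ending one, can be pictured with a short Python sketch. This is an assumption-laden illustration: Music System's actual sweep curve and step rate are not documented, so a plain linear ramp stands in for them, and the function names are hypothetical. The 11-bit cutoff range and its split across two registers are properties of the SID chip itself.

```python
CUTOFF_MAX = 2047   # the SID filter cutoff is an 11-bit value (0-2047)


def cutoff_sweep(start, end, steps):
    """Return `steps` cutoff values ramping linearly from start to end.

    A linear ramp is an assumption; the real sweep shape is not documented.
    """
    if steps < 2:
        raise ValueError("a sweep needs at least two steps")
    for value in (start, end):
        if not 0 <= value <= CUTOFF_MAX:
            raise ValueError("cutoff must be between 0 and 2047")
    span = end - start
    return [start + span * i // (steps - 1) for i in range(steps)]


def split_cutoff(value):
    """Split an 11-bit cutoff into the SID's low (3-bit) and high (8-bit) parts."""
    return value & 0x07, value >> 3


# An upward sweep in five steps, with the register bytes for each step.
for cutoff in cutoff_sweep(100, 200, 5):
    print(cutoff, split_cutoff(cutoff))
```

A faster sweep rate corresponds to fewer steps between the two cutoff points over the length of the note.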
There are a few limitations, though. There is no pitch transposition, and no microtones. Only one type of filtering can be selected at a time, there is no ring modulation, only 15 pulse-width settings are available, and the modulating capabilities of both ADSR and Oscillator 3 are not implemented.
The Singing Voice
To work with the singing voice, select LYRIC EDITOR from the main menu. Text is entered in phonemes, with slashes between the sounds to be sung to different notes. Up to nine lines of text, each with up to 77 phonemes, are permitted. As a pronunciation aid, there is a "trial" line; a series of phonemes entered here will be sung in monotone when you hit RETURN.
After the text is entered, pitch is added in the same way as with the SID oscillators, using the top two keyboard rows; as you play, you hear the vocal tone singing the text. As before, rhythm can be edited later. The voice has a fixed-rate amplitude vibrato that can be edited in later, and a programmable glissando on selected pitches. It is this variable-rate slide that can theoretically be applied to achieve subtlety of inflection in speech synthesis. You are not told how to do this, but it can be done. Perhaps Alien Group or an enterprising independent programmer will soon show us.
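The slash notation and the size limits above can be sketched in Python as follows. This is an illustration, not the Lyric Editor's actual format: it assumes phonemes within a line are separated by spaces, and the phoneme spellings in the example are placeholders, not the codes Voice Box uses. Only the slash convention and the nine-line, 77-phoneme limits come from the description above.

```python
MAX_LINES = 9                  # limit on lines of lyric text
MAX_PHONEMES_PER_LINE = 77     # limit on phonemes per line


def parse_lyric_line(line):
    """Split one lyric line into note groups, each a list of phonemes.

    Assumes space-separated phonemes; a slash starts the next note's group.
    """
    groups = [segment.split() for segment in line.split("/")]
    groups = [group for group in groups if group]   # skip empty segments
    total = sum(len(group) for group in groups)
    if total > MAX_PHONEMES_PER_LINE:
        raise ValueError("too many phonemes on one line")
    return groups


def parse_lyrics(lines):
    """Parse a whole lyric, enforcing the line limit."""
    if len(lines) > MAX_LINES:
        raise ValueError("too many lines of text")
    return [parse_lyric_line(line) for line in lines]


# Placeholder phoneme spellings: two groups, so two notes.
print(parse_lyric_line("H EH / L OW"))
```

Each group in the result would then be paired with one pitch entered from the keyboard.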
Disk Save Option
Several other choices are available from the main menu. One allows SAVEing to disk; both a text and a music file are stored. There is a MEDLEY option, where you can string together several selections to be played in succession. And there is a program to redraw the face. During playback of any song, you can display a male singer with moving mouth and eyebrows, and you can customize him by choosing among mouth and eyebrow shapes.
Actually, the entire screen can be changed in high-resolution, multicolor graphics mode, and you can SAVE these new faces.
Voice Box represents a substantial step forward in speech synthesis. The cost, considering software and hardware flexibility, is reasonable. With all its power and options, it is remarkably easy to use, either alone or incorporated into other programs.
(for VIC-20 or Commodore 64; tape, disk for 64 only)
Music System (disk, for 64 only)
The Alien Group
27 West 23rd St.
New York, NY 10010