Classic Computer Magazine Archive COMPUTE! ISSUE 1 / FALL 1979 / PAGE 29

Tokens Aren't Just for Subways — A Convenient Method to List Microsoft BASIC Tokens

Harvey B. Herman
Chemistry Department
University of North Carolina at Greensboro
Greensboro, North Carolina 27412

The latest buzzword in computer circles is "tokens." I have even heard the verb "tokenize" used in casual conversation. However, my observation is that many people are still confused about the meaning of this term and would like to learn more. How do you explain to someone looking at the table on p. 8 of the Spring 1979 issue of the PET Gazette (list compiled by Jim Butterfield) why, for example, a decimal 161 in memory can have four or more different meanings, including the three-letter BASIC keyword GET? This article is intended to clear up some of the confusion (I hope) and to illustrate a convenient method to list all the tokens in various versions of Microsoft BASIC (PET, KIM, SYM, etc.).

Understanding tokens is not just an idle exercise. Useful programs have begun to appear which use "token knowledge" for specific purposes. For example, Len Lindsay (our indefatigable editor) recently published (The PET Gazette, Summer 1979, p. 10) a program to identify PEEK and POKE in BASIC programs so they can be more easily converted to run on PETs with new ROMs. This program searches memory for the PEEK and POKE tokens and works only if their values are known. Other Microsoft BASICs have similar, but not identical, lists of tokens. To use the Lindsay program on other computers, it would probably be necessary to change the token values. A BASIC program to list PET tokens is shown and discussed below.

128 REM 80
129 REM 81
130 REM 82
131 REM 83
132 REM 84
133 REM 85
134 REM 86
135 REM 87
136 REM 88
137 REM 89
138 REM 8A
139 REM 8B
140 REM 8C
141 REM 8D
.
.
.
168 REM A8
169 REM A9
170 REM AA
171 REM AB
172 REM AC
173 REM AD
174 REM AE
175 REM AF
176 REM B0
177 REM B1
178 REM B2
179 REM B3
180 REM B4
181 REM B5
182 REM B6
183 REM B7
184 REM B8
185 REM B9
186 REM BA
187 REM BB
188 REM BC
189 REM BD
190 REM BE
191 REM BF
192 REM C0
193 REM C1
194 REM C2
195 REM C3
196 REM C4
197 REM C5
198 REM C6
199 REM C7
200 REM C8
201 REM C9
202 REM CA
500 REM OPEN 5, 4 : CMD 5
510 FOR I = 1 TO 667 STEP 9 : REM 667 (9 * #TOKENS - 8)
520 J = J + 1
530 POKE 1028 + I, 127 + J : REM 1028 (START OF PROGRAM STORAGE + 4)
540 NEXT I
550 LIST 128 - 202 : REM 202(127 + #TOKENS)
560 REM PRINT #5 : CLOSE5
READY.

The program can be adapted to other BASICs with only a few changes (underlined in the original printed listing; the REM comments in statements 510, 530, and 550 mark the values involved). Before proceeding to that discussion, a few words about tokens are in order. The concept underlying tokens is not difficult to understand. Programs are not stored exactly as they are typed in. Instead of storing all the characters in the keyword PRINT, for example, PET Microsoft BASIC stores only one 8-bit byte, decimal value 153. This saves storage space and speeds up execution of programs. All the tokens are greater than 127, i.e., the most significant bit (MSB) of the byte is set. The BASIC interpreter can rapidly identify the tokens by checking the MSB and jumping to the appropriate subroutine.
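The idea of one byte per keyword can be illustrated with a short sketch. It is written in Python rather than PET BASIC, is not part of the original article, and is only a simplified model: the table holds a handful of values copied from the PET output in this article, the names `is_token` and `detokenize` are hypothetical, and the special handling a real interpreter gives to quoted strings is ignored.

```python
# A few PET token values, copied from the listing output in this article.
PET_TOKENS = {128: "END", 137: "GOTO", 143: "REM", 153: "PRINT", 161: "GET"}

def is_token(byte: int) -> bool:
    """A byte counts as a token when its most significant bit is set (> 127)."""
    return byte & 0x80 != 0

def detokenize(stored: bytes, table=PET_TOKENS) -> str:
    """Expand token bytes into keywords; all other bytes pass through as text."""
    parts = []
    for b in stored:
        if is_token(b):
            # Unknown values are shown as <n> instead of being guessed.
            parts.append(table.get(b, f"<{b}>"))
        else:
            parts.append(chr(b))
    return "".join(parts)

# The statement PRINT "HI" as it would be stored: one token byte, then text.
stored = bytes([153]) + b' "HI"'
print(detokenize(stored))                                   # PRINT "HI"
print(len(stored), "bytes stored,", len('PRINT "HI"'), "characters typed")
```

Here the ten typed characters shrink to six stored bytes, and the single MSB test is all that is needed to tell a keyword from ordinary text.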

The number of tokens in a given BASIC depends on the number of commands and functions which have been implemented. A recent article on tokens (MICRO 15:20) included a list for OSI BASIC which showed 68 tokens (for comparison, the PET has 75). Also, the PRINT token had the decimal value of 151 (PET uses 153). These facts are cited to emphasize the importance of modifying programs which PEEK at memory for particular tokens when transferring them to other computers. The values may agree by accident, but don't count on it.
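A small sketch shows why the values matter. It is in Python, not PET BASIC, and is not the Lindsay program; `TOKEN_VALUES` and `find_token` are hypothetical names. The PET values come from the output in this article and the single OSI value from the MICRO figure quoted above; no other OSI values are given here, so none are filled in.

```python
# Token values per machine, limited to what this article states.
TOKEN_VALUES = {
    "PET": {"PRINT": 153, "POKE": 151, "PEEK": 194},
    "OSI": {"PRINT": 151},
}

def find_token(memory: bytes, dialect: str, keyword: str) -> list[int]:
    """Return the offsets at which the keyword's token value occurs.

    This is a naive byte scan: it would also match the same value inside
    quoted text or line headers, which a careful program must rule out.
    """
    value = TOKEN_VALUES[dialect][keyword]
    return [i for i, b in enumerate(memory) if b == value]

# The body of a PET statement POKE 59468,12 as stored: token 151, then text.
memory = bytes([151]) + b" 59468,12"

print(find_token(memory, "PET", "POKE"))    # [0]
print(find_token(memory, "OSI", "PRINT"))   # [0]
```

The same byte is found both times: decimal 151 is POKE on the PET but PRINT under the OSI numbering, so a search written for one machine reports the wrong keyword on the other.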

Listing, continued: statements 142 to 167 (omitted at the dots in the listing above)

142 REM 8E
143 REM 8F
144 REM 90
145 REM 91
146 REM 92
147 REM 93
148 REM 94
149 REM 95
150 REM 96
151 REM 97
152 REM 98
153 REM 99
154 REM 9A
155 REM 9B
156 REM 9C
157 REM 9D
158 REM 9E
159 REM 9F
160 REM A0
161 REM A1
162 REM A2
163 REM A3
164 REM A4
165 REM A5
166 REM A6
167 REM A7

Output (produced by the LIST in statement 550)

128 END 80
129 FOR 81
130 NEXT 82
131 DATA 83
132 INPUT# 84
133 INPUT 85
134 DIM 86
135 READ 87
136 LET 88
137 GOTO 89
138 RUN 8A
139 IF 8B
140 RESTORE 8C
141 GOSUB 8D
142 RETURN 8E
143 REM 8F
144 STOP 90
145 ON 91
146 WAIT 92
147 LOAD 93
148 SAVE 94
149 VERIFY 95
150 DEF 96
151 POKE 97
152 PRINT# 98
153 PRINT 99
154 CONT 9A
155 LIST 9B
156 CLR 9C
157 CMD 9D
158 SYS 9E
159 OPEN 9F
160 CLOSE A0
161 GET A1
162 NEW A2
163 TAB( A3
164 TO A4
165 FN A5
166 SPC( A6
167 THEN A7
168 NOT A8
169 STEP A9
170 + AA
171 - AB
172 * AC
173 / AD
174 ^ AE
175 AND AF
176 OR B0
177 > B1
178 = B2
179 < B3
180 SGN B4
181 INT B5
182 ABS B6
183 USR B7
184 FRE B8
185 POS B9
186 SQR BA
187 RND BB
188 LOG BC
189 EXP BD
190 COS BE
191 SIN BF
192 TAN C0
193 ATN C1
194 PEEK C2
195 LEN C3
196 STR$ C4
197 VAL C5
198 ASC C6
199 CHR$ C7
200 LEFT$ C8
201 RIGHT$ C9
202 MID$ CA
READY.

The program shown is loaded and run normally. It replaces the REM token in each of statements 128 to 202 (PET version) with the token whose value equals the statement number, then ends by listing the tokens with their decimal and hexadecimal equivalents. Note that the program will not run a second time with a simple RUN command, because the first REM has been replaced with an END (try RUN 500 instead). The PET version can be listed on a printer, if available, by deleting the REM in statement 500 and properly closing the file after the program ends, using the commands shown in statement 560.
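The memory arithmetic behind statements 510 to 540 can be checked with a short simulation. The Python sketch below is not part of the original article. It assumes each REM statement occupies nine bytes (two for the link, two for the line number, the token, a space, two hex characters, and a zero terminator) and that the first stored byte sits at address 1025, which is consistent with the POKE addresses in statement 530. The names `build_rem_lines` and `run_pokes` are hypothetical.

```python
FIRST_BYTE = 1025   # assumed address of the first stored program byte
REM = 143           # REM token value, from the output in this article
N_TOKENS = 75       # number of PET tokens

def build_rem_lines(first=128, count=N_TOKENS) -> bytearray:
    """Model statements 128 REM 80 ... 202 REM CA as they sit in memory."""
    mem = bytearray()
    for n in range(first, first + count):
        next_line = FIRST_BYTE + len(mem) + 9
        mem += bytes([next_line & 255, next_line >> 8])   # link to next line
        mem += bytes([n & 255, n >> 8])                   # line number
        mem += bytes([REM, 32]) + f"{n:02X}".encode()     # REM, space, hex text
        mem.append(0)                                     # end-of-line marker
    return mem

def run_pokes(mem: bytearray) -> None:
    """Mirror statements 510 to 540 of the listing."""
    j = 0
    for i in range(1, 9 * N_TOKENS - 8 + 1, 9):   # FOR I = 1 TO 667 STEP 9
        j += 1                                    # J = J + 1
        mem[(1028 + i) - FIRST_BYTE] = 127 + j    # POKE 1028 + I, 127 + J

mem = build_rem_lines()
run_pokes(mem)

# The token byte is the fifth byte of each nine-byte statement.
tokens = [mem[k * 9 + 4] for k in range(N_TOKENS)]
print(tokens[0], tokens[-1])   # 128 202
```

Every POKE lands exactly on a REM byte, so statement 128 ends up holding token 128, statement 129 token 129, and so on through 202.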

If you are using this program on another computer (KIM or SYM), the number of tokens will need to be changed. The proper value can be found by trial and error. When the machine has fewer tokens than the PET, an error will be printed when the LIST in statement 550 attempts to print an invalid token. The number of the last token printed is then used to correct statement 550. The REM comments will help in locating the other statements which use the number of tokens and need correction. When the machine has more tokens than the PET, more initial REMs should be added (203 and above) and the number of tokens increased accordingly, until an invalid token causes an error message as above.
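The three values tied to the token count follow directly from the REM comments in statements 510, 530, and 550. As a convenience, here is a minimal Python sketch (not from the original article) that works them out for a trial count. The function name `listing_constants` is hypothetical, and the default start of program storage, 1024, is the PET value implied by statement 530; the value for other machines is not given in this article and would have to be supplied.

```python
def listing_constants(n_tokens: int, start_of_storage: int = 1024) -> dict:
    """Values to place in statements 510, 530 and 550 for a given token count."""
    return {
        "for_limit": 9 * n_tokens - 8,        # statement 510: 9 * #TOKENS - 8
        "poke_base": start_of_storage + 4,    # statement 530: START + 4
        "last_line": 127 + n_tokens,          # statement 550: 127 + #TOKENS
    }

print(listing_constants(75))
# {'for_limit': 667, 'poke_base': 1028, 'last_line': 202}
```

With 75 tokens this reproduces the 667, 1028, and 202 of the PET listing. A count of 68, the figure quoted earlier for OSI BASIC, would give a loop limit of 604 and a last statement of 195, with the matching REM statements trimmed to 128 through 195.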

Whatever computer is being used, the list of tokens should be kept handy, as it is an invaluable aid in understanding and modifying programs written for other systems.