This month we'll start a major project: a pseudo-BASIC interpreter written in Atari BASIC. Will this be a useful product? No. First, since it is written in and interpreted by Atari BASIC, it will of necessity be much slower than even Atari BASIC. Second, it will be an extremely limited language (as we'll shortly see) and, in fact, a nonstandard language.
But suppose we could overcome the first objection (speed) and ignore the second (so what if it is nonstandard, as long as it is ours). Would it be useful then? Sure. In fact, we could even speculate on rewriting the interpreter in C/65 or assembly language and ending up with an extremely fast, presumably integer-only interpreter. Still, the language is limited, and it would have to have some major extensions added before it would be really usable.
Enough speculation. Let's proceed to the language's definition.
- The program editing scheme used will be essentially identical to that of Atari BASIC. Line numbers from 1 to some maximum will automatically be sorted and executed in order. Entering just a line number will erase any line with that number.
- Single letter variables only will be allowed. This is a major point of departure from Atari BASIC, but it makes the interpreter significantly simpler. And no string variables.
- Only the first letter of each statement name (command name) will be significant. Another big departure, and one which limits us to 26 different statements. Also note that this implies that if we use "Print," we can't use "Plot," "POKE," or "Position," etc. This also implies that you can keep programs small (and unreadable) by using single letter commands.
- No functions. Sorry, but there will be no "RND(0)", no "SIN(30)", etc. This is necessary if we are to keep the expression analyzer down to manageable proportions when it is written in Atari BASIC.
- No precedence of operators. Same excuse as for the lack of functions. This means that "3 + 4*5" will evaluate as "(3 + 4)*5" or 35. Most BASICs would see that as "3 + (4*5)" or 23. Similarly, no parentheses will be allowed.
- No provision for loading or saving programs. It would be easy to add this, and we might do so later. However, I see little point in doing so as long as the interpreter is running under Atari BASIC.
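To see what the lack of precedence means in practice, here is a two-line check you can run in Atari BASIC itself (not in BAIT). The parentheses in line 20 simply imitate, by hand, the strict left-to-right order that BAIT will use.

```basic
10 PRINT 3+4*5 : REM ATARI BASIC MULTIPLIES FIRST AND PRINTS 23
20 PRINT (3+4)*5 : REM STRICT LEFT-TO-RIGHT ORDER GIVES 35
```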
Whew! Feel restricted? Well, if you are adventuresome, you can try adding to and modifying the interpreter. It is a good exercise in logic, and you might even get good enough at it to give us a scare.
And one more thing before we get started with the heavy stuff. What do we call this thing? I haven't come up with anything better than BAIT, which is my acronym for BASIC (Almost) InTerpreter. (And which is also meant to imply that it is bait: I am fishing for innovation and interest from you, my gentle readers.)
Remember: only the first character of each statement/command name is significant, so what I am really presenting here is a list of which letters of the alphabet we are going to use. The table below lists the first letter, the mnemonic I am using, the syntax of the statement, and (in parentheses) the Atari BASIC equivalent, if indeed that BAIT statement is not the same.
|A||Accept <variable> (INPUT)|
|B||Begin (RUN)|
|C||Call <line-number> (GOSUB)|
|D||Display (LIST)|
|E||End|
|F||Fetch <address>, <variable> (pseudo-PEEK)|
|G||Goto <line-number>|
|I||If <expression>, <statement>|
|L||Let <variable> = <expression>|
|N||New|
|P||Print <string-literal>|
|P||Print <variable>|
|P||Print|
|R||Return|
|S||Store <address>, <expression> (POKE)|
A few of the statements need explanation, which is given below. Also, note that line-numbers and addresses, as used in the above syntax, may always be general expressions.
"Accept" allows only a single variable per use, unlike "INPUT" which allows several variables separated by commas.
"Fetch" and "Store" are complementary statements, both with the form of Atari BASIC's "POKE." The only difference is that "Fetch" obviously needs a variable (instead of an expression) to place the fetched (PEEKed) byte into.
"If" does not use a "THEN" keyword. Instead, any BAIT statement may follow the comma.
"Let" is a required keyword in BAIT. Actually, you may have already presumed this, since otherwise there is no way to distinguish a statement letter from a variable letter in such an assignment statement.
"Print" allows only one item to be printed per statement. Not shown in the above syntax, but allowed by BAIT, are the trailing semicolons or trailing commas, which have the same meaning as under Atari BASIC.
A discussion of what constitutes a valid expression, as well as several other more esoteric points, will have to wait for following month(s).
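For comparison, here is roughly what the BAIT statements "Store 710, 66" and "Fetch 710, A" correspond to in Atari BASIC. The address and value are my own choices for illustration and nothing BAIT requires: location 710 holds the background color of the normal text screen, so you can see the POKE take effect.

```basic
10 REM BAIT: STORE 710, 66
20 POKE 710,66
30 REM BAIT: FETCH 710, A
40 A=PEEK(710)
50 PRINT A
```

Line 50 prints 66, the byte that line 20 stored. Note that both BAIT statements put the address first, which is what "the form of POKE" means above.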
Since the code for BAIT will be presented in pieces over the course of several months, we must start with a coherent scheme. Also, since we will not reprint this month's code next month (for example), the listings must merge properly and neatly.
To this end, I have designated several line number ranges for specific purposes, as listed below.
|1000-1999||Initialization of variables used as constants; dimension of strings and arrays; etc.|
|2000-2999||The "ready" prompt. Get a line of program/command. Parse line for line number.|
|3000-3999||Program editing. Delete and insert lines.|
|4000-4999||Control execution of running program. Execute next line, execute command line, etc.|
|5000-5999||Major subroutine which evaluates arbitrary arithmetic expressions by executing them.|
|8000-9999||Various miscellaneous subroutines, used by one or more statements.|
|10000 up||Execution of the actual statements and commands of BAIT. Line numbers of execution routine for each statement are defined in initialization segment, above.|
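Defining the line numbers of the execution routines in the initialization segment works because Atari BASIC accepts a numeric variable (indeed, any numeric expression) as the target of GOTO and GOSUB. A minimal sketch of the idea, using made-up line numbers rather than BAIT's own:

```basic
10 DODISPLAY=100
20 GOSUB DODISPLAY
30 PRINT "BACK IN CONTROL CODE"
40 END
100 PRINT "DISPLAY ROUTINE"
110 RETURN
```

If a routine later has to move, only the assignment in line 10 needs to change.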
Sidelight: What are the major differences between this scheme and that actually used by the authors of Atari BASIC? (1) There is no provision for generalized I/O routines. (2) Atari BASIC checks the syntax of each line as it is entered and tokenizes it into internal form right then and there. BAIT simply stores exactly what you type in. (3) BAIT is missing many, many of BASIC's capabilities, as noted above.
This Month's Listing
This program is my offering for this month. It consists primarily of the program editor, including the initialization needed thereby.
One note about some temporary code: In the finished BAIT, lines 4000 through 4999 will control which statement/command will be executed next. In the case of a command (direct statement, in Atari parlance), these lines will pass control back to the ready prompt when the particular command executor returns. For program editing, we really only need one command, "Display" (LIST), so we have provided a very simple execute control which assumes that all direct statements are a request for "Display."
And now for some commentary on the code. Each section of comment is preceded by the line number (or range of numbers) that it refers to.
1010. I chose a practical number here. The larger MAXLINE is, the slower the line deletion process, and the larger the memory you will need. But feel free to change it.
1020. BUFFER$ is used to hold the program you type in and can be almost any size, but be careful: I have not put any provisions in the current BAIT code for detecting when you run out of space.
1030. This is a departure from Atari BASIC (and an effective, though memory-consuming one). Rather than scanning through the program space (BUFFER$) for a line, we "know" where it is via a table kept in LINES.
2360. Since I can't suppress the question mark which the INPUT on line 2300 produces, it is possible that using the Atari cursor keys will sometimes cause the "?" prompt to appear at the beginning of an input line. This gets rid of it by moving the right hand part of the string to the left. (It really works! Try it. And it's also used in line 2720.)
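If you would like to try the string-shifting trick by itself, here is a small standalone sketch. The name A$ and the sample text are mine, chosen only for the demonstration.

```basic
10 DIM A$(20)
20 A$="?10 PRINT"
30 IF A$(1,1)="?" THEN A$=A$(2)
40 PRINT A$
```

Line 30 copies everything from the second character onward back over the start of the same string, so line 40 prints 10 PRINT with the question mark gone.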
2520 and 2630. Remember, a completed FOR/NEXT loop exits with the loop variable already changed to the first failing value (thus LL + 1 in this example).
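The same scanning idea can be tried in isolation. This is only a sketch with names of my own (T$ and P), not a piece of BAIT: the NEXT inside the IF keeps the loop going only while the current character is a digit.

```basic
10 DIM T$(20)
20 T$="120 PRINT"
30 FOR P=1 TO LEN(T$)
40 IF T$(P,P)>="0" AND T$(P,P)<="9" THEN NEXT P
50 PRINT P
```

This prints 4, the position of the first non-digit. Change line 20 to T$="120" and it still prints 4, because the loop now runs to completion and leaves P one past the end of the string.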
2710. If we don't do this, and if LP is greater than LL (i.e., if there is nothing following the line number), then the reference to LINE$(LP) in line 2720 gives us a string length error.
3020. Necessary, if we stripped off the line number.
3040. Shame on you. You typed in a line number with a decimal point, trying to fool me. Gotcha.
3060. The only error message in this month's code.
3110. If the line doesn't yet exist, we can't delete it.
3120, 3130. The number stored in the "LINES" table is the length of the line as stored in "BUFFER$" added to 1000 times its starting position in "BUFFER$". We could have used two arrays (one for starting position and one for length) to make it neater, but it would have used a lot more memory.
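The packing arithmetic is easy to check on its own. In this sketch the variable names and the sample values (a line starting at position 37 and 12 characters long) are mine, for illustration only.

```basic
10 S=37:L=12
20 V=S*1000+L
30 PRINT V
40 S2=INT(V/1000)
50 L2=V-1000*S2
60 PRINT S2;" ";L2
```

Line 30 prints 37012, and line 60 recovers the two parts, 37 and 12. The scheme depends on a line never reaching 1000 characters, which the DIM of LINE$ guarantees.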
3140. This line might not work, thanks to a bug in Atari BASIC. Perhaps next month we will have a fix to work around the bug. In the meantime, small programs in BAIT will always work. (Same as the my-system-went-away-when-I-deleted-a-line problem in Atari BASIC.)
3160-3180. This is tricky. After you remove a line via 3140, the starting position of all lines above it in the buffer must be adjusted downward by the size of the line deleted. Can you follow line 3170? Remember, "START" and "LENGTH" refer to the former start and length of the deleted line.
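A tiny worked example may help with the adjustment. Assume a three-entry table T (standing in for LINES) describing lines stored back to back: one at position 1 of length 5, one at position 6 of length 4, and one at position 10 of length 3. The names T, S, and L and all the values are made up for this sketch. We delete the middle entry.

```basic
10 DIM T(3)
20 T(1)=1005:T(2)=6004:T(3)=10003
30 REM DELETE ENTRY 2: IT STARTED AT 6 AND WAS 4 LONG
40 S=6:L=4:T(2)=0
50 FOR I=1 TO 3
60 IF T(I)>=S*1000 THEN T(I)=T(I)-L*1000
70 NEXT I
80 PRINT T(1);" ";T(2);" ";T(3)
```

This prints 1005 0 6003. The first entry sits below the deleted text and is untouched, while the third slides down from position 10 to position 6 and keeps its length of 3, since subtracting a multiple of 1000 changes only the starting position.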
3210. In case we typed in just a line number.
3220-3240. Notice that each new line overlays the "*" which we tack onto the end of the buffer. We then have to put the "*" back on the end. This ensures that line 3140 will always work properly, even when we delete the last line in the buffer.
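Here is the same append-and-restore sequence in miniature, starting from an empty buffer. B$ and N$ are stand-in names of mine for BUFFER$ and LINE$.

```basic
10 DIM B$(100),N$(20)
20 B$="*"
30 N$="PRINT X"
40 S=LEN(B$)
50 B$(S)=N$
60 B$(LEN(B$)+1)="*"
70 PRINT B$;" ";S
```

This prints PRINT X* 1. The new text landed on top of the old "*" at position 1 (the value saved in S), and a fresh "*" now marks the end of the buffer.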
3250. See the comments about lines 3120 and 3130.
3310. If it wasn't a direct line, assume it was added to the program and go after another line.
10100-10150. We check all possible line numbers to see if they need to be listed. Note the similarity between this code and the code needed to delete a line (lines 3110 through 3130): in both cases we need the starting position and length of the line.
10190. Note how each statement will simply RETURN to the execute control code.
Still with me? Go try it. Type it in very carefully, backing yourself up every 20 lines or so. If it doesn't work, go back and examine what you typed in, because I guarantee that it worked just seconds before I made this listing for COMPUTE!.
Next month, we will try our hand at adding Execute Expression (the most complicated part of what is left) and Print (so we can verify that expressions are executing).
1000 REM ..INITIALIZATION..
1001 REM ..................
1010 MAXLINE = 99
1020 DIM BUFFER$(5000), LINE$(128)
1030 DIM LINES(MAXLINE)
1040 FOR LP = 0 TO MAXLINE : LINES(LP) = 0 : NEXT LP
1050 BUFFER$ = "*"
1500 REM LINE NUMBERS OF EXECUTION ROUTINES
1510 PROMPT = 2100 : INNEXT = 2300
1550 DODISPLAY = 10100
2000 REM ..INTERACTION..
2001 REM ...............
2100 PRINT "READY"
2300 INPUT LINE$
2350 IF LEN(LINE$) = 0 THEN GOTO INNEXT
2360 IF LINE$(1, 1) = "?" THEN LINE$ = LINE$(2) : GOTO 2350
2370 LL = LEN(LINE$)
2500 REM CHECK FOR LINE NUMBER
2510 FOR LP = 1 TO LL
2520 IF LINE$(LP, LP) <= "9" AND LINE$(LP, LP) >= "0" THEN NEXT LP
2550 REM LP HAS POSITION OF FIRST NON-NUMERIC CHARACTER
2560 CURLINE = 0
2570 IF LP > 1 THEN CURLINE = VAL(LINE$(1, LP - 1))
2600 REM NOW SKIP LEADING SPACES, IF ANY
2610 IF LP > LL THEN 2700
2620 FOR LP = LP TO LL
2630 IF LINE$(LP, LP) = " " THEN NEXT LP
2699 REM
2700 REM REMOVE LINE NUMBER AND LEADING SPACES
2710 IF LP > LL THEN LINE$ = "" : GOTO 3000
2720 LINE$ = LINE$(LP)
3000 REM ..EDITING..
3001 REM ...........
3010 REM IF HERE, LINE NUMBER IS IN CURLINE
3020 LL = LEN(LINE$) : REM AND LL IS LENGTH THEREOF
3030 IF CURLINE = 0 AND LL = 0 THEN GOTO PROMPT
3040 IF CURLINE <> INT(CURLINE) THEN 3060
3050 IF CURLINE <= MAXLINE THEN 3100
3060 PRINT "***BAD LINE NUMBER***"
3070 GOTO PROMPT
3100 REM FIRST, DELETE CURLINE IF IT ALREADY EXISTS
3110 LENGTH = LINES(CURLINE) : IF LENGTH = 0 THEN 3200
3120 START = INT(LENGTH/1000)
3130 LENGTH = LENGTH - 1000 * START
3140 BUFFER$(START) = BUFFER$(START + LENGTH)
3150 LINES(CURLINE) = 0
3160 FOR LP = 1 TO MAXLINE : TEMP = LINES(LP)
3170 IF TEMP >= START * 1000 THEN LINES(LP) = TEMP - LENGTH * 1000
3180 NEXT LP
3200 REM NOW ADD LINE TO END OF BUFFER
3210 IF LL = 0 THEN GOTO INNEXT
3220 START = LEN(BUFFER$)
3230 BUFFER$(START) = LINE$
3240 BUFFER$(LEN(BUFFER$) + 1) = "*"
3250 LINES(CURLINE) = START * 1000 + LL
3300 REM NOW LINE IS IN BUFFER...WHAT DO WE DO
3310 IF CURLINE THEN GOTO INNEXT
3320 REM **** TEMPORARY : JUST FALL THROUGH TO 4000 ****
4000 REM ..EXECUTE CONTROL..
4001 REM ...................
4010 GOSUB DODISPLAY
4020 BUFFER$(INT(LINES(0)/1000)) = "*"
4030 LINES(0) = 0
4040 GOTO PROMPT
4050 REM **** 4010 THRU 4050 ARE TEMPORARY ****
5000 REM ..EXECUTE EXPRESSION..
5001 REM ......................
8000 REM ..MISCELLANEOUS SUBROUTINES..
8001 REM .............................
10000 REM ..EXECUTE THE VARIOUS STATEMENTS..
10001 REM ..................................
10100 REM == EXECUTE DISPLAY ==
10110 FOR LP = 1 TO MAXLINE
10120 LENGTH = LINES(LP) : IF LENGTH = 0 THEN 10150
10130 START = INT(LENGTH/1000) : LENGTH = LENGTH - 1000 * START
10140 PRINT LP;" ";BUFFER$(START, START + LENGTH - 1)
10150 NEXT LP
10190 RETURN