Random Access DATA Statements
Robert Jacques Beck
By adding this short routine to your programs, you can gain random access to any piece of information stored in DATA statements, a powerful and useful technique. It works on all Apple II-series computers with either DOS 3.3 or ProDOS.
Any byte in Random Access Memory (RAM) can be immediately accessed during a read or write by specifying its address. Random access data files offer the same type of quick access: You locate records by specifying record numbers. Records may be retrieved in any order.
Serial, or sequential, access is based on the principle of starting with the first record and counting up to the one you want. Sequential access is usually slower than random access. While it takes approximately the same amount of time to read any record in an Applesoft random access file, the time required to read an identical record in a sequential file increases as the record is placed towards the file's end. This is because DOS must traverse each record in the file to count end-of-record marks until it locates the record it is searching for.
DATA statements in BASIC provide an in-memory sequential access file. You begin by reading the first DATA statement, and you move sequentially through the data list with each successive READ.
Until I figured out the technique described in this article, I'd always been annoyed at the rigidity of DATA statements. They're fine if you want to access your data the same way your DATA statements are organized, but they are difficult to use any other way within the confines of BASIC. Some BASICs use the RESTORE command to reset a pointer to the beginning of the data, but that's not always where you want to go. A few BASICs, such as Atari BASIC, let you RESTORE to a specific line number or even a variable, providing much more flexibility. But many BASICs (including Applesoft) lack this feature.
You can get flexibility by reading all your DATA statements into arrays and using an index to grab array elements. But storing variables as data and as arrays can be costly in terms of memory. Another approach is to read through the data each time until you get to the element you want, using code such as this:
10 RESTORE
20 FOR I = 1 TO N
30 READ INFO
40 NEXT I
After these lines have been executed, the variable INFO is equal to the Nth data element. The major disadvantage of this method is its slowness.
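The cost of this read-until-you-get-there approach can be modelled with a short Python sketch. The `DataList` class and `read_nth` function are hypothetical stand-ins for RESTORE, READ, and the loop in lines 10 to 40; they are not part of Applesoft, and the sketch assumes N is at least 1.

```python
class DataList:
    """Toy model of a BASIC data list with a single read pointer."""

    def __init__(self, items):
        self.items = list(items)
        self.pointer = 0   # index of the next item READ will return
        self.reads = 0     # how many READs have been performed

    def restore(self):
        """Like RESTORE: go back to the first item."""
        self.pointer = 0

    def read(self):
        """Like READ: return the next item and advance the pointer."""
        item = self.items[self.pointer]
        self.pointer += 1
        self.reads += 1
        return item


def read_nth(data, n):
    """Mirror of lines 10-40: RESTORE, then READ n times (n >= 1)."""
    data.restore()
    for _ in range(n):
        info = data.read()
    return info


data = DataList(["ENGLISH", "SPANISH", "FRENCH", "INGLES"])
print(read_nth(data, 3), data.reads)
```

Fetching the third item costs three READs, and the count grows in step with N, which is where the slowness comes from.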
Fortunately, there are a couple of zero-page pointers that let us manipulate the READ operation. The two short programs included here illustrate how to pull variables directly out of DATA statements as if they were in random access files.
In the Apple, decimal locations 123 and 124 (hexadecimal $7B and $7C) store the line number of the last DATA statement read. Locations 125 and 126 point to the data's absolute memory location. The pointers are stored in the usual Apple fashion; that is, the first memory location is the low byte (lower two hexadecimal digits) and the second memory location is the high byte (upper two hexadecimal digits). To translate the information in the pointers into a line number that makes some sense, use this formula:
LN = PEEK(123) + PEEK (124) * 256
It may seem strange that the two hexadecimal digits of the high byte are both multiplied by 256 when you convert to decimal. After all, one of them is the 256s digit while the other is the 4096s digit (just as the third and fourth digits in a decimal number represent hundreds and thousands). But PEEK returns the whole byte as a decimal value from 0 to 255, in which the byte's upper digit has already been weighted by 16. Since 4096 = 16 * 256, multiplying that value by 256 gives each digit its correct weight, so you don't have to convert the digits separately.
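As a worked example of that arithmetic, here is a small Python sketch. The function names are made up for illustration. The value 2096 ($0830) is one of the locations used in Program 2; its byte split into $30 and $08 is worked out here.

```python
def word_from_bytes(low, high):
    """Same arithmetic as LN = PEEK(123) + PEEK(124) * 256."""
    return low + high * 256


def word_from_hex_digits(d3, d2, d1, d0):
    """The long way: weight each hex digit by 4096, 256, 16, and 1."""
    return d3 * 4096 + d2 * 256 + d1 * 16 + d0


low, high = 0x30, 0x08            # low byte $30 = 48, high byte $08 = 8
address = word_from_bytes(low, high)
print(address)                                # 2096
print(word_from_hex_digits(0, 8, 3, 0))       # 2096
```

Both routes give the same answer because each byte value already carries its upper digit weighted by 16.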
Back to the pointers. Unfortunately, you can't use the line number pointer to move around in the data. It's just a tag-along to the memory pointer. To move from one data location to another, the memory pointer (locations 125 and 126) is the one you'll need to adjust. There are a couple of ways to go about it.
Program 1 prints a table of the memory locations of every item in your DATA statements. Lines 60000 and 60010 print the table's heading. Line 60015 stops the program after the last DATA item has been read; line 60020 reads the DATA one variable at a time. Line 60030 calculates the pointer location just after a READ and line 60040 calculates the current line number. Line 60050 checks whether the current line number is the same as the one that was just read; if it isn't, the position index (I = a variable's position within a DATA statement) is reset to 1. Line 60060 prints the table, one row at a time. Just tack these lines onto your program, anywhere after the last DATA statement. If you use the line numbers from Program 1 (60000-60090), then RUN 60000 to get your table.
Program 2 is a whimsical little program that shows one way to use the information from Program 1. Line 50 reads some memory locations into the array ML. These memory locations were obtained from Table 1. Lines 70 to 100 read and print a list of three languages, in English the first time through. At the prompt in line 110, pick which language you want the list printed in next. Line 115 sets the variable LOC to the memory location of the appropriate DATA statement. Lines 120 and 130 break the memory location into high and low bytes, then lines 140 and 150 reset the pointer so the list will be read from the correct DATA statement.
No matter how many times you cycle through the program, you'll always get the list printed in the language you want, and you'll never get an END OF DATA message.
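The byte splitting and pointer reset done by lines 120 to 150 can be checked with a minimal Python sketch. The `memory` array and the `peek`, `poke`, and `set_data_pointer` helpers are illustrative stand-ins, not Apple routines; the three locations are the ones in line 10 of Program 2.

```python
memory = bytearray(65536)   # stand-in for the Apple's 64K address space


def poke(address, value):
    memory[address] = value  # bytearray rejects values outside 0..255


def peek(address):
    return memory[address]


def set_data_pointer(loc):
    hb = loc // 256          # line 120: HB = INT (LOC / 256)
    lb = loc - hb * 256      # line 130: LB = LOC - HB * 256
    poke(125, lb)            # line 140
    poke(126, hb)            # line 150


for loc in (2068, 2096, 2124):
    set_data_pointer(loc)
    print(peek(125), peek(126), peek(125) + peek(126) * 256)
```

For each location, recombining the two bytes with the PEEK formula gives back the value that was stored.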
Table 1 is what Program 1 prints when it is attached to Program 2. Because each location is calculated after a READ, the pointer value needed to reach a given variable is the location listed for the variable immediately preceding it.
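To see where such locations come from, here is a Python sketch that lays out the first three DATA lines of Program 2 in memory. It rests on several assumptions: the program starts at 2049 ($0801); each stored line is a 2-byte link, a 2-byte line number, a 1-byte DATA token ($83), the text with spaces removed, and a zero byte; and after a READ the pointer rests on the delimiter that follows the item just read. The function names are hypothetical.

```python
DATA_TOKEN = 0x83       # assumed Applesoft token for DATA
PROGRAM_START = 2049    # assumed start of the program ($0801)


def stored_line(line_number, items):
    """Assumed image of one DATA line as it sits in memory."""
    link = b"\x00\x00"  # placeholder; only its length matters here
    text = ",".join(items).encode("ascii")
    return (link + line_number.to_bytes(2, "little")
            + bytes([DATA_TOKEN]) + text + b"\x00")


def end_of_line_addresses(lines, start=PROGRAM_START):
    """Address of each line's closing zero byte, where the pointer
    would rest after the last item on that line has been read."""
    addresses = []
    address = start
    for line_number, items in lines:
        address += len(stored_line(line_number, items))
        addresses.append(address - 1)
    return addresses


lines = [
    (10, ["2068", "2096", "2124"]),
    (20, ["ENGLISH", "SPANISH", "FRENCH"]),
    (30, ["INGLES", "ESPANOL", "FRANCES"]),
]
print(end_of_line_addresses(lines))   # [2068, 2096, 2124]
```

Under those assumptions the sketch reproduces the three values in line 10 of Program 2: each is the location reached after reading the last item of the line before the list you want.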
An alternative method is to add to the pointer the difference (positive or negative) between its current value and the new value it must have in order to point to a variable. Try these changes in Program 2:
10 DATA ENGLISH, -32, SPANISH, 0, FRENCH, 38
20 DATA INGLES, -75, ESPANOL, -38, FRANCES, 0
30 DATA ANGLAIS, -114, ESPAGNOL, -82, FRANCAIS, -39
40 REM
50 REM
80 READ A$(I), ML(I)
115 LOC = ML(W) + PEEK (125) + PEEK (126) * 256
Line 80 now reads not only the variable, but also a number that is added to the pointer in line 115. The advantage here is that we're relying on the separation between variables, rather than their actual memory locations.
Insert the three DATA statements into the program anywhere you wish. As long as you don't change the relative position of any data, you can edit the program without affecting how the data is handled.
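Why relative offsets survive editing can be sketched in Python. The sketch assumes each stored DATA line is a 2-byte link, a 2-byte line number, a 1-byte DATA token, the text with spaces removed, and a zero byte. The function names and the 100-byte shift are made up for illustration.

```python
def stored_length(items):
    """Assumed bytes per DATA line: link, line number, token, text, zero."""
    return 2 + 2 + 1 + len(",".join(items)) + 1


def line_ends(lines, start):
    """Address of the closing zero byte of each line, given a start address."""
    ends, address = [], start
    for items in lines:
        address += stored_length(items)
        ends.append(address - 1)
    return ends


LINES = [
    ["ENGLISH", "-32", "SPANISH", "0", "FRENCH", "38"],
    ["INGLES", "-75", "ESPANOL", "-38", "FRANCES", "0"],
    ["ANGLAIS", "-114", "ESPAGNOL", "-82", "FRANCAIS", "-39"],
]

at_top = line_ends(LINES, 2049)        # DATA lines first in the program
moved = line_ends(LINES, 2049 + 100)   # 100 bytes of other lines added ahead

print(at_top)                                      # [2085, 2123, 2167]
print(moved)                                       # [2185, 2223, 2267]
print(at_top[1] - at_top[0], moved[1] - moved[0])  # 38 38
print(at_top[0] - at_top[2], moved[0] - moved[2])  # -82 -82
```

The absolute locations all shift by 100, but the distances between line ends stay at 38 and -82, matching the FRENCH entry in line 10 and the ESPAGNOL entry in line 30.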
Generated By Combining Program 1 and Program 2
Program 1
60000 PRINT "LINE #" SPC( 3) "POSITION" SPC( 3) "LOCATION" SPC( 3) "VARIABLE"
60010 FOR I = 1 TO 40 : PRINT "_";: NEXT : PRINT
60015 ONERR GOTO 60090
60020 READ A$
60030 LOC = PEEK (125) + PEEK (126) * 256
60040 NL = PEEK (123) + PEEK (124) * 256
60050 IF NL <> LN THEN I = 1 : LN = NL
60060 PRINT NL SPC( 10 - LEN ( STR$ (LN))) I SPC( 10 - LEN ( STR$ (I))) LOC SPC( 11 - LEN ( STR$ (LOC))) A$
60070 I = I + 1
60080 GOTO 60020
60090 END
Program 2: Random Access DATA - Demonstration
10 DATA 2068, 2096, 2124
20 DATA ENGLISH, SPANISH, FRENCH
30 DATA INGLES, ESPANOL, FRANCES
40 DATA ANGLAIS, ESPAGNOL, FRANCAIS
50 READ ML(1), ML(2), ML(3)
60 HOME
70 FOR I = 1 TO 3
80 READ A$(I)
90 PRINT I SPC( 3) A$(I): PRINT
100 NEXT
110 INPUT "WHICH ONE?";W
115 LOC = ML(W)
120 HB = INT (LOC / 256)
130 LB = LOC - HB * 256
140 POKE 125, LB
150 POKE 126, HB
160 GOTO 60