Classic Computer Magazine Archive ANTIC VOL. 2, NO. 8 / NOVEMBER 1983




It's hard to believe, but a whole year has gone by since I started doing this column.  In the upcoming year I will try to cover many of the topics you have requested and to pass along some of the routines I have developed to make cassette I/O easier.  Thanks for your support over the past year.  Now let's get down to the business at hand: cassette file structures.

The following is a brief overview of cassette file structures as they apply to the ATARI 400/800/1200 home computers.  It is intended to be a general introduction to the subject, so that the terms I'll be using in upcoming articles won't catch you by surprise.

Your programs and data files are stored on cassette tape in a structure (format) that is determined by a small portion of your computer's Operating System (OS).  This section of the OS is called the "cassette handler."  The basic structure used to store your program is called a "file."  Most programs are stored in a single file; but, for various reasons, some programs are stored in two or more files.  This latter type of program is referred to as a "multiple" or "multi-file" program.

A file is subdivided into "records."  The first record of a file is called the "header" record.  The file's last record is called an "end-of-file" (EOF) record.  All of the records between them are called "data" records.  Although one record follows another sequentially, records are not actually contiguous.  They are separated by blank spaces called "inter-record gaps" (IRGs).

The fundamental building blocks of a file are the data records.  All normal records, except EOF records, are data records.  The 128 bytes of a data record usually contain a small segment of your program, but even a small program requires the use of many data records.  Under normal conditions, the last record in a file is not a data record.  This position is occupied by an EOF record.
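To get a feel for how quickly the records add up, here is a small sketch of the arithmetic. It assumes that the data is packed 128 bytes to a record and that a single EOF record closes the file; the function name is made up for this illustration.

```python
RECORD_DATA_BYTES = 128  # data bytes carried by each record


def records_needed(payload_bytes):
    """Return (data_records, total_records) for a payload of the given size.

    Assumption: one EOF record follows the last data record.
    """
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    # Round up: a partly filled record still occupies a whole record.
    data_records = (payload_bytes + RECORD_DATA_BYTES - 1) // RECORD_DATA_BYTES
    return data_records, data_records + 1


# A hypothetical 2000-byte program.
print(records_needed(2000))
```

Even this modest program needs more than a dozen data records, which is why the gaps between records matter so much for loading time.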

The first record in a file usually contains special instructions for the OS.  Although this record is referred to as a "header," it differs from a data record primarily in that it is the first record in the file.

In machine language files, the significance of this record is found in the first six data bytes (see Figure 1).  The first byte is not used by the computer, but the next five bytes tell the OS's "boot loader" subroutine everything it needs to know to load a machine language file.  The second byte tells the OS how many records are in the file, including the header record.  The third and fourth bytes specify the low and high memory addresses where the computer is to begin loading the file.  The fifth and sixth bytes tell the computer the low and high addresses where initialization of the program should start (the entry point of the program).

Figure 1
Cassette Boot File Header
(diagram of the six header bytes, from the 1st byte through the 6th byte, HI)
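The six-byte layout described above can be sketched in a few lines of Python. This is only an illustration of the byte order given in the text, not the boot loader's own code; the class name, function name, and sample byte values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BootHeader:
    unused: int        # 1st byte: not used by the computer
    record_count: int  # 2nd byte: records in the file, header included
    load_address: int  # 3rd byte (low) and 4th byte (high)
    init_address: int  # 5th byte (low) and 6th byte (high)


def parse_boot_header(data):
    """Decode the first six data bytes of a machine language header record."""
    if len(data) < 6:
        raise ValueError("need at least six data bytes")
    return BootHeader(
        unused=data[0],
        record_count=data[1],
        load_address=data[2] | (data[3] << 8),  # low byte first, then high
        init_address=data[4] | (data[5] << 8),
    )


# Made-up sample values, chosen only to show the low/high byte order.
header = parse_boot_header(bytes([0x00, 0x0A, 0x00, 0x07, 0x06, 0x07]))
print(header.record_count, hex(header.load_address), hex(header.init_address))
```

Note that each address is stored low byte first, so the bytes $00, $07 combine to the address $0700.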

BASIC files are handled by a different portion of the OS, and as a result the BASIC header record differs from the equivalent record for a machine language file. In fact, in a BASIC file, the header can be several records long. The specific length depends on a number of things.

The first thing to consider is whether the file is tokenized. A file stored by the SAVE or CSAVE command is tokenized, and has a long, complicated header that consists of two blocks of information. The first block contains seven of the nine Page Zero pointers that BASIC uses to maintain the token file in memory. The second block consists of the program's variable name table (VNT) and variable value tables (VVTs). The header, as defined here, uses as many records as necessary to hold these two blocks. The header for a non-tokenized file, such as one stored by the LIST command, is short and uncomplicated because these files are treated like keyboard input.

An EOF record differs from other records in two ways. First, it includes a special code in the third byte that tells the computer that "this is an EOF record." Second, all of its data bytes are set to zero. BASIC does not normally make use of the EOF record, because the BASIC header tells the computer how large the file is. If, for example, you try to CLOAD a file that is too large for the amount of RAM you have left, you will almost immediately get an error 19, which is a "LOAD program too long" error.

Every normal record consists of 132 eight-bit bytes (see Figure 2). The first two bytes are used by the computer for speed control. Byte three is a special flag used to identify the type of record. The next 128 bytes are the "data bytes" that contain your program or other special information. The record's last byte is a "checksum byte" that is used for read verification. (The noise you hear on your TV during a read or write operation is the sound of the data in a record being transferred.)
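The 132-byte layout just described can be pictured as four slices of one record. The sketch below assumes the record has already been captured as a Python bytes object; the function name and dictionary keys are invented for the example.

```python
RECORD_LENGTH = 132  # 2 speed bytes + 1 control byte + 128 data bytes + 1 checksum


def split_record(record):
    """Split a raw 132-byte record into its four fields."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("a normal record is exactly 132 bytes long")
    return {
        "speed_bytes": record[0:2],  # bytes 1-2: speed control
        "control": record[2],        # byte 3: record type flag
        "data": record[3:131],       # bytes 4-131: the 128 data bytes
        "checksum": record[131],     # byte 132: checksum
    }


# A made-up record: two speed bytes, a control byte, empty data, dummy checksum.
sample = bytes([0x55, 0x55, 0xFC]) + bytes(128) + bytes([0x00])
fields = split_record(sample)
print(len(fields["data"]), hex(fields["control"]))
```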

As I mentioned previously, the first two bytes of a record are used by your computer for speed control when it is loading a file. The computer measures the time required to read these two bytes and then makes a number of small adjustments to some of its OS variables. This process, called "baud rate synchronization," tells the computer when and how fast information will be coming from your tape recorder.
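The idea behind the measurement can be shown with simple arithmetic. This is not the OS routine, only a sketch of the principle; it assumes each byte goes onto the tape with one start bit and one stop bit (ten bit times per byte), and the function name is hypothetical.

```python
BIT_TIMES_PER_BYTE = 10  # assumption: 1 start bit + 8 data bits + 1 stop bit


def estimate_baud(elapsed_seconds, byte_count=2):
    """Estimate the baud rate from the time taken to read the speed bytes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return byte_count * BIT_TIMES_PER_BYTE / elapsed_seconds


# If the two speed bytes arrive in 20/820 of a second, the tape is
# running at roughly 820 baud.
print(round(estimate_baud(20 / 820)))
```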

Figure 2
Cassette Data Record
(diagram of the record layout; each of the two speed-measurement bytes holds the bit pattern 0 1 0 1 0 1 0 1)

It is possible, however, to change the algorithm that the OS uses to control the baud rate. I have found that I can get reliable cassette operations at baud rates as high as 820 with the 410 recorder. And if you use a cassette interface adapter that allows you to use other types of recorders, you can operate at baud rates as high as 1200 without serious degradation.

If you are interested in using higher baud rates, you can either read my article on the subject next year (complete with a program listing) or you can purchase a program from VERVAN called V-COS.  (V-COS is a cassette operations utility written by Del Wong.  The program is available from IJG, Inc.  Their phone number is (714) 946-5805.)

The third byte of a record is a control code byte that usually has one of three values. A value of $FE (254) indicates that the record is an EOF record. A value of $FC (252) indicates that all 128 data bytes have been used. Every data record, including the header, carries this control code, with the possible exception of the last data record in the file. The last data record can contain this code, but most of the time it does not, because it is unlikely that a program will be an exact multiple of 128 bytes in length. Usually the last data record will have a control code of $FA (250). This tells the computer that fewer than the full 128 bytes in the record actually contain data. The record also notes the exact number of bytes that do contain information.
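Here is a sketch of how a program might act on the three control codes. The text above does not say where a partial record keeps its byte count; the code assumes it sits in the last of the 128 data bytes, which is my reading of the OS documentation rather than something stated here. The function and constant names are hypothetical.

```python
CONTROL_EOF = 0xFE      # end-of-file record
CONTROL_FULL = 0xFC     # all 128 data bytes are in use
CONTROL_PARTIAL = 0xFA  # fewer than 128 data bytes are in use


def payload_of(control, data):
    """Return only the data bytes that actually carry information."""
    if len(data) != 128:
        raise ValueError("expected the 128 data bytes of one record")
    if control == CONTROL_FULL:
        return data
    if control == CONTROL_PARTIAL:
        count = data[127]  # assumption: byte count stored in the last data byte
        if count > 127:
            raise ValueError("partial record claims too many bytes")
        return data[:count]
    if control == CONTROL_EOF:
        return b""  # an EOF record carries no program data
    raise ValueError("unrecognized control code")


# A made-up partial record holding three meaningful bytes.
partial = bytes([1, 2, 3]) + bytes(124) + bytes([3])
print(payload_of(CONTROL_PARTIAL, partial))
```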

The last byte in a record is the checksum byte. The computer uses the value of this byte to validate, or verify, that the preceding 128 data bytes were read properly. When your computer writes a record onto a tape, a checksum is calculated for the 128 bytes in that record. The computer adds together the values of the bytes in the record. If the sum exceeds 255, it simply starts counting at zero again.

The one peculiarity of checksum computation in the ATARI is that it uses "end-around carry." This means that "one" is added to the sum every time the total passes 255 and wraps around. (See Figure 3 for the actual equation used.) This checksum value is stored in the checksum byte of a record as it is being written.

When the computer reads a record from tape, it computes a new checksum for the data bytes in the record and compares the new value to the value that was previously stored in the checksum byte. If the two values match, the computer proceeds to read the next data record. If they do not agree, the computer stops the loading process and gives you an error 143.
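The add-and-wrap rule with end-around carry is easy to try out. The sketch below is an illustration, not the OS code: the checksum function works on whatever bytes it is given, and the verification helper assumes the sum runs over every byte that precedes the checksum byte. Both function names are made up.

```python
def end_around_checksum(values):
    """Sum bytes into one byte, adding the carry back in after each wrap."""
    total = 0
    for value in values:
        total += value
        if total > 255:
            total = (total & 0xFF) + 1  # wrap past 255 and add the carry
    return total


def record_is_valid(record):
    """Compare a fresh checksum against the record's stored checksum byte.

    Assumption: the checksum covers every byte before the checksum byte.
    """
    return end_around_checksum(record[:-1]) == record[-1]


# Build a made-up record, then damage one byte to simulate a bad read.
body = bytes([0x55, 0x55, 0xFC]) + bytes(range(128))
good = body + bytes([end_around_checksum(body)])
bad = good[:10] + bytes([good[10] ^ 0x01]) + good[11:]
print(record_is_valid(good), record_is_valid(bad))
```

A mismatch like the second case is the situation in which the computer stops loading and reports error 143.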

The "inter-record gap" (IRG) between records is actually the sum of two gaps (see Figure 4). Every record is preceded by a "pre-record write tone" (PRWT) and is followed by a "post-record gap" (PRG). When the records are linked into a file, the resulting IRG equals the sum of the PRWT and the PRG.

$$
\text{Checksum} = [\text{speed control byte \#1}] + [\text{speed control byte \#2}] + \sum_{N=1}^{128} \left[ (\text{Data byte})_N + (\text{Carry flag}) \right]
$$

Figure 3
The Checksum Equation

The basic difference between "continuous" files and "stop-start" files is the length of this IRG.  Essentially, a continuous file has a very short IRG, while a stop-start file has a longer IRG.  The longer gap used with stop-start files provides enough time to stop the cassette motor, restart it, and get the motor speed up to normal before beginning the next record.  These electro-mechanical operations require far more time than the short IRGs in a continuous file allow.  In fact, under certain conditions the computer cannot accept data fast enough to accommodate continuous loading.  This is why there are two kinds of files.  For example, if you load data into a program using BASIC, instead of machine language, you will have problems with a continuous file.

The PRWT is not a fixed value any more than the baud rate is.  It can be altered by using the appropriate program.  This is of particular interest in the case of the 20-second PRWT that is inserted at the beginning of every cassette file.  I have experimented with this leader tone, and have been able to shorten it to about 10 seconds with no noticeable degradation in performance.  And the V-COS program I mentioned earlier features a built-in function that enables you to change this leader to any length you want.

That's all for this month.  In the coming months, I will go back over each of these topics in greater detail.  I will show you how the OS uses this information, where the proper control bytes are located in memory, and how you can control some of the more important parameters by using short machine language routines.