The Beginner's Page: Making Files Work

Richard Mansfield
Assistant Editor

If there's a topic that you would like to see covered in this column, send your suggestion to: The Beginner's Page, COMPUTE! Magazine, P.O. Box 5406, Greensboro, NC, 27403.

Last month we examined files, often the most difficult aspect of programming for the new computerist. Here we'll conclude this overview by looking at some specific details about file handling.

In general, a file is a list or a collection of information which has been saved on a tape or a disk. A list of all the people you send Christmas cards to could be "memorized" on tape/disk by the computer and would then be a file.

How is this list of names and addresses "memorized" and how, at Christmas, is it used to print the addresses on envelopes? A file by itself is incapable of doing anything: it is not a program, it's just a list. A separate program is needed to create, update, and make use of files.

"Mailing Address," "File Manager," and others of this type are called data base management programs. Their primary function is to build, add to, modify, or print something out from files. They manage a collection (a base) of data.

You can, of course, write your own custom data base manager. This would amount to writing a program (or a set of programs) which would let you manipulate the list on tape or disk.

Writing a large data base management program is not an easy task: it can involve sorting, searching, and other complex programming techniques. Nonetheless, handling a Christmas card list is something the novice can accomplish, and it's well worth learning. Files do, though, represent something of a challenge. Your computer's manual will contain information on the necessary punctuation and syntax for BASIC's file-handling commands. However, the following brief overview might be of help.

OPEN, PRINT#, INPUT#, And CLOSE

Where a program would be stored by the simple SAVE instruction, a file is stored by a combination of OPEN, PRINT#, and CLOSE. (On the Apple, PRINT is used in a special way instead of PRINT#. We'll get to that in a minute.) Likewise, a program is just LOADed, but a file is "loaded" into the computer with OPEN, INPUT#, and CLOSE.

Files are a bit more complicated, but the bonus is that you can do more with them: appending (adding to a file) and merging (making two files into one) are both easier, and so on.

The command OPEN is generally used to communicate with a disk or tape drive. It's like pulling open a file cabinet drawer – once a file is OPENed, you can then get at the records inside. Here's what you would do to OPEN a file named "inventory" on a disk drive attached to a Commodore, Atari, or Apple:

Commodore

10 OPEN 1,8,8,"0:INVENTORY,S,R"

The first number (1) means that this file will hereafter be called #1. When you pull something out of it, you would use INPUT#1 (you can have up to ten files OPEN at one time). The second number (8) means "disk drive" (a 1 in this position would mean: open a file on the cassette drive). The second eight is a "secondary address" which allows you to give additional instructions. With disk drives, just use eight.

The "0:" specifies drive zero and the "S" means sequential file. The Commodore disks can create other kinds of files (random, relative, and program files), but sequential is the simplest. Finally, the "R" means read. You will be using INPUT# to get things out of this file. (A "W" here would mean write and you would PRINT# to the file.) To make this "reading or writing" distinction for tape files, the secondary address is used: a one means write and a zero means read. (10 OPEN 1,1,0,"INVENTORY" would be the same as the example above, except that it opens a cassette file. No drive number is specified and the "S" is not necessary, since cassette files can only be sequential files.)

Atari

The equivalent of the disk example above is similar in Atari BASIC:

10 OPEN#1,4,0,"D1:INVENTORY"

Here, the second number (4) stands for read (use 8 for write), the next number (0) is not used by the disk drive (but is necessary to satisfy the syntax of OPEN), and the "D1" specifies the first drive (there can be up to four drives attached, D1 through D4; D: means D1:).

Apple

To OPEN an Apple disk file, you first define a special character, D$, which holds the "control-D" character (you cannot see it). You would type:

10 D$ = "(hold down both CTRL and D keys)" (or you could use 10 D$ = CHR$(4) instead)
15 PRINT D$;"OPEN INVENTORY"

In our simple example, it isn't necessary to include drive and slot numbers with the OPEN command. Line 15, however, could indicate drive one, slot six if we add "OPEN INVENTORY, S6, D1" to the quoted information. The Apple uses the format in line 15 for its disk commands; first control-D is printed and then the desired action is described within the quotes.

Putting Something In And Taking Something Out

On the Commodore and Atari, you can put information into an OPENed file by using PRINT#. For Commodore files (where PRINT# must be typed with no space before the #), you could put the word "PENCILS" into the file by:

20 PRINT#1,"PENCILS"

and Atari would be:

20 PRINT #1; "PENCILS"

The Apple uses two lines:

20 PRINT D$; "WRITE INVENTORY"
21 PRINT "PENCILS"

(After line 20, the following PRINT command will be considered a print to the file, not to the screen).

Going the other way, you get something out of an OPENed file by using INPUT# in combination with a string variable to "hold" whatever comes from the file (items come back to the computer in the order they were written). To get the word "PENCILS" back from a Commodore file (again, with no space before the #):

20 INPUT#1,A$

(Then later you could print A$ to see "PENCILS") and Atari is:

20 INPUT #1, A$

(You must have previously DIMed A$).
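As a rough sketch of the whole Atari sequence, the short listing below puts the DIM, OPEN, INPUT #, and CLOSE steps together. It assumes a file named INVENTORY has already been written to drive one; the length of 20 characters given to A$ is simply a guess at a comfortable size, not a requirement.

```basic
10 DIM A$(20) : REM RESERVE ROOM FOR THE STRING (20 IS AN ASSUMED SIZE)
20 OPEN #1,4,0,"D1:INVENTORY" : REM 4 MEANS READ
30 INPUT #1,A$ : REM FETCH THE FIRST ITEM FROM THE FILE
40 CLOSE #1
50 PRINT A$ : REM SHOW IT ON THE SCREEN
```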

Again, the Apple uses two lines:

20 PRINT D$; "READ INVENTORY"
21 INPUT A$
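Put together, an Apple reading session might look something like the sketch below. It assumes INVENTORY already holds at least one item, and it uses CHR$(4) for the control-D character. The CLOSE line follows the same print-control-D pattern as the other disk commands.

```basic
10 D$ = CHR$(4) : REM CONTROL-D
20 PRINT D$;"OPEN INVENTORY"
30 PRINT D$;"READ INVENTORY"
40 INPUT A$ : REM THIS NOW COMES FROM THE FILE, NOT THE KEYBOARD
50 PRINT D$;"CLOSE INVENTORY"
60 PRINT A$ : REM AFTER THE CLOSE, THIS GOES TO THE SCREEN
```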

After you are finished reading from or writing to a file which had been OPENed as file #1, you would CLOSE 1 (PET/CBM) or CLOSE #1 (Atari) or PRINT D$; "CLOSE INVENTORY" (Apple). Once you've CLOSEd, you are free to use that file number (#1 in these examples) for some other file, with a different name. CLOSE is essential, however. Without it you could permanently lose part or all of a file, or even damage other files.
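To see OPEN, PRINT#, INPUT#, and CLOSE working together, here is a minimal Commodore disk sketch that writes one word and then reads it back. It assumes drive zero of device 8 and a disk that does not already hold a file called INVENTORY. Note that on the Commodore, PRINT# and INPUT# are typed with no space before the #.

```basic
10 OPEN 1,8,8,"0:INVENTORY,S,W" : REM OPEN FOR WRITING
20 PRINT#1,"PENCILS"
30 CLOSE 1 : REM FINISH THE FILE BEFORE REOPENING IT
40 OPEN 1,8,8,"0:INVENTORY,S,R" : REM SAME FILE NUMBER, NOW FOR READING
50 INPUT#1,A$
60 CLOSE 1
70 PRINT A$ : REM SHOULD SHOW PENCILS ON THE SCREEN
```

Because the first CLOSE frees file number 1, the same number can be reused in line 40.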

INPUT # And PRINT # Hints

INPUT # and PRINT # work much the way INPUT and PRINT work with the keyboard and the screen. The only catch is that PRINT # needs some special handling. It's best to give it a line all to itself:

20 PRINT#1,A$ (Commodore)
30 PRINT#1,B$
20 PRINT #1; A$ (Atari)
20 PRINT D$; "WRITE NAMEOFFILE" (Apple)
21 PRINT A$

The reason for putting PRINT # on its own line is that this is an easy way to separate items in a file with "carriage returns." Just as 10 PRINT A$ / 20 PRINT B$ will cause B$ to appear on the line below A$ on the screen (since each PRINT statement ends by "forcing" a carriage return), each separate PRINT # line puts a carriage return symbol onto the tape or disk after its item.
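The difference is easiest to see with two short strings. The Commodore tape sketch below writes each name with its own PRINT#, so each is followed by a carriage return. The closing REM lines describe what would happen if both were joined with a semicolon on one line. The file name NAMES and the two sample strings are only illustrations.

```basic
10 A$="BILL" : B$="SANDY"
20 OPEN 1,1,1,"NAMES"
30 PRINT#1,A$ : REM WRITES BILL, THEN A CARRIAGE RETURN
40 PRINT#1,B$ : REM WRITES SANDY, THEN A CARRIAGE RETURN
50 CLOSE 1
60 REM BY CONTRAST, PRINT#1,A$;B$ WOULD PUT
70 REM BILLSANDY ON THE TAPE AS A SINGLE ITEM
```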

Manipulating Files

Files are usually created within loops. Here's a simple program to "write" a file to tape:

10 DATA BILL, SANDY, KATIE, LARRY
20 OPEN 1, 1, 1, "NAMES": REM (A PET/CBM TAPE FILE)
30 FOR I = 1 TO 4
40 READ A$
50 PRINT#1,A$
60 NEXT I
70 CLOSE 1

Since there are four names in this file, the loop counts up to four, READing a new A$ from the DATA line each time through. Since PRINT #1 is by itself on line 50, it will send a carriage return to the tape each time it writes a name, separating the names on tape. This way, two names can never run together and come back as BILLSANDY.

When this file is later read into the computer, it would be very useful to know where it ends or how big it is. There are two easy ways to do this. You could add the word "END" to the DATA line and then change line 30 to read: FOR I = 1 TO 5. Or, you could put the "count" (the number of records for this file) on the tape or disk itself, as part of the file. To do this, you would add a line: 25 PRINT#1,4.
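For reference, here is how the writer program might look with the count added as line 25. It is the same PET/CBM tape listing as before, with the count written ahead of the names, so treat it as one possible arrangement.

```basic
10 DATA BILL, SANDY, KATIE, LARRY
20 OPEN 1,1,1,"NAMES" : REM (A PET/CBM TAPE FILE)
25 PRINT#1,4 : REM THE COUNT GOES ON THE FILE FIRST
30 FOR I = 1 TO 4
40 READ A$
50 PRINT#1,A$
60 NEXT I
70 CLOSE 1
```

If the list grows, both the 4 in line 25 and the 4 in line 30 have to change to match.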

Here's a "reader" program which brings the records from this second type of file back into the computer, finding out first what the count is:

10 OPEN 1, 1, 0, "NAMES" : REM (A PET/CBM TAPE FILE)
20 INPUT#1,COUNT : REM THIS WAS THE FIRST ITEM ON THE FILE
30 FOR I = 1 TO COUNT
40 INPUT#1,A$
50 PRINT A$ : REM (TO THE SCREEN)
60 NEXT I
70 CLOSE 1

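If you use the "END" technique, the reader program no longer knows the count, so it would not use line 20 or the FOR/NEXT loop. Instead, line 60 would become GOTO 40 and you would add a line to leave the loop: 45 IF A$ = "END" THEN GOTO 70. Notice also that you can PRINT # and INPUT # both numeric and string (alphabetical) variables.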
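A reader built around the "END" marker could be sketched as follows. It assumes the file was written with "END" as its last item and no count at the front. Since the number of names is not known ahead of time, this version loops with GOTO rather than FOR/NEXT.

```basic
10 OPEN 1,1,0,"NAMES" : REM (A PET/CBM TAPE FILE)
40 INPUT#1,A$
45 IF A$ = "END" THEN GOTO 70
50 PRINT A$ : REM (TO THE SCREEN)
60 GOTO 40 : REM GO BACK FOR THE NEXT NAME
70 CLOSE 1
```

Line 45 comes before the PRINT so that the marker itself never appears on the screen.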

One final note about something which might not be immediately obvious: if you update a file on a PET/CBM computer, you cannot put it back on a disk with the same name. You might have read it off the disk and into memory to make some changes. But before you OPEN-PRINT#-CLOSE it back onto the disk, you must first "scratch" (remove) the original version in order to replace it with the updated one. Again, each computer has different formats for this and your manual will describe them. Some systems, the Atari for example, automatically replace files when one "comes into" the disk with the same name as one already on the disk. This scratching of unwanted files is unnecessary for tape files; the new file will simply write over the old one (if you rewind the tape).
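On Commodore disk drives, scratching is usually done by sending an "S" command down the drive's command channel (secondary address 15). The sketch below assumes device 8, drive zero, and the INVENTORY file from the earlier examples. Check your disk drive manual for the exact form your model expects.

```basic
10 OPEN 15,8,15,"S0:INVENTORY" : REM SCRATCH THE OLD VERSION
20 CLOSE 15
30 OPEN 1,8,8,"0:INVENTORY,S,W" : REM NOW THE NAME IS FREE AGAIN
40 PRINT#1,"PENCILS"
50 CLOSE 1
```

Be sure the updated information is safely in memory before running line 10, since a scratched file is gone from the disk.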

There are numerous ways to manipulate files – this is precisely why it is a challenging programming task. The programmer has more control over what is happening, but more responsibility, too. As always, start out small by perhaps just creating a file with your name on it, reading it back into memory, and printing it on the screen. Then try ten names, a mailing list, updating records, and so on. Eventually you'll become adept at file manipulation and will find many uses for these valuable programming techniques.