Assembly Line: File Handling

ASSEMBLY LINE

FILE HANDLING PART III

BY CHARLES F. JOHNSON

In the last installment, we cut off the discussion of the example program just at the point where we had constructed a source and destination filename for our copy operation. (The source code for the example program was printed in the May '89 issue.) This month we'll continue where we left off, at the lines of code which read:

moveq	#0,d5		;Search for normal files
lea	source,a5	;Address of source filename
bsr	fsfirst		;Search for it
beq.s	get_free

Remember that the source filename is stored at a location in the program labeled as source, and the destination filename is stored at dest.

Searching for a file

Now we're going to make sure that the file chosen by the user of our program actually exists, and also get some other important information about it by using the GEMDOS Fsfirst call (function number $4E). Fsfirst is a useful call; it returns just about every piece of pertinent data about a particular file, including its creation time and date, its size and its name.

But you say, "Wait, we already know the name of this file!" In the case of our current example program, this is true; but one of the real strengths of the Fsfirst call is that it can use the wildcard characters "*" and "?" to search for all the files in a given directory that match the specification. In a future column we'll be using the Fsfirst call (and its companion, Fsnext) to read a disk directory, but that, as I said, is for the future.

Getting back to our example program, we've set up a subroutine called fsfirst, which expects to be passed two parameters: the address of the file specification for which to search (which can contain the wildcard characters "*" and "?") and the search attribute. (Notice that the first "f" is not capitalized in the name of our subroutine, as it is in the name of the GEMDOS call.)

Before we use Fsfirst, we'll need to tell GEMDOS where to store all the information that will be returned from the call. The way to do this is with another GEMDOS call, Fsetdta (function number $1A), and this is the first thing our fsfirst subroutine does. Fsetdta must be passed two parameters: the address of a 44-byte buffer (called the DTA, or disk transfer address buffer), which will be used to store the information returned from the Fsfirst call, and the function number $1A. In the example program, the DTA buffer is given the label dta.

When we call our fsfirst subroutine, the address of the file search specification is passed in register a5, and the search attribute is passed in d5. These registers will have remained intact throughout the Fsetdta call, since trap #1 preserves a3-a6 and d3-d7. The search attribute word specifies the parameters shown in Figure 1.

To specify different search attributes, you simply set the bits for the attributes you wish to use. An alternative method is to add together the amounts in the "Value" column for the attributes you wish to use. The example program uses a search attribute of zero, which means we're only going to search for an ordinary file which is not write-protected.
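As an illustration (this fragment is not taken from the example program), here is how a combined search attribute might be built from the values in Figure 1. Hidden files ($02) plus subdirectories ($10) add up to $12:

```asm
	moveq	#$12,d5		;$02 (hidden) + $10 (subdirectories)
	lea	source,a5	;Address of file specification
	bsr	fsfirst		;Search using the combined attribute
```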

After the Fsfirst call, d0 will be set to zero if a file matching the search specification was found. If the search failed, or some other error occurred, d0 will contain a negative number. So, after performing the GEMDOS Fsfirst call, our subroutine uses the tst.l instruction to set the N (negative) and Z (zero) flags in the condition code register according to the contents of d0, then returns with the usual rts.

Note that rts does not affect the condition codes, so when we return from the fsfirst subroutine we can simply beq (branch if equal to zero) to the code that follows. If the Z flag was not set, we print an error message and branch to our exit routine, labeled byebye.

If the Z flag was set upon returning from Fsfirst, we found a file that matches the search specification, and the DTA buffer contains all of this file's vital statistics, as shown in Figure 2.
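Since the listing itself was printed in an earlier issue, here is a minimal sketch of what a subroutine like fsfirst could look like. It is reconstructed from the description above rather than copied from the listing, and it assumes the 44-byte buffer labeled dta sits at an even address.

```asm
fsfirst:
	pea	dta		;Address of our 44-byte DTA buffer
	move.w	#$1a,-(sp)	;Fsetdta
	trap	#1
	addq.l	#6,sp		;Remove 6 bytes of parameters
	move.w	d5,-(sp)	;Search attribute
	move.l	a5,-(sp)	;Address of file specification
	move.w	#$4e,-(sp)	;Fsfirst
	trap	#1
	addq.l	#8,sp		;Remove 8 bytes of parameters
	tst.l	d0		;Set N and Z flags from the result
	rts
```

After a successful return, the caller can pick values straight out of the buffer; for instance, move.l dta+26,d0 would fetch the file size.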

Is there enough memory?

The next thing we need to do is find out whether the machine our program is running on has enough free RAM to read the entire source file into memory all at once. Our file copying program is designed to do its reading and writing in one pass; if there isn't enough memory to read in the whole source file at once, the example program just refuses to go any further. Come to think of it, there's a good project for the more adventurous and self-motivated among you—modifying the example program to copy a file of any size. The program would have to read and write the file in several passes. Any takers?

To find out whether there's enough memory for the copy operation to take place, we'll use yet another GEMDOS call: the much-feared and dreaded (and rightly so, as we'll explain in a moment) Malloc call.

Malloc (function number $48) is one of the most important calls in the GEMDOS library. It provides a way for ST applications to allocate memory that is protected from use by other applications. (Its companion/opposite call, Mfree, is discussed below.) Malloc expects to be passed only one parameter, a longword containing the number of bytes you wish to allocate. When you return from Malloc, d0 contains the starting address of the block of reserved memory, or zero if the amount of memory requested exceeds the amount available.

However, if you pass a parameter of –1 (instead of a reserve amount) to Malloc, the call returns the size of the largest free block of memory in d0. The example program uses this version of Malloc, then compares the result with the size of the source file, which is contained in dta + 26. If the size of free memory is smaller than the size of the source file, the blo (branch if lower than) instruction takes us to no_memory, which prints a message telling the poor user that he doesn't have enough memory to copy the file, and exits.
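The free-memory check just described might be sketched like this. It's an approximation based on the text, not the printed listing; it assumes dta is word-aligned so the longword at dta+26 can be read directly.

```asm
	move.l	#-1,-(sp)	;-1 = ask for size of largest free block
	move.w	#$48,-(sp)	;Malloc
	trap	#1
	addq.l	#6,sp
	cmp.l	dta+26,d0	;Compare free memory with the file size
	blo	no_memory	;Branch if free memory is the smaller one
```

Because blo is an unsigned comparison, a very large free block is never mistaken for a negative number.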

If there's enough free memory to load the entire source file, we use the Malloc call again. Since the size of the source file is contained in dta + 26 (after the Fsfirst call, remember?), we can use the following code to reserve the memory we need:

move.l	dta+26,-(sp)	;Size of source file, in bytes
move.w	#$48,-(sp)	;Malloc
trap	#1
addq.l	#6,sp		;Clean up the stack

If d0 is not zero after this call, we branch over the code labeled no_memory and continue on with our program. It's highly unlikely that this Malloc will fail, since we just used Malloc(-1) to determine that enough free memory existed, but better safe than sorry (an obnoxious truism that often holds very true in computer programming).

The trouble with Malloc

Now it's time for a short digression from our example program to discuss the problems with the GEMDOS Malloc call, mentioned briefly above. In the original ROM TOS (Version 1.0) and the newer TOS that was shipped with the Mega ST (Version 1.2), the GEMDOS Malloc call suffers from a particularly nasty bug. If an application calls Malloc more than a certain number of times (without using a corresponding number of Mfree calls), the ST gets confused.

If your application exceeds the critical number (which is usually somewhere around 40), strange things will start to happen. You may get spurious "out of memory" errors, files may refuse to load, and eventually you'll crash or lock up. The interval before the actual crash, however, is very dangerous indeed. If you try to write data to a disk when the system is in this confused state, that disk will almost certainly be corrupted beyond repair. The only solution when these symptoms start to appear is to immediately reboot your computer; once things get messed up in this way, they stay messed up.

The reason for this disastrous bug? GEMDOS maintains a list of all the blocks of memory allocated with Malloc, and when an application uses Malloc to reserve memory, another entry is simply added to the end of the current list. Unfortunately, the buffer that holds the list of Mallocs is of a fixed size, and GEMDOS does not check to see if it's already at the end when it adds a new entry. When the critical number is exceeded, new entries will actually write over other important GEMDOS data structures, wreaking havoc with the ST's file- and memory-management systems. (By the way, the Malloc bug is caused by the same problem which is responsible for the ST's well-known "40 folder" bug.)

Advance word has it that the new TOS 1.4, which will soon be released by Atari, fixes the Malloc bug and the 40-folder bug, by allowing true dynamic sizing of the memory allocation list. In the meantime, I recommend that you try to get a copy of Atari's FOLDRXXX.PRG, a program which runs from the AUTO folder and alleviates the problem by expanding the fixed memory list buffer to any size you specify. This program is available from Atari, or on many of the popular information services such as DELPHI, GEnie and CompuServe. If you're using a hard drive, this program is a necessity.

Luckily, since our example program uses only one Malloc call, it will not run into the Malloc bug. But if you ever write a program that needs to allocate memory in several chunks, you'll need to be aware of these potential problems.

Back in the saddle again

Okay, back to the example program. After successfully allocating a block of memory to read in the source file, we store the starting address of this memory in the variable copy_buffer, with the instruction:

move.l	d0,copy_buffer

We then print a message to let the user know his computer is going to be busy for a little while, and attempt to open the source file in "read only" mode. (The GEMDOS Fopen call was discussed in the January '89 Assembly Line.) We've written a subroutine called open_file that does this; the subroutine is passed the mode in d0 and the address of the null-terminated filename in a0. Before returning, it uses the tst.l instruction to set the condition codes based on the results of the Fopen call. If the N flag is set, we use the bmi instruction to branch to the label bad_open, which prints an error message and branches to the exit code at outta_here. Otherwise, we save the file handle with the instruction:

move	d0,handle
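For readers without the January issue handy, a subroutine along the lines of open_file could look something like the sketch below. This is a guess at its shape from the description, not the printed listing. Fopen is GEMDOS function $3D, and a mode of zero means "read only."

```asm
open_file:
	move.w	d0,-(sp)	;Mode (0 = read only)
	move.l	a0,-(sp)	;Address of null-terminated filename
	move.w	#$3d,-(sp)	;Fopen
	trap	#1
	addq.l	#8,sp		;Remove 8 bytes of parameters
	tst.l	d0		;N flag set means an error occurred
	rts
```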

Now (at last!) we're ready to start copying the selected file. The first thing we'll do is read in the entire source file, storing it in our allocated block of memory. To do this, we'll use the GEMDOS Fread call (function number $3F). Like the Fopen call, Fread was discussed in the January '89 Assembly Line. Our subroutine, called read_file, expects to be passed two parameters: the number of bytes to read in d0, and the address of the buffer into which to read it in a0. The example program calls the read_file subroutine with the following code:

move.l	dta+26,d0	;Number of bytes to read
move.l	copy_buffer,a0	;Address of our allocated memory
bsr	read_file

The size of the source file is still contained in the DTA buffer, as returned from the Fsfirst call. So we just move the contents of dta + 26 to d0 and the contents of copy_buffer (which holds the longword address of our allocated memory) to a0. Upon returning from the read_file subroutine, d0 contains either a negative number (an error code) or the number of bytes successfully read from the file. Our program moves this value to d7 temporarily, while it closes the file. This is necessary because the Fclose call alters d0. Then, after closing the file, we test d7 to see if an error occurred during Fread. If d7 contains a negative number we branch to the label bad_read, which prints an error message and branches to the exit code.
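Here's a rough sketch of how the read-then-close sequence might be coded. It is pieced together from the description, so treat it as an approximation of the listing, not a copy. The two fragments live in different parts of the program. Fclose is GEMDOS function $3E.

```asm
;--- In the main program, right after "bsr read_file" ---
	move.l	d0,d7		;Save Fread result; Fclose will alter d0
	move.w	handle,-(sp)	;Handle of the file to close
	move.w	#$3e,-(sp)	;Fclose
	trap	#1
	addq.l	#4,sp
	tst.l	d7		;Did the earlier Fread fail?
	bmi	bad_read

;--- Elsewhere: the subroutine itself ---
read_file:
	move.l	a0,-(sp)	;Address of buffer
	move.l	d0,-(sp)	;Number of bytes to read
	move.w	handle,-(sp)	;File handle saved after Fopen
	move.w	#$3f,-(sp)	;Fread
	trap	#1
	lea	12(sp),sp	;Remove 12 bytes of parameters
	tst.l	d0		;Set flags from the result
	rts
```

The lea is used for the stack cleanup because addq can only add values up to 8; like addq to an address register, it leaves the condition codes alone.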

The moment you've been waiting for (almost)

At long last, we're coming to the payoff—the point where we can actually write the destination file and complete our copy program. But first (you knew there had to be one more delay, didn't you?), we have to discuss yet another bug in yet another GEMDOS call.

Up until now, all the GEMDOS file-handling calls that Assembly Line has discussed were the ones that deal with manipulating files that already exist. To create a new file, we'll use the GEMDOS Fcreate call (function number $3C). Unfortunately, there's a small but pesky bug in Fcreate, which causes it to sometimes create duplicate files (files with the same name), if the filename you're trying to create already exists. Fcreate is supposed to first delete the existing file when this occurs, but for some reason this doesn't always happen.

Therefore, before creating a new file under GEMDOS, it's a good idea to first explicitly delete any existing file with the same name. To do this, we use the Fdelete call (function number $41). Fdelete is passed only two parameters: the address of the null-terminated filename you wish to delete and the function number itself. In the example program, we pass the address of the destination filename, located at dest. It doesn't matter if the file we're trying to delete doesn't already exist, so we ignore any errors from the Fdelete call.
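In code, the precautionary delete could be as simple as the following sketch (written from the description above, not copied from the listing):

```asm
	pea	dest		;Address of destination filename
	move.w	#$41,-(sp)	;Fdelete
	trap	#1
	addq.l	#6,sp		;Result in d0 is deliberately ignored
```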

The act of creation

Now we can create our destination file. The GEMDOS Fcreate call takes three parameters. First is the attribute word, which has the same format as the attribute word specified for the Fsfirst call, described above. By specifying different attributes, you can create subdirectories and "hidden" files with the Fcreate call. Our example program uses zero for the attribute, which means that the file we create will be a normal, read/write file.

The second parameter passed to Fcreate is the longword address of the null-terminated name for the newly-created file, and the third parameter is the function number itself. As with all of our file-handling calls, we've written a subroutine to handle Fcreate, called create_file. The attribute word is passed to create_file in d0 and the address of the filename in a0.

Fcreate returns either a valid file handle in d0 or a negative error number. If we return from create_file with the N flag set, we branch to the label bad_create, which, as usual, prints an error message and exits. Otherwise, we save the file handle in handle and proceed to write the destination file.
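A subroutine in the spirit of create_file might look like this. Again, this is a sketch based on the parameter description, not the listing as printed.

```asm
create_file:
	move.w	d0,-(sp)	;Attribute word (0 = normal file)
	move.l	a0,-(sp)	;Address of null-terminated filename
	move.w	#$3c,-(sp)	;Fcreate
	trap	#1
	addq.l	#8,sp		;Remove 8 bytes of parameters
	tst.l	d0		;Handle if positive, error if negative
	rts
```

The caller would load zero into d0 and the address of dest into a0, bsr to create_file, and follow the call with bmi bad_create.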

The usage of the GEMDOS Fwrite call (function number $40) is identical to the Fread call. The only difference is the function number. The subroutine write_file handles the Fwrite call in our example program; it is passed the number of bytes to write in d0 and the address of the buffer from which to write in a0. The code in our example program is very similar to the code used to read a file:

move.l	dta+26,d0	;Number of bytes to write
move.l	copy_buffer,a0	;Address of the data to write
bsr	write_file

Upon returning from write_file, d0 will contain either the number of bytes written without error or a negative error code. Just as with the read_file call, we move the result temporarily to d7 while we close the open file. Then we use tst.l d7 to see if an error occurred during the writing of the data, and branch to the appropriate error-handling code if necessary.

It is important to use the "long" form of the tst instruction when testing the results from a file read or write call, because the number of bytes read or written can easily exceed a word value. A tst.w instruction examines only the low 16 bits of the register, so an amount larger than 32,767 can be seen, erroneously, as a negative number (or even as zero).

Our example program doesn't handle one possible error that could occur while writing to a disk: running out of space on the disk. If this happens, GEMDOS does not report any error to you. It's up to you to make sure that the number of bytes that were actually written is the same as the number you wanted to write. After checking for errors, you should compare the value returned from Fwrite with the number of bytes you tried to write. If they are not equal, chances are that you ran out of room on the destination disk, in which case you should use Fdelete to delete the resulting partial file and let the user know that there's no more room on the disk.
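If you'd like to add this check yourself, one possible approach is sketched below. The labels bad_write and disk_full are hypothetical; they don't appear in the example program, and you would have to supply the code behind them. The fragment assumes the Fwrite result has already been moved to d7 and the file closed.

```asm
	tst.l	d7		;Negative means GEMDOS reported an error
	bmi	bad_write	;(hypothetical label)
	cmp.l	dta+26,d7	;Was every byte actually written?
	bne	disk_full	;(hypothetical label) No, probably a full disk
```

The code at disk_full would then Fdelete the file at dest and print a suitable message before exiting.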

Give back the memory!

When we're all finished with the copy operation, we still need to tie up one loose end before we exit the program. We have to give back to the system the memory we allocated with Malloc. The way to do this is with Malloc's companion call, Mfree. Mfree takes two parameters: the longword address of the allocated memory you wish to free, and the function number, $49. The memory address must be the same as the value returned from Malloc. You can't Mfree memory that wasn't first allocated.

After the example program calls Mfree, we print a message asking the user to hit a key. When he/she does, we exit the program by calling the GEMDOS Pterm0 function.
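The cleanup and exit described in this section can be sketched as follows (an approximation from the text; the key-press prompt between the two calls is left out). Pterm0 is GEMDOS function zero, and it never returns to the caller.

```asm
	move.l	copy_buffer,-(sp)	;Address returned earlier by Malloc
	move.w	#$49,-(sp)		;Mfree
	trap	#1
	addq.l	#6,sp

	clr.w	-(sp)			;Pterm0 (function number 0)
	trap	#1			;Back to the desktop
```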

Stuff to do

Our example file-copying program is not perfect, by any means. For one thing, it assumes that you aren't trying to copy a file to the same disk (or subdirectory). Therefore, it's useless on a single-drive system, unless you use a RAMdisk to hold temporary files. You might try modifying the example program to prompt for a disk swap after reading the source file. (Hint: The subroutines needed to do this are already in the program.) Another good idea might be to allow retries after disk errors, or to allow the user to make multiple copies without rereading the source file. The file handling subroutines in the example program will be useful in future examples, so be sure to save a copy of the source code.

Next time, we'll see how to modify our file-copying program to search a directory for files that match a "wildcard" specification, and introduce the concept of GEM alert boxes. Till then, code away!

Figure 1: ATTRIBUTE WORD
Bit	Value	Meaning
none	$00	Return files which have normal read/write access.
0	$01	Return files which are write-protected.
1	$02	Return "hidden" files (not visible on the desktop).
2	$04	Return "system" files (not visible on the desktop).
3	$08	Return the volume name of a disk.
4	$10	Return subdirectories.
5	$20	File has been written to and closed (also known as the "archive" bit).

Figure 2: DTA BUFFER STRUCTURE
Offset	Contents
0-20	Reserved for internal use by GEMDOS.
21	File attribute.
22-23	Time of file creation (in standard GEMDOS format).
24-25	Date of file creation (in standard GEMDOS format).
26-29	Size of file, in bytes (longword).
30-43	Filename (8-character name, 3-character extension).

Charles F. Johnson, by using some as yet undiscovered laws of nature, has managed to find the time to be both a professional musician and a professional programmer. In his musical career, he has played with such artists as Chicago, George Duke, Al Jarreau and Stanley Clarke. His programming accomplishments include Mouse-Ka-Mania, Desk Manager, ARC Shell and, along with his partner, John Eidsvoog, G+Plus and MultiDesk. He and John are the owners of CodeHead Software.