Programming Power - Power BASIC Meets DOS Directories
by Tom Campbell
For a long time, I've used a utility by J.P. Garbers called LF that lists the files in a directory sorted alphabetically by extension and then alphabetically by name within each extension. Reimplementing LF in PowerBASIC for this month's column was instructive because it highlighted the tradeoffs between high-level and low-level languages.
This month's program is DE.EXE, for Directory by Extension. You'll need PowerBASIC to type it in and compile it. The command line syntax is simple: type DE, optionally followed by a drive and path.
Without the optional drive and path, DE lists the files in the current directory. Otherwise, it uses the specified location. The output looks like this:
.files: FOO
.BAK files: COL0691 COL0971
.BAS files: DE FT
.EXE files: DE FT
.TXT files: COL0691 COL0791
You can pause the output by pressing Space or quit by pressing Esc. This lets you bail out if you typed the program name by accident or if you only need to see the first part of the listing.
Writing DE took a couple of hours. Thanks entirely to PowerBASIC features, it runs very fast. It lists a 236-file directory in one second on my 386 versus two seconds for LF. Its output, however, can't be redirected, as LF's can. On the other hand, LF doesn't let you cancel by pressing Esc or pause by pressing Space (although you can pause the output using DOS's built-in Ctrl-S feature).
DE requires almost 30K when compiled versus 478 bytes for LF. Had I chosen to write DE in assembly language, it would have taken me several days, and while I doubt I'd get it as tiny as 478 bytes, it certainly wouldn't have reached the 1K mark. Conclusion? I'll take PowerBASIC any day of the week for a job like this. A decade ago, every byte of disk space counted, and BASIC wasn't available as a compiler on the PC. Today, my time is too valuable to spend writing a simple utility like DE in assembly if I can help it.
This month's column explains how to get a list of the filenames in a directory. You'll need this skill to write utilities like DE or pick list boxes for loading files. It also showcases some of PowerBASIC's special features: very fast printing to the screen, array sorting (PowerBASIC pays for itself with this feature alone), and the versatile DIR$ function.
Surprisingly, getting the names of files in a directory isn't easy to do in most versions of BASIC for the PC. Turbo Pascal handles it the best of any language I know of, and QuickBASIC requires you to employ an assembly language interface to DOS, but PowerBASIC has a handy function called DIR$ to help out. It's a highly unusual function in that its syntax is different on the first invocation than it is on subsequent invocations. The first time, it's passed the file specification as the first parameter (for example *.*, *.txt, win*.?, or foo.bar) and the attribute of additional files as the second parameter. The most common attribute is 0, for normal files. You can add files to the search by adding the following values: 2 for hidden files, 4 for system files, 8 for the volume label, and 16 for subdirectories.
DIR$ then returns as a string the name of the first file in the directory matching the file specification and attribute mask. After the first invocation, use DIR$ by itself, without the parameters, to return the rest of the matching files. Here's a simple program that lists all the files in the directory:
'First file.
NextName$ = DIR$("*.*", 0)
'Get rest.
WHILE NextName$ <> ""
  PRINT NextName$
  'No params.
  NextName$ = DIR$
WEND
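For readers without PowerBASIC at hand, the same first-call, later-call pattern can be imitated in Python. This is only a rough sketch: DirLister and list_all are hypothetical names, the matching uses Python's fnmatch rules rather than DOS wildcard rules (so the default spec here is "*" instead of "*.*"), and the names are sorted only to make the result predictable, which DOS itself does not do.

```python
import fnmatch
import os


class DirLister:
    """Hypothetical stand-in for a DIR$-style call; not PowerBASIC's implementation."""

    def __init__(self):
        self._pending = iter(())

    def __call__(self, spec=None, directory="."):
        # With a spec: start a new search. Without one: continue the last search.
        if spec is not None:
            names = sorted(os.listdir(directory))  # sorted for predictability only
            self._pending = iter(fnmatch.filter(names, spec))
        # An empty string signals "no more matches", mirroring the loop above.
        return next(self._pending, "")


def list_all(directory, spec="*"):
    lister = DirLister()
    found = []
    name = lister(spec, directory)  # first call: pass the file specification
    while name != "":
        found.append(name)
        name = lister()             # later calls: no parameters
    return found
```

Calling list_all(".", "*.txt") would collect the matching names from the current directory in the same way the WHILE loop prints them.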
Because DIR$ employs DOS functions 4Eh and 4Fh, it inherits a ridiculous limitation of these functions. There's no way to select only subdirectories, only the volume label, and so on. Any invocation will return all normal files matching the file specification in addition to those requested by the mask (the second, numeric parameter). I would much rather PowerBASIC return only files matching the attribute and file specification. Turbo Pascal's implementation suffers the same deficiency, but since the return value from its FindFirst routine (a superset of PowerBASIC's DIR$) is a compound data structure including file size, attributes, and other information in addition to the name, your program can weed out the undesirables more efficiently. As we'll see in a moment, handling subdirectories in the file specification posed a problem.
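The weeding-out step can be illustrated with a small Python model. It is a simplification under stated assumptions, not DOS or PowerBASIC code: directory entries are plain (name, attribute) pairs made up for the example, volume labels are left out of the model, and search and only_subdirs are hypothetical helper names.

```python
# Attribute values as listed in the column (volume label omitted from this model).
HIDDEN, SYSTEM, SUBDIR = 2, 4, 16
SPECIAL = HIDDEN | SYSTEM | SUBDIR


def search(entries, mask):
    """Model of the behavior described above: normal files always match,
    and entries with special attributes match only if the mask asks for them."""
    return [(name, attr) for name, attr in entries
            if (attr & SPECIAL & ~mask) == 0]


def only_subdirs(entries):
    # Ask for subdirectories, then discard the normal files that come along.
    return [name for name, attr in search(entries, SUBDIR) if attr & SUBDIR]
```

With a compound result like Turbo Pascal's, the attribute test in only_subdirs is all the filtering a program needs; with a bare filename, as DIR$ returns, the program has to find out some other way.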
Doing What's Expected
Easily the most challenging aspect of writing DE was its handling of the optional drive and path specifications. Nothing came easy here; DE follows the syntax of DOS commands such as DIR. For example, if drive D has a subdirectory called \UTILS, the command line
DE D:\UTILS
is treated as though you had typed
DE D:\UTILS\*.*
The DOS Find First and Find Next functions don't make this substitution for you, and with good reason. What if there's a file using the name D:\UTILS? COMMAND.COM and most external DOS utilities resolve this ambiguity by assuming you want to look for a subdirectory, but, of course, it means that you can't search for a file that has the same name as a subdirectory.
Since Find First doesn't make this choice for you, you must first check the file specification to see if it's a subdirectory. The routine IsDir% does this for you. It's a nice little black box to have around. Just call it, passing it the name of the prospective subdirectory, and IsDir% returns a nonzero value if the name is a subdirectory, and 0 if not. The brute-force method it uses is to see if anything (file or subdirectory) matches the specification.
If there's no match, IsDir% immediately exits, returning 0. If there is a match, we still don't know if it's a file or a subdirectory, thanks to the less-than-helpful Find First. (Note here that IsDir% is one of the rare times you'll see DIR$ used only once.) We then try opening a file by that name. If that can be done, IsDir% again returns false. Otherwise, we've narrowed it down--the input does indeed represent a subdirectory.
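The two-step test reads naturally as code. What follows is a Python sketch of the logic just described, not the column's IsDir% routine: is_dir and expand_spec are hypothetical names, os.path.exists stands in for the single DIR$ call, and the path separator is whatever the host system uses.

```python
import os


def is_dir(name):
    """Sketch of the brute-force test described above (not the column's IsDir%)."""
    # Step 1: does anything, file or subdirectory, match the name?
    if not os.path.exists(name):
        return False
    # Step 2: try to open it as a file. Success means it is a file.
    try:
        with open(name, "rb"):
            return False
    except OSError:
        # Matched but could not be opened: assume a subdirectory.
        return True


def expand_spec(spec, sep=os.sep):
    # A bare subdirectory name is treated as "every file inside it".
    if is_dir(spec):
        return spec.rstrip(sep) + sep + "*.*"
    return spec
```

One weakness is worth noting: a file that exists but can't be opened (because of permissions, say) would be mistaken for a subdirectory by this sketch.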
In any case, the command line is parsed, and DIR$ is used to get the list of filenames. A single, incredibly powerful command called ARRAY SORT does what it would take me a couple of days to write: a machine-coded QuickSort on the array of filenames. The filenames are upended with the extension first so that the sort will proceed properly, in one fell swoop sorting by extension and then alphabetically within. Files are displayed with no extension at all, since each group's listing is preceded by a note naming the extension (.BAK files:, for example).
[TABULAR DATA OMITTED]
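The extension-first sort described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the DE listing: sort_key and group_by_extension are hypothetical names, and a tuple key stands in for the column's trick of rearranging each filename string before handing the array to ARRAY SORT.

```python
from itertools import groupby


def sort_key(filename):
    # Split "NAME.EXT" and put the extension first, as the column describes.
    base, dot, ext = filename.rpartition(".")
    if not dot:  # no dot at all: the whole string is the base name
        base, ext = filename, ""
    return (ext, base)


def group_by_extension(filenames):
    ordered = sorted(filenames, key=sort_key)  # one sort does both orderings
    keys = map(sort_key, ordered)
    return [(ext, [base for _, base in items])
            for ext, items in groupby(keys, key=lambda k: k[0])]


if __name__ == "__main__":
    names = ["DE.EXE", "FT.BAS", "FOO", "DE.BAS", "FT.EXE"]
    for ext, bases in group_by_extension(names):
        # Each group is headed by its extension, so the names are shown bare.
        print(f".{ext} files: {' '.join(bases)}")
```

Because the extension leads the key, a single sort orders the groups and the names within each group at once.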