Classic Computer Magazine Archive ST-Log ISSUE 18 / APRIL 1988  / PAGE 61

UTILITY

ALL RESOLUTIONS

ChkDsk

A SPECIAL INCLUSION

Try DOS-level repair for those trouble spots.

by Dan Moore and David Small

David Small wrote the article ChkDsk. A long-time computer enthusiast, he has a computer science degree from Colorado State University. His achievements include books, articles and the Magic Sac.

ChkDsk brings you a program that's a powerful tool for exploring and repairing floppy/hard disks at the DOS level. It takes most of the work out of getting to know FATs and directories.

This article is a brief summary of the program's use. The program itself was too long for inclusion in these pages; it can be found on this issue's disk and on the ANALOG Publishing Atari SIG on Delphi.

How it works.

You run the program CHK.TTP by double clicking on it. You'll then be asked for options. You must at least tell it which drive to check over. For example: at the prompt, enter c: and hit RETURN to test drive C.

You can specify several options with -option (a hyphen and the option). Only one option at a time, please. Example: at the prompt, enter: c: -v and press RETURN.

Note: don't be worried by the techno-talk in this summary. All these terms are explained later.

The options, which are also explained in detail later in this article, are summarized below. (You can get a summary when running the program, by pressing RETURN at the prompt.)

-f — fix disk. Does a normal check, then converts lost cluster chains to files, which you may then delete.

-m — map disk. Does a normal check, then produces a map of the disk clusters, used/empty.

-s — statistics. Just gives you the basic statistics of the disk (size, and so forth). Does not do a check of the disk.

-t — talkative. Gives the drive's statistics and does do a normal check.

-v — verbose. Does a normal check and displays each file's cluster chain.

-(number) — Does a normal check and tells the name of any file using that cluster. Note: the "number" is hexadecimal, and must have a 0 in front of it if it begins with a letter from a through f (since those letters are options).
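The option rules above can be sketched in a few lines of Python. This is only an illustration: the name parse_tail and the tuple it returns are hypothetical, and the article does not show how CHK (which is written in C) actually parses its command tail.

```python
OPTIONS = ("f", "m", "s", "t", "v")  # the letter options listed in the article

def parse_tail(tail):
    """Split a command tail such as 'c: -v' into (drive, option, cluster).

    Hypothetical helper for illustration; not taken from CHK's source.
    """
    parts = tail.split()
    if not parts:
        return None  # empty tail: the program would print its usage summary
    drive = parts[0].rstrip(":").upper()
    option = cluster = None
    if len(parts) > 1:
        arg = parts[1].lstrip("-").lower()
        if not arg:
            raise ValueError("empty option")
        if arg[0].isalpha():
            # A leading letter means an option letter, never a number.
            if arg not in OPTIONS:
                raise ValueError("unknown option: " + arg)
            option = arg
        else:
            # Anything starting with a digit is a hexadecimal cluster number,
            # which is why 'F' must be typed as '0F'.
            cluster = int(arg, 16)
    return drive, option, cluster
```

For example, parse_tail("c: -0f") is read as cluster $F on drive C, while parse_tail("c: -f") is read as the fix option.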

Introducing ChkDsk.

Most people have no problems formatting, copying or deleting files, once shown how. But let's face it. . .They really don't like learning the Innermost Dark Secrets of the disk: the File Allocation Table (FAT), the directory, and, dread of dreads, the boot block.

Why? Most people find FATs and directories pretty intimidating.

And for good reason. They are complex, contain lots of hexadecimal entries, and are meant for machines to use, not people. They don't make good reading material.

What's needed is a translator: a program that reads all this machine-coded information and gives us a nice, English display of just what's going on in the disk.

I'll bet you've already guessed what ChkDsk is about—a nice, easy way to see what's really happening on your disk, be it floppy or hard.

But the CHK program is more. It is a partial disk repair tool. You're going to want to team it with a good sector editor (of which there are many, some public domain, some commercial). If you run into bad disk troubles, CHK can set your disk right without the usual drop-dead fix, the dreaded reformat.

And, finally, CHK gives you confidence in your data from one day to the next. It discovers hidden problems when they happen. Run it once per day, and you'll have a lot more reassurance that your system is working right and your data is intact.

Let's dive in and learn how to use it.

The directory and FAT.

How does the Disk Operating System (DOS) figure out where a file is on the disk? With two tables, called the directory and FAT.

They really aren't that complex; let's find out a little about them. By the way, if you want to learn every last bit about the directory, any good IBM PC reference (such as Peter Norton's) will tell all. This is meant more as an overview, to get you oriented.

Each disk contains 80 tracks, with 9 sectors per track, and 512 bytes per sector. Hence, 360K per disk side (720K double sided). Now, let's organize those 720/1440 sectors into "clusters." A "cluster" is just a collection of sectors, grouped together for convenience; in fact, as of now, just forget about the "sectors" part of it. On the ST, each cluster is 2 sectors long, or 1K. (On other systems, notably IBM, this varies.) We have either 360 or 720 clusters per disk, each one composed of 1,024 bytes.
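If you'd like to check that arithmetic, here is a small Python sketch that uses only the figures quoted above. The function name geometry is made up for the example.

```python
TRACKS = 80
SECTORS_PER_TRACK = 9
BYTES_PER_SECTOR = 512
SECTORS_PER_CLUSTER = 2  # the ST's 1K cluster

def geometry(sides):
    """Return (sectors, kilobytes, clusters) for a 1- or 2-sided floppy."""
    sectors = TRACKS * SECTORS_PER_TRACK * sides
    kilobytes = sectors * BYTES_PER_SECTOR // 1024
    clusters = sectors // SECTORS_PER_CLUSTER
    return sectors, kilobytes, clusters

for sides in (1, 2):
    print(sides, geometry(sides))
```

These are raw totals, before anything is set aside for the boot sector, FAT and directory.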

Now, remove a few for the directory and FAT, and we have the rest dedicated to storage. And yes, remove one cluster DRI forgot about. (True!)

Now, how do we "chain" a bunch of these 1K clusters for, let's say, a 46K file? Good question.

Those of you familiar with 8-bit Ataris will remember that the last 3 bytes of each sector were used to point to the next sector. For instance, if the first sector was number 33, then the last 3 bytes of sector 33 would point to the next sector to be used. Typically, they're in order, so we'd see: 33 → 34 → 35, and so on.

The ST's DOS uses the same idea, but doesn't store the actual cluster number in the sector data. Rather, it gathers all these numbers together into a one-dimensional array, called a FAT.

Each FAT entry corresponds to one cluster and is 12 bits long. There are 5 sectors dedicated to the FAT, and they pack all the FAT entries together as tightly as possible into this small space. Result: lots of bit shifting, as 12 is not an "even" number to the ST.

Anyway, don't reach for your hexadecimal calculator yet. CHK does all the dull bit-fiddling for you, and returns the FAT to what it really should be—a one-dimensional array.
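For the curious, the bit-fiddling looks roughly like this in Python. The sketch assumes the ST packs its 12-bit entries the same way the PC format does (two entries in every three bytes, low byte first), which is what the pointer to IBM PC references suggests. The name unpack_fat12 is hypothetical and this is not CHK's own code.

```python
def unpack_fat12(raw, count):
    """Unpack `count` 12-bit FAT entries from packed bytes into a plain list.

    Assumes PC-style FAT12 packing: entry n starts at byte n * 3 // 2.
    Even entries use the low 12 bits of the 16-bit little-endian word
    found there; odd entries use the high 12 bits.
    """
    entries = []
    for n in range(count):
        offset = n * 3 // 2
        word = raw[offset] | (raw[offset + 1] << 8)
        if n & 1:
            entries.append(word >> 4)
        else:
            entries.append(word & 0xFFF)
    return entries
```

The result is the one-dimensional array described above: entry n of the list is the FAT entry for cluster n.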

If we were in BASIC, we'd make a FAT like this (assume we have 360 clusters):

DIM F(360)

Tough, right?

Now, let's assume we have a file named DAVE. DAVE is 40 clusters long (40K). How does the operating system find the start of DAVE, then trace through all 40 clusters? Like this:

First, we look up DAVE in the "directory" (just a set of filenames on your disk—and, just as importantly, the first cluster or FAT entry in that file). So, let's say that the directory entry looks like this:

DAVE First Cluster: #23.

Okay, so we go read cluster 23, for 1,024 bytes of data. Where's the next 1,024?

For this, we go to the FAT. We look at the FAT entry for cluster number 23, or FAT(23); it says 24. This is the next cluster in the chain.

We go read cluster 24 and, to find the next cluster, we look at FAT(24). And so on. Finally, we'll reach a FAT entry that says end of file. (For you hex types, it's $FF7 through $FFF; 12 bits, remember.) DOS, at this point, knows the file has come to an end and quits reading.

Hence, the FAT is a collection of "pointers" to next clusters; a given cluster's FAT entry just points to the next cluster in the chain.
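Following a chain is short enough to sketch in Python. The FAT here is an ordinary list indexed by cluster number, the end-of-file range is the one quoted above, and trace_chain is a hypothetical name for the example.

```python
EOF_FIRST = 0xFF7  # first end-of-file value, as quoted in the article

def trace_chain(fat, first):
    """Follow a file's cluster chain from its first cluster to end of file."""
    chain = []
    cluster = first
    while cluster < EOF_FIRST:
        if cluster in chain:
            # A healthy chain never revisits a cluster.
            raise ValueError("circular chain at cluster %d" % cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # FAT(cluster) names the next cluster
    return chain

# DAVE starts at cluster 23 and runs 23 -> 24 -> 25 -> end of file.
fat = [0] * 30
fat[23], fat[24], fat[25] = 24, 25, 0xFFF
print(trace_chain(fat, 23))
```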

Thus, the term FAT is a little misleading. NEXTCLUSTER would have been a better name, but where would we computer professionals be without weird acronyms? However, a FAT does end up being a kind of disk map, and it's the only one we've got.

As you can probably guess, when you delete a file, DOS goes through the cluster chain and carefully marks each cluster as "empty" once again, so that new files can use them.

Finally, there is a special "FAT" entry for sectors that are physically bad. Most hard disks today typically have a few bad sectors; it isn't any big deal, as long as the number doesn't grow. At format time, DOS marks those clusters as bad (hex freaks: $FF0-$FF6). Then, from that point on, DOS thinks of them as "used," and never tries to access them or use them in a file.

Now, let's take a typical problem: a system crash during a disk directory access. Let's assume we create a new file, using up a bunch of clusters (and FAT entries), thus marking them as "non-empty," but no directory entry ever gets written to point to the first member of the chain. What happens? Those clusters are marked as "used" (nonzero), but, since there's no file associated with them, they can never be marked "unused," or returned to the free cluster pool. These are called "lost clusters."

Or, worse: DOS becomes confused (seems to happen a lot) and "cross links" two files together. This is where the same cluster shows up in two files' cluster chains. Now, only one of them is right—and one of the files has bad data in the middle of it.

Though this shouldn't happen, it does. DOS crashes for a variety of reasons at very bad places—the 40-folder limit is a good example. Cross-linked files can happen quite easily.

Well, the first thing you'll see when you've got a cross-linked file is something like a TOS error 35. But the damage may be more subtle, turning up in a piece of code that is only rarely executed, or in a data table you haven't accessed in a while. Who knows, maybe that fourteenth file in your fifteenth subdirectory is cross linked with your DEGAS program file. Without ChkDsk, you'll never know about the damage until you try to access that data.

There's an even more horrible possibility (are we having fun yet?), called a "circular link." Here's a typical example:

Cluster 45 → 46 → 47 → 45 → 46 → 47...,etc.

When DOS tries to delete such a file, it's got a problem. I'll leave the results to your imagination.

CHK to the rescue.

So, let me introduce you to CHK. CHK gives your disk a thorough test, looking for things that have gone wrong at the DOS level. If it finds anything, it will let you know; you can then use many of the CHK options to fix the problems. At least you'll know something is wrong—and the file that's damaged. That beats the alternative!

When you run CHK, it gives you the standard desktop TOS Takes Parameters prompt. You type in, at a minimum, the disk drive you want to check over. If you just press RETURN, CHK will give you a short summary of how you run it, to jog your memory (a nice feature).

When you specify that drive number, CHK gets to work.

First, it creates a new temporary "FAT" in memory, with all clusters empty. It traces each file's cluster chain on your disk, and makes sure none of the clusters are cross linked (used in another file). It also checks for circular chains. Then it marks those clusters as used in the temporary FAT.

Next, it compares the temporary FAT, a kind of real-world FAT, with the actual FAT on disk. Let's say the real FAT has cluster 23 marked as used, but that cluster shows up as free in the temporary FAT. Because no file uses it, it's an "orphaned" cluster. DOS won't use it because DOS has it marked as being in use, even though the cluster is doing nothing but taking up space. CHK will also see if 23 points to another cluster, making a chain of orphaned clusters!
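The two passes just described can be sketched in Python as follows. This is a simplified illustration, not Dan Moore's code: the function name check, the dictionary of files and the problem tuples are all assumptions made for the example, and the bad-cluster range is the one quoted in this article.

```python
FREE = 0x000
BAD_FIRST, BAD_LAST = 0xFF0, 0xFF6  # bad-cluster marks, as quoted in the article

def check(fat, files):
    """fat: list of FAT entries; files: {name: first cluster}.

    Returns (problems, orphans). Illustrative only.
    """
    owner = {}      # the temporary "FAT": cluster -> file that uses it
    problems = []
    for name, first in files.items():
        seen = set()
        cluster = first
        # Stop at end of file or at anything that is not a valid cluster.
        while 2 <= cluster < min(len(fat), BAD_FIRST):
            if cluster in seen:
                problems.append(("circular", name, cluster))
                break
            if cluster in owner:
                problems.append(("cross-link", name, owner[cluster], cluster))
                break
            seen.add(cluster)
            owner[cluster] = name
            cluster = fat[cluster]
    # Second pass: marked as used on disk, but no file claimed it.
    orphans = [c for c in range(2, len(fat))
               if fat[c] != FREE
               and not (BAD_FIRST <= fat[c] <= BAD_LAST)
               and c not in owner]
    return problems, orphans

fat = [0] * 16
fat[2], fat[3], fat[4] = 3, 4, 0xFFF   # file A: 2 -> 3 -> 4
fat[5] = 3                             # file B runs into A's chain
fat[6], fat[7] = 7, 6                  # file C loops back on itself
fat[9], fat[10] = 10, 0xFFF            # in use, but no file points here
fat[12] = 0xFF0                        # physically bad
print(check(fat, {"A": 2, "B": 5, "C": 6}))
```

The walk starts at cluster 2 because that is where data clusters begin in the PC-style layout.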

CHK then produces a summary of your disk drive as it really is. It tells you the total capacity of the drive, the number of folders you have, the number of bytes taken up by those folders (overhead), the number of bytes taken up by programs, how much of your disk is physically bad, and, finally, how much room you have left on the disk.

Go ahead and run CHK a few times with no options. It won't change anything, or do anything except observe (no writing), unless you explicitly tell it to, so there's nothing to be afraid of.

Special note for you hard disk users: the 1986/1987 version of the ROMs has a bug known as the "40 folder" bug. Basically, if you "touch" more than 40 folders in any session (RESET to RESET), you crash. "Show Info" touches every folder in any drive it does a Show Info on, so if you have more than 40 folders, you've got a problem.

Anyway, this is pretty well known, and I mention it only because I wanted to tell you that CHK does not aggravate the 40-folder problem. In fact, if you have a disk with more than 40 subdirectories, it may be the only safe way to get a Show Info of that disk.

Okay, let's assume we found some orphan clusters. How do we go about telling DOS we want them back?

CHK -F: (Fix) Orphan Fix.

If any orphaned clusters are found, CHK will produce a warning message. But you've still got no way to recover them (to mark them as empty so another file can use them). What you need to do now is run CHK with the parameters (drive):-F (where (drive) is the drive identifier). The -F option will create some temporary files. Each temporary file (named: FILE1.CHK, FILE2.CHK, and so on) points to one lost cluster chain.

What you should do at that point is look carefully at these FILEs. They may be important data that has been accidentally lost. For instance, a system crash that orphans a chain may still leave you with good data on the disk—you can edit and copy the FILE files.

If, as is common, you find only trash in the FILE files, just delete them. DOS will then do what it should have done in the first place—return them to the "free sector pool," or mark them as empty.
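One way to picture what -F has to do: take the list of orphaned clusters, find the ones that no other orphan points to (the chain heads), and give each head a file name. The Python sketch below is a guess at the idea, not CHK's actual method. The function name orphan_chains is hypothetical, and an orphan chain that is a pure loop has no head and would be missed here.

```python
def orphan_chains(fat, orphans):
    """Group orphaned clusters into chains, one per FILEn.CHK name.

    fat: list of FAT entries; orphans: cluster numbers no file owns.
    Illustrative only.
    """
    orphan_set = set(orphans)
    # An orphan that another orphan points to is mid-chain, not a head.
    pointed_to = {fat[c] for c in orphans if fat[c] in orphan_set}
    heads = sorted(orphan_set - pointed_to)
    result = {}
    for n, head in enumerate(heads, start=1):
        chain, cluster = [], head
        while cluster in orphan_set and cluster not in chain:
            chain.append(cluster)
            cluster = fat[cluster]
        result["FILE%d.CHK" % n] = chain
    return result

fat = [0] * 16
fat[9], fat[10] = 10, 0xFFF   # lost chain 9 -> 10
fat[13] = 0xFFF               # lost single cluster
print(orphan_chains(fat, [9, 10, 13]))
```

Once each chain has a directory entry, deleting that file lets DOS free the clusters in the usual way.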

CHK -V (Verbose).

CHK will, if you ask, produce a listing of each and every cluster in each file's cluster chain. It's something you'll want to do once, to see how a disk is really laid out. Be forewarned: this listing can be gruesomely long on a big disk drive, like a hard disk.

This option can also be helpful in determining exactly which clusters are used by a file, such as for direct sector editing. The formula for converting clusters to sectors, however, is not trivial.
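As a rough guide, the usual PC-style FAT calculation is sketched below in Python. Treat every default as an assumption for illustration: only the 5 sectors per FAT and 2 sectors per cluster come from this article, while the single reserved sector, two FAT copies and 112 root directory entries are typical floppy values that should really be read from the disk's boot sector. The function name is hypothetical.

```python
def cluster_to_sector(cluster, reserved=1, fats=2, sectors_per_fat=5,
                      root_entries=112, bytes_per_sector=512,
                      sectors_per_cluster=2):
    """Return the first logical sector of a data cluster (PC-style layout).

    Default parameters are assumed, not taken from CHK.
    """
    # Each directory entry is 32 bytes; round up to whole sectors.
    root_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    first_data = reserved + fats * sectors_per_fat + root_sectors
    # Data clusters are numbered from 2, hence the offset.
    return first_data + (cluster - 2) * sectors_per_cluster

print(cluster_to_sector(23))
```

With these assumed values the data area starts at logical sector 18, so cluster 23 begins at sector 60.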

At the end of -V, you get a map of your disk, with each cluster marked as either used or empty.

CHK -T (Talkative) -S (Stats).

Not being content with writing a merely useful program, we decided to give you the "boot block" information, as well. This is the data out on the very first sector of the disk, that tells the operating system critical values on said disk, like: how many clusters are there, what the "name" of the disk is, how many sectors are there per track, how many sides, and so on.

If you run CHK with the parameter -S, you'll be treated to all this information, and nothing else. If you run CHK with the parameter -T, you'll get all this information plus the usual CHK of your FAT and directory.
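To give a feel for what's in the boot block, here is a Python sketch that pulls the main values out of a 512-byte boot sector. It assumes the PC-style layout, with the parameter block starting at byte $0B and stored low byte first. The field names are descriptive labels chosen for this example, not names used by CHK or TOS.

```python
import struct

FIELDS = ("bytes_per_sector", "sectors_per_cluster", "reserved_sectors",
          "fats", "root_entries", "total_sectors", "media",
          "sectors_per_fat", "sectors_per_track", "sides")

def parse_boot_sector(sector):
    """Decode the disk parameters from a raw boot sector (assumed layout)."""
    values = struct.unpack_from("<HBHBHHBHHH", sector, 0x0B)
    return dict(zip(FIELDS, values))

# Build a made-up double-sided floppy boot sector to try it on.
demo = bytearray(512)
struct.pack_into("<HBHBHHBHHH", demo, 0x0B,
                 512, 2, 1, 2, 112, 1440, 0xF9, 5, 9, 2)
info = parse_boot_sector(bytes(demo))
print(info["total_sectors"], info["sectors_per_track"], info["sides"])
```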

CHK -M (Map).

There are times when having a map of the free areas on a disk can be quite useful; say, when using hard disks. Hard disks suffer from an extremely slow empty-FAT search. When writing to a hard disk, the farther away from the start of the disk you get, the longer it takes. We're talking up to 30 seconds longer on a hard disk!

Hence, if you concentrate your present files toward the end of the hard disk and leave the first clusters free for your work, you'll get very fast write response from your hard disk.

CHK with the parameter -M gives you a map of each and every cluster on the hard disk. Each entry, if empty, is marked EM. If it points to another cluster, it's marked with the number. If it's a bad sector or end-of-file mark, it's appropriately marked.
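A map like that is easy to sketch in Python once the FAT is a plain list. The EM label is the one described above. The BAD and EOF labels, the three-digit hex format and the function names are assumptions for this example, since the article doesn't show CHK's exact output.

```python
def map_entry(value):
    """Turn one FAT entry into a short map label (labels partly assumed)."""
    if value == 0:
        return "EM"                 # empty
    if 0xFF0 <= value <= 0xFF6:
        return "BAD"                # bad-cluster range quoted in the article
    if value >= 0xFF7:
        return "EOF"                # end-of-file range quoted in the article
    return "%03X" % value           # points to this next cluster

def disk_map(fat, per_line=8):
    """Return map lines for clusters 2 onward, `per_line` entries per line."""
    cells = [map_entry(v) for v in fat[2:]]
    return [" ".join(cells[i:i + per_line])
            for i in range(0, len(cells), per_line)]

for line in disk_map([0, 0, 3, 0xFFF, 0, 0xFF0]):
    print(line)
```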

What you want on a hard disk is mostly EM clusters toward the start of the disk, where your temporary files will be, and then used clusters toward the end.

For more information, and a technique to get back your hard disk's performance when it begins to slow down (as a result of this phenomenon), see "Restoring Your Hard Disk's Performance" in next month's ST-Log.

The cross-link blues (CHK -number).

Go ahead. Walk up to any IBM programmer, and say "Cross-linked file." Be prepared to catch them as they faint. Smelling salts are helpful.

I don't know how it happens. Maybe cosmic rays, maybe some Macintosh enthusiast with a voodoo doll projecting bad karma my way. But files can accidentally manage to use the same cluster. And, since a cluster can only point to one other cluster as the "next in the chain," your files are basically the same chain from that point on.

Then you've got a problem. You've got to get rid of all the files using that cross link (because, odds are, they're all screwed up). But, when you do the first delete that goes into the bad area, it'll delete all those clusters—and then, the next delete you do (on the second file linked in there), wham—DOS will become unhappy with you. That cluster is already deleted!

Plus. . .you may have multiple files, all ending up in that same chain. You need to know each and every file that's butchered, so you can at least know where to begin damage control.

For these, we humbly (well, all right, not all that humbly) offer -(number). Number is a hex (not decimal!) cluster number. You will be told all files that access that cluster. Normally, there will be none or one; that's okay. Two or more is bad news; best go check through and delete all those files.

When you get the dreaded cross link message, run CHK with the parameter -### on the cluster involved (like: C: -20A), and find out every file listed as owning that cluster. Then go delete them all.
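The search behind -(number) amounts to walking every file's chain and noting which ones pass through the cluster in question. Here is a Python sketch of that idea. The name owners_of and the dictionary of files are hypothetical, and CHK's real implementation may differ.

```python
def owners_of(fat, files, target):
    """Return the names of all files whose cluster chain includes `target`.

    fat: list of FAT entries; files: {name: first cluster}. Illustrative only.
    """
    found = []
    limit = min(len(fat), 0xFF0)  # stop at bad and end-of-file marks
    for name, first in files.items():
        seen = set()
        cluster = first
        while 2 <= cluster < limit and cluster not in seen:
            if cluster == target:
                found.append(name)
                break
            seen.add(cluster)
            cluster = fat[cluster]
    return found

fat = [0] * 16
fat[2], fat[3], fat[4] = 3, 4, 0xFFF   # file A: 2 -> 3 -> 4
fat[5] = 3                             # file B: 5 -> 3 -> 4 (cross-linked)
fat[8] = 0xFFF                         # file C: one cluster
files = {"A": 2, "B": 5, "C": 8}
print(owners_of(fat, files, int("3", 16)))
```

Two names in the answer is the bad news described above: both files share everything from that cluster on.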

Well, there you have it, ChkDsk, another Dan Moore utility for ST users.

You'll find yourself using ChkDsk almost daily to keep tabs on your system, especially if you have a hard disk. There, it is invaluable; there's no other tool like it available. It's a great feeling to run CHK on my 15-meg drive, with some 100 folders, and find out that everything's fine in the directory department; it also gives me confidence that the backups I do won't turn out to be corrupted. When I run CHK daily, I know exactly when any file problems occur, and can fix them.

If GEM and DOS were perfect, we wouldn't need CHK. Since they are not, CHK is the utility to recover any damage they cause.

The author of the CHK program is Dan Moore. Dan is more or less the Tom Scholz of computer programming, especially in C. (Who's Tom Scholz? He runs the band Boston, which just took seven years to produce one album; he's known as a perfectionist.)

Anyway, Dan wrote CHK in C, and it compiles and runs perfectly on the IBM PC. He then ported it to the ST, and, since CHK was machine independent, had no problems with the port; it came right over.

Lots of people write C code on the ST; very few write C code that ports directly to the IBM.

Hence, CHK is not only interesting in that it reads tables, FATs and directories directly (ideas you can use for yourself), but in its machine-independent coding techniques. If you're interested in porting your C code from the ST to the IBM, CHK is a great place to look.