Classic Computer Magazine Archive START VOL. 4 NO. 5 / DECEMBER 1989

BY DOUG WHEELER AND DAVID SMALL START CONTRIBUTING EDITOR

The Quest for Speed

Accelerator Boards and Software for the ST

One of the hottest trends in the ST market today is speeding up your ST. There are many ways to do it: disk accelerators, "software Blitters" and "16-MHz accelerators" are available today and doubtlessly more products are on the way. Dave and Doug's overview is designed to tell you what these products do and how well they do it, so that you can decide if they're for you!

There are really two ways to speed up the ST. Either the software can be improved with better programming, so that the programs run faster, or the hardware can be sped up, giving the same result. (Of course, the ideal situation is that both of these occur, for the maximum speedup.)

One thing we can't do for you is to identify just where your ST is being slowed down the most, and where you can do the most to speed it up for the least money. This varies from program to program. For instance, if you use a lot of graphics and animation, you ought to focus your attention on speeding that subsystem up, with a Blitter or software Blitter. If the disk is slowing you down, you need to work on the disk system with a hard disk, Twister, cache or RAM Disk.

Accelerating some part of your ST that doesn't need acceleration is like adding (let's say) a bigger fuel pump to a car whose fuel pump is already just fine; it won't do you much good. So as you read this article, keep your particular system in mind and think about where acceleration would help you the most.

Ways to Speed Up the ST Without Replacing the 68000
It is possible to speed up your ST without replacing the 68000 through either software or hardware. Software-only acceleration offers some interesting alternatives and often at the best price possible: free! Many of the alternatives given here are available for the price of a phone call to a local BBS, in a local user group's disk collection or in the massive BIX/Compuserve/GENie libraries.

Now if the programs you use are rewritten to be more efficient and run more quickly, that'll help speed up your machine. CAD-3D version 2.0 was rewritten to be much faster than version 1.0, for example. Sometimes this happens in the next version of a program, sometimes not. Alas, you're not in control of this process.

However, the programs you use often rely on features of the ST's built-in operating system that you can speed up. Two that you can most often help are the disk system and graphics displays. Most ST programs rely on the ST's operating system to run the disk drives and the graphics and you can change these modules to your advantage; thus, most programs will show an improvement. Let's discuss disk drives first.

Disk Drive Accelerators
If you spend a lot of time waiting on disks, then a disk drive accelerator is probably for you. It is unlikely that speeding up the 68000 will matter if you're waiting on the disk; all that will happen is that you'll wait 16 million times per second instead of eight million times per second!

The way to judge this is by how much time you spend waiting for the disk to do something (load/save a program, get data, store data), be it a floppy or hard disk. If you spend most of your time "number crunching" (for instance, recalculating a spreadsheet), then a 68000 accelerator will do more for you; if you spend your time copying files, a disk accelerator will help more.

Twister
If you use floppy disks, the first and best accelerator is the Twister format, which debuted here in START several years ago. Twister changes the way data is laid out on the disk, letting the ST access it up to twice as quickly as the standard ST format. Twister proved so successful that Atari adopted it in the Mega ST machine's operating system (TOS 1.2), so you may already have it! If not, Twister format is available from START in back issues, as an option in the shareware "DoubleClick Format 3.01" program and as an option in NeoDesk.

If you're on a low budget and cannot afford a hard disk, we recommend Twister or one of its derivatives for speeding up your floppy disks. The ST features a fine floppy system, capable of the maximum possible speed that any manufacturer can get out of a floppy, if you use Twister format. All that's required to use it is to format disks in Twister format; after that, reading and writing to them accelerates, particularly on large files.

RAM Disks
Another alternative is a RAM Disk. This converts a section of your ST's memory to a pseudo-disk that is very, very fast. Of course, there are some trade-offs: you lose anything in that "disk" when you power off or crash, and if you don't have much memory in your machine, you can't afford to give 400K to a RAM Disk. There are many RAM Disks available as either public domain or shareware, some of which are even Reset Proof (their data will survive your pressing RESET, but not powering off).




Cache Programs
A third way to accelerate disks is with a cache. A cache is a small, dedicated RAM Disk that remembers certain parts of a disk in memory, so that they don't have to be re-read; typically it remembers the directory area of the disk, plus the last few sectors read in. Depending entirely on the program, the result can range from no help at all to great acceleration. There are many cache programs available as public domain or shareware, and Atari now even has an official cache program.
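To make the idea concrete, here is a minimal sketch of a sector cache in Python. It is not how any particular ST cache program works; the least-recently-used eviction policy, the `SectorCache` class and the `fake_disk_read` function are all our own assumptions, chosen only to show why repeated reads of the same sectors get cheap.

```python
from collections import OrderedDict


class SectorCache:
    """Tiny least-recently-used sector cache (illustrative only)."""

    def __init__(self, read_sector, capacity=64):
        self.read_sector = read_sector  # hypothetical slow disk-read function
        self.capacity = capacity        # number of sectors kept in memory
        self.sectors = OrderedDict()    # sector number -> sector data
        self.hits = 0
        self.misses = 0

    def read(self, n):
        if n in self.sectors:
            # Already in memory: no disk access needed.
            self.sectors.move_to_end(n)
            self.hits += 1
            return self.sectors[n]
        # Not cached: go to the (slow) disk and remember the result.
        self.misses += 1
        data = self.read_sector(n)
        self.sectors[n] = data
        if len(self.sectors) > self.capacity:
            self.sectors.popitem(last=False)  # drop the least recently used
        return data


def fake_disk_read(n):
    """Stand-in for a real disk read; returns 512 bytes of dummy data."""
    return bytes([n % 256]) * 512


cache = SectorCache(fake_disk_read, capacity=4)
for sector in [0, 1, 0, 1, 2, 0]:   # directory sectors 0 and 1 are re-read
    cache.read(sector)
print("hits:", cache.hits, "misses:", cache.misses)
```

A program that keeps going back to the same few sectors (the directory, say) sees mostly hits; a program that streams through a large file once sees almost none, which is why the benefit depends so strongly on the program.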

Get A Hard Disk
We realize that a hard disk doesn't fit under software accelerators, but this is a logical place to mention one, since we're talking about disks. A hard disk will greatly increase your disk speed. Some good news for you: hard disk prices are constantly dropping. We recommend comparing prices here in START, at your local dealers, and by mail order for the best buy. We can strongly recommend, from personal experience, hard disk units from Berkeley Microsystems, ICD, and Supra (names are in alphabetical order); all three companies have been in the business a long time, have excellent support and good products. (There are other companies that make hard disks that we simply haven't seen yet; for instance, the new removable hard disks may be the next wave in hard disks.)

We can also say that for nearly all ST users, a hard disk is the first and best buy for speeding up your system overall and getting more out of your ST. We feel that even the 16-MHz CPU accelerator companies would agree with us that your money is best spent first on a hard disk, then on their accelerator; hard disks are typically ten times faster than floppies, while accelerators are typically 1.5 times faster than a regular ST. You can see where the big gain is to be made.
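A rough way to put numbers on this is the standard "Amdahl's law" calculation: the overall gain depends on what fraction of your time the accelerated part accounts for. The sketch below uses the ten-times and 1.5-times figures from above; the 70/30 split between disk waiting and computing is purely a hypothetical workload, not a measurement.

```python
def overall_speedup(fraction, factor):
    """Amdahl's law: whole-task speedup when `fraction` of the run time
    is made `factor` times faster and the rest is left unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical user: 70% of the time waiting on floppies, 30% computing.
disk_fraction = 0.7

hard_disk = overall_speedup(disk_fraction, 10.0)       # disk part 10x faster
cpu_board = overall_speedup(1.0 - disk_fraction, 1.5)  # CPU part 1.5x faster

print("hard disk:   %.2fx overall" % hard_disk)
print("accelerator: %.2fx overall" % cpu_board)
```

For this made-up workload the hard disk gives roughly 2.7 times overall and the CPU board roughly 1.1 times. Reverse the split and the ranking narrows, which is the whole point of finding your own bottleneck first.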

Don't underestimate the power of the ST; it is capable of scorching speed on hard disks. An ST hard disk, properly configured and with the right software, will outrun the hard disk of a 16-MHz Mac II!

If you do have a hard disk, though, and don't have the brand new TOS 1.4 ROMs in it, you'll find that as your hard disk fills up, it takes longer and longer to write to it. This is a bug in the ST's operating system, commonly called the FAT bug. There are three ways to fix it.

HARD DISKS: FIXING THE FAT BUG

Hard Disk Turbo Kit, FATSPEED, TOS 1.4
The first thing you can do to speed up a hard disk is to "defragment" the data on your hard disk. This moves portions of files together so that they can be accessed in the minimum time. Second, you can move the data on your hard disk so that new data is written towards the start of the hard disk; the Hard Disk Turbo Kit program (formerly Tuneup) from MichTron will do this quickly and easily. The next time you write, the difference may startle you; on a near-full 15 megabyte hard disk, disk-write times went from 45 seconds to six seconds flat!

Another alternative is the shareware FATSPEED.PRG program, which goes into your AUTO folder. This replaces the slow Atari FAT routine, and greatly accelerates writes to the hard disk. We strongly recommend this program to anyone with a hard disk and a version of TOS lower than 1.4.

TOS 1.4 has a rewritten FAT lookup routine that fixes this problem. (Don't use FATSPEED with TOS 1.4.)
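We haven't seen Atari's FAT code, so treat the following as a guess at the kind of behavior involved rather than a description of the actual routine. The sketch assumes a FAT that is searched from the very beginning for every new cluster, and compares it with one possible improvement (resuming the search where the last one left off). The names `find_free_cluster` and `cost_to_allocate`, and the table sizes, are all invented for the illustration.

```python
FREE = 0
USED = 0xFFFF


def find_free_cluster(fat, start=2):
    """Linear scan for a free entry. Returns (index, entries examined)."""
    for i in range(start, len(fat)):
        if fat[i] == FREE:
            return i, i - start + 1
    return None, len(fat) - start


def cost_to_allocate(fat, count, resume=False):
    """Allocate `count` clusters and return the total FAT entries examined.

    resume=False restarts every search at the top of the table (assumed
    slow behavior); resume=True continues after the last cluster found.
    """
    examined = 0
    start = 2  # entries 0 and 1 are reserved in a FAT
    for _ in range(count):
        index, steps = find_free_cluster(fat, start)
        if index is None:
            raise RuntimeError("disk full")
        fat[index] = USED
        examined += steps
        if resume:
            start = index + 1
    return examined


nearly_empty = [USED] * 100 + [FREE] * 9900
nearly_full = [USED] * 9000 + [FREE] * 1000

print("nearly empty disk:", cost_to_allocate(list(nearly_empty), 100))
print("nearly full disk: ", cost_to_allocate(list(nearly_full), 100))
print("nearly full, resuming search:",
      cost_to_allocate(list(nearly_full), 100, resume=True))
```

Under these assumptions, the same 100-cluster write costs dozens of times more table lookups on the nearly full disk, which would also explain why moving free space toward the start of the disk helps so much.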

Until we installed TOS 1.4, we used both FATSPEED and MichTron's product regularly and can strongly recommend both. One note for Hard Disk Turbo Kit users: please don't use it on a full hard disk. Give it a megabyte or more of workspace.

Replace The Mechanism
Finally, if you have a slow hard disk (say, the 65-millisecond Seagate ST-225 that's in most Atari-built units), don't neglect the speed advantage of a faster hard disk (for instance, a 28-millisecond 40-megabyte ST-251) and of using an RLL hard disk unit. While this is an option for the high end of ST users, as a general rule, the more data you accumulate on your hard disks, the more often you have to move it around; the faster the hard disk, the better. Usually replacement is a simple matter of moving the cables from one unit to another and reformatting.

This pretty well covers accelerating one major subsystem of the Atari: disk storage. Next, let's cover accelerating the graphics subsystem.


The CMI Processor Accelerator is actually more of an expansion board than
just an accelerator. You can add a 68881 math Co-processor, a Blitter chip
and replacement ROMs for fastROM access. Adding all of these chips runs
the price up to over $500, but for particular applications, it may prove to be
worthwhile. Without them, the performance improvement was approximately
that of the J.A.T.O. Board.

As with all these sections, please keep your individual needs in mind. For example, a software graphics accelerator does us little good if we're not doing graphics! A large percentage of our time is spent waiting on the computer to assemble and link yet another revision to source code and there is no benefit to that process from a graphics accelerator; only a faster hard disk or faster CPU will help. Yet for those times we are editing code, having a faster display in the text-editor window is well worth it.

Software Graphics Accelerators
There are two forms of graphics accelerators. One is software and the other is the Blitter chip. Let's take software first.

Whenever a program needs to show you something on the video screen, such as opening a window, it calls the Atari's built-in operating system, stored in permanent ROM (read-only-memory) chips inside the ST, to do this. The ROMs were written in a real hurry back in 1984/1985, are not optimally coded and have not been fixed. Thus, text isn't written to the screen as fast as it could be, line-draws aren't as fast as possible, and so forth. (By the way, if you want to see truly amazing speed out of routines written for maximum performance, try the Tempus text editor; it is "what could have been" if the ROMs had been optimized.)

If you install a replacement for these graphics routines, you'll see great performance increases. For instance, with a software accelerator, a file you Show on the screen can literally fly by too quickly to be read. Word processor screens speed up, particularly in scrolling, and animations speed up greatly.

There are two software graphics accelerators we are aware of. The first is TurboST, written by Wayne Buckholz, an assembly language wizard in Florida. It's sold by Soft-Trek (see the end of the article for a list of prices and addresses for all these products). When we first met Wayne, he was demonstrating a small circle-drawing program that plotted 600 circles per second, at varying radii; we were amazed! Wayne's taken his graphics and assembly language skills and applied them to the ST with excellent results. The current version of TurboST is 1.6; it fixes some bugs in previous versions, plus accelerates more of the slow ST graphics routines.

The other is Quick ST, written by Darek Mihocka, who is otherwise famous for his Atari 800 emulator for the ST, another feat of assembly language wizardry and programming for speed. Darek's emulator approaches the absolute theoretical limits an emulator can achieve! Quick ST is shareware; this means you download it or get it from a BBS for free, then send money ($15) if you like it and use it. Incidentally, the source code for Quick ST is also available; this is a very generous move on Darek's part and gives neophyte assembly programmers some terrific examples of ultra-optimized code. There are many tricks to high speed coding that are best learned by example.



Both of these programs come as desk accessories; to install them, you just put the .ACC file on your startup disk, be it floppy or hard disk. That's all. The difference will show up immediately the next time you do something to the screen; windows seem to pop open and snap shut, text scrolls by very quickly and graphics operations really fly.

There is some debate currently about which of these is faster, but from a user's perspective, this is just a debate over whether they're very good or very, very good. Either way is good news for the user! We recommend trying both and sticking with the one you like.

We use TurboST on all of our machines.

Blitter Chip
When Atari released its Mega series of computers, it added a special chip whose only purpose was moving things around in memory very quickly: the Blitter chip. (Some say Blitter stands for Bit-Block-Transfer.) In any event, if you're doing graphics animations, line drawing and so forth, the Blitter will definitely give you a speed advantage.

Atari had long announced that the Blitter would be available as an upgrade for 520/1040 owners. However, that promise was recently withdrawn; Atari said the conversion unit could not meet FCC specifications. There are now aftermarket manufacturers offering kits that let you put Atari's Blitter into an older 520/1040 machine.

If you have a Blitter, you will notice greater speed in such things as desktop operations (windows pop open and snap shut), text scrolling faster than you can read it, and so forth. When you have the Blitter turned on, Atari redirects graphics operations away from its old ROM graphics drivers to the Blitter, which does these operations in hardware.

However, the 68000 is a very, very fast processor. When properly programmed, it can outperform the Blitter in many operations! Hence, both Quick ST and TurboST have published benchmarks which show them outperforming the hardware Blitter.

Before being convinced that the Blitter chip is the way to go, we strongly recommend going to a local store and trying TurboST version 1.6 against the Blitter. We did this comparison and found that TurboST is generally faster. Again, this is an amazing feat of programming on Wayne's part.

If you have a Mega, then you have a Blitter already. There's not really much advantage to you in installing TurboST, since you have its hardware equivalent already!

We have both Megas and STs. On the non-Megas, we use TurboST; on the Megas, we sometimes use the Blitter, sometimes use TurboST and have noticed no difference between the two high speed modes.

Replacing the 68000
So far, we've covered disk drive and graphics acceleration. Now, let's cover speeding up the very center of the system: the 68000. (Incidentally, we obtained all of these boards to test compatibility with Spectre 128; they all work.)

Now you have to keep things in perspective. While accelerating the 68000 is nice, the truth is that with our present products, you don't even get twice the normal speed out of them. Now if what you are doing needs all the CPU help it can get, then fine, go with a 16-MHz accelerator and you'll find yourself sped up around one-third with the fastest board listed here.

But we'd like to caution you again that if your ST is slowed down by its disk drives, you need to handle that situation before accelerating the 68000 or you won't see much gain. Twister will double the speed of your floppies; a hard disk will go 10-30 times faster than floppies. Similarly, TurboST (or a Blitter chip) will do a lot more for your graphics, dollar for dollar, than any 68000 accelerator; they'll go 5-30 times faster than the old ST graphics routines.

Here's a quick way to judge. We call it the "toe tap" test. Do you find yourself tapping your toes, waiting, on:

  • A floppy disk to finish loading or saving programs/data? If you're librarian for a computer club and copy disks a lot, you spend lots of time like this.
  • A graphics animation to finish? (Other than number-crunching animations, like the Cyber series or ray-tracers.)
  • The computer to get done calculating something and finish? A spreadsheet working away would be a fine example here.

If it's the first, work on your disk system. If it's the second, work on your graphics system. If it's the third, add a CPU accelerator.

Now, assuming you need to accelerate the CPU itself, let's examine the 16-MHz accelerators on the market.

16-MHz Accelerators
Do these accelerators run at 16 MHz? Well, no. They run at a mixture of eight and 16 MHz. Some average possibly 9 MHz; one averages possibly 12 MHz.

To explain how they run, we need to tell you what a "MHz" is (of course), and talk about machine cycles. We're deliberately keeping the theory short and in English here, so people who aren't into technotalk can still understand it.



The ST does everything in machine cycles: it stashes one word of data somewhere, executes one part of a 68000 instruction, or sends a certain number of dots to the video screen. The entire ST is designed around eight million cycles per second (cps), also called 8 MegaCycles. In commemoration of Hertz's work in electromagnetics, the cycle-per-second unit was renamed the "Hertz", so today we speak of "MegaHertz", or "MHz"; the ST is thus called an "8-MHz" machine. This is the same "MHz" you see in other computers' advertisements.

Eight million of these happen every second that your ST is powered on. Quite some performance!

Usually, the most used thing in a computer is its internal read/write memory, and the ST is no exception. Only one thing can use memory at a time, period.

Video: The Biggest Memory Hog
The biggest user of memory is video, believe it or not. The ST handles video by remembering what's on the video screen in "video memory." This is the same old read/write memory, just like any other, except that whatever is there is also displayed on the TV screen. This process happens over and over again, 60 or 70 times per second (depending on monitor type).

Generally, the biggest problem computer designers face is this video memory; for, once you start up a TV picture, you can't wait. You have to pump out the video signal very quickly according to the TV's needs. While the 68000 processor can easily wait, the video can't wait, so in other computer designs, the 68000 waits for memory while the video is active.

Now on an ordinary computer, since video has to have priority this would mean that the 68000, which is trying to use memory, has to wait while video happens. The ST's designers did a very clever trick to get around this.

First, they began with 16-MHz read/write memory (RAM) chips. As the number implies, these chips are capable of 16 million machine cycles per second. Then the designers devoted each even cycle to the video (at 8 MHz) and each odd cycle to the 68000 (at 8 MHz), back and forth. Hence, the 68000 and video share memory, yet don't trip over each other, nor is one forced to wait for the other.
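The even/odd arrangement can be written out in a few lines of Python. This is only a bookkeeping sketch of the sharing scheme as described above, not a model of the real memory controller; `slot_owner` and the other names are our own.

```python
RAM_SLOTS_PER_SECOND = 16_000_000  # the 16-MHz RAM described above


def slot_owner(slot):
    """Even memory slots go to video, odd slots go to the 68000."""
    return "video" if slot % 2 == 0 else "68000"


# The first few slots, to show the back-and-forth pattern.
schedule = [slot_owner(s) for s in range(8)]
print(schedule)

# Over any long run, each side gets exactly half of the slots.
sample = 1000
cpu_share = sum(1 for s in range(sample) if slot_owner(s) == "68000") / sample
cpu_slots_per_second = int(RAM_SLOTS_PER_SECOND * cpu_share)
print("68000 memory slots per second:", cpu_slots_per_second)
```

Half of 16 million is eight million, so each side sees what looks like its own private 8-MHz memory.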

This is one of the big reasons the ST is a low cost, high performance machine; the 68000 and video stay out of each other's way.

Now, the question you're probably asking is, how can a 16-MHz accelerator work? Memory is half taken up by video, so the fastest the 68000 can get to memory is 8 MHz. And you're absolutely right; we're limited to 8 MHz.

Also, other chips that the 68000 periodically talks to, such as the video shifter, DMA Controller and whatnot are all limited to 8 MHz; try to run them faster, and they make mistakes. The entire ST was designed around 8 MHz, and the designers saw no point wasting money by making some parts able to run faster.

So What's a 16-MHz Accelerator?
So what is it that the "16-MHz accelerator" companies are selling?

Generally, they're selling the ability to take a 16-MHz 68000 and run it in the ST's 8-MHz environment. In a few situations, the 16-MHz chip will be unleashed and be able to run full speed; you'll notice big acceleration then. But most of the time, the 68000 will be chained to 8 MHz, forced to slow down to live in the 8-MHz ST.

For instance, if the 16-MHz 68000 tries to go to memory at a time when video is using it, the 68000 "wait states", or temporarily stops. Next cycle, it tries again. Video will have freed memory, so the CPU can continue. But the net result is you get 8-MHz performance.
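Here is a toy model of that waiting, in Python. It is deliberately oversimplified and not cycle-accurate: we pretend every step of a program is either a memory access ("M") or internal work ("I") and that each takes one 16-MHz tick, which a real 68000 does not do. The function names and the sample workloads are invented for the illustration.

```python
def ticks_stock(ops):
    """Stock 8-MHz 68000: every step takes two 16-MHz ticks."""
    return 2 * len(ops)


def ticks_accelerated(ops):
    """16-MHz 68000 sharing RAM with video.

    Internal steps take one tick. Memory steps can only happen on odd
    ticks, because even ticks belong to video; arriving on an even tick
    costs one extra tick of waiting (a wait state).
    """
    tick = 0
    for op in ops:
        if op == "M" and tick % 2 == 0:
            tick += 1  # wait state: video owns this slot
        tick += 1
    return tick


def speedup(ops):
    return ticks_stock(ops) / ticks_accelerated(ops)


workloads = {
    "all memory": "M" * 1000,
    "two memory, one internal": "MMI" * 300,
    "all internal": "I" * 1000,
}
for name, ops in workloads.items():
    print("%-26s %.2fx" % (name, speedup(ops)))
```

In this model a program that does nothing but touch memory gains nothing, one that never touches memory doubles in speed, and mixtures land in between. Real programs lean heavily toward the memory-bound end, which is consistent with the modest gains reported here.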

The cold reality of all this is that two of the three "16-MHz" accelerators give you around 1/10th better performance (9-MHz ST) than a stock ST and the third gives you around 1/3 to 1/2 better performance (12 MHz), depending on what software you use.


At the top is a prototype of Fast Technology's Turbo16 board and below it is
the production version. By using surface-mount technology, Fast was able to
shrink the board so that it's very little larger than the 68000 socket, and most
of the chips sit under the 68000 itself! The Turbo16 board proved to be the best
performer of the three, but performance was highly application-dependent.

Even getting these marginal performance improvements is a hardware engineering miracle; the people doing it are the equivalent of Wayne Buckholz or Darek Mihocka in hardware.

JRI's JATO Board
The first of the 16-MHz accelerators was the J.A.T.O. board from John Russell Innovations. We have to confess to some bias on our part with John: we know him personally and like him.

Now the technical terms here involve things like "bus cycle", "data window", "chip select", and "wait state", and are not understandable to anyone except a fresh Computer Sci graduate. So we'll translate to English; pardon if it's not technical enough. We honestly don't feel that passing along a great deal of technical talk will help you in your decision about purchasing these products; performance figures will.

John noticed some slop in the ST's timing: a way to sneak an added memory cycle to the CPU every now and then, when video wouldn't mind. It works primarily on RAM chips that are rated at 120 nanoseconds or faster; fortunately, the majority of the 150-nanosecond RAMs in ST machines are pessimistically rated. So John whipped up a prototype board that gave the ST about a 12 percent speed increase (1/8th).

John was busy with other things: his 4096-color board and the Genlock. His friends asked him why he didn't market this accelerator board. He shrugged and said that, well, the performance increase isn't awesome, but it isn't costly, either. So the unit ended up being priced at $99, which is a very fair price.

Physically, the JATO board looks like a 68000 with a little daughter board glued on top. To install it, you unsolder and remove your 68000, then solder in a socket and plug in the JATO board. As a neat add-on, John added an LED and shut-off switch; the LED lights whenever the board is accelerating something.

Performance-wise, the JATO board gives you a barely visible increase in speed, around 1/10th, depending on what you're doing. However, balancing this is its low cost.

If you're on a budget, this is the obvious choice in CPU accelerators.

Creative Microsystems, Inc. Processor Accelerator Board
The next entry into the 16-MHz sweepstakes is the CMI board.

We don't know all the technical details of the CMI board; those are trade secrets and we wouldn't give them out if we'd designed it either! Apparently, they are not using the RAM-timing trick John Russell used; they are, however, getting a little more performance out of the ST ROMs than normal. (The ROMs hold GEM and other things.)



Installing the CMI board isn't too difficult. You can either remove your old 68000 or clip some of its leads and solder on top of it (harder, in our opinion). In addition, you need to run three wires to your ST circuit board.

Once again, we see around a 1/10th performance increase. However, the price is a bit steeper than the John Russell board: $299. Now, of course, you'll say, wait a minute, that doesn't seem like much of a deal! Well, there's more to the story. The CMI board also gives you sockets to plug in various things to help your ST's performance. The board itself is physically much bigger than the other two, to hold the circuits that make these sockets work.

When we talked with them, CMI's staff were honest about their board's acceleration performance: it wasn't great. What they emphasized were the expansion ports on their board; they consider it primarily an expansion board, not an accelerator board.

So if you purchase the CMI board by itself and add nothing to it, you're probably better off getting a JATO board; the performance increase will be about the same. However, the add-ons are interesting.

Blitter. The CMI board lets 520 and 1040 owners plug in a Blitter chip. And since Atari won't market a Blitter add-on, at this time CMI is your only way to get a Blitter.

Before you go and buy one though, see the above discussion of Quick ST and TurboST, which outperform the Blitter - in software. Quick ST is free to try and TurboST doesn't cost much. Also, there is some question as to whether or not you can even purchase a Blitter from Atari. Some after-market sources have a few, but Atari is ultimately the source of these chips. If you do decide to go with the CMI board, be sure that you can get the Blitter first.

Naturally, if you install a CMI board with Blitter, you will get performance the same as a Mega with a Blitter; screen draws and animation will improve. You'll also get the 10% or so speed increase that the raw CMI board gives you. (We have to say "or so" because the speed increase occurs in some places and not in others, but it's a fair-enough average.)

fastROM expansion. FastROM expansion requires you to either copy the Atari ROMs to (expensive) big EPROMs and mount them on the CMI board's sockets or to bend up a couple of pins on each of the six ROMs in your system and run wires to them.

Here things get a little technical. Inside the Atari you have 192K of permanently stored program in read-only-memory (ROM). This is different from read-write memory (RAM), into which programs are loaded from disk. ROM is there forever and contains GEM, drivers for things like serial ports and disks and so on.

The idea behind fastROM is to run these ROM chips as quickly as possible. Right now, they're run a bit more slowly than is necessary; if you can run them faster, you'll see more performance from your ST. Ah, but when? Whenever you're running from ROM rather than RAM. So the question becomes, how often does a program run from ROM versus RAM?

Unless they're continually drawing things, programs do not continually use the ROMs; they only drop into it every now and then, such as when you pull down a menu, select a file name or open a window. Most mainline operations of a program are in main system RAM and are loaded off disk. Here, the ROM acceleration will not help.

If you run a program that uses the ROMs heavily then yes, you will see a considerable improvement. The problem is that most programs don't use the ROMs this way; the ROMs are far too inefficiently coded to allow it. For example, if you're doing high speed animation, you don't call the ROM line-draw routines; you draw the lines into video memory yourself. Note that once again Quick ST and TurboST enter the picture; they redirect the ROM drawing routines to better-written RAM routines, which would not be accelerated with fastROM.

So we can positively say that the net result of fastROM is that ''your mileage will vary.'' It is possible to write programs that will show both big speed increases and almost no speed increases from fastROM; you'll probably fall somewhere in the middle, depending on what you do. Your Desktop operations will probably speed up some; when you enter a program, you'll probably slow back down. Beyond that, we can't say for sure.

68881 Floating Point co-processor. This requires that you purchase a 68881 FPU chip. The 68881 chip is a Co-processor; it works alongside the 68000 to do floating point math at very high speeds. It far outruns the floating point math available for the 68000.


The diminutive J.A.T.O. Board from John Russell Innovations attaches directly
to the 16-MHz 68000 chip. Like the other two accelerators, the J.A.T.O. Board
requires you to unsolder your present 68000 and solder in a socket. While
performance of the J.A.T.O. Board was not the best tested, its price was by
far the lowest. It's a good value.

The problem is that the ST hasn't really had 68881 facilities available for it and there are few STs with 68881s in them. Thus, very few programs have been written to take advantage of it. And here's the catch: unless the program is specifically coded to use the 68881, the 68881 doesn't accelerate anything. You get no speed increase at all.

The one commercial software package we know of that uses this chip is ISD's DynaCADD. There are a couple of public domain programs that also use this chip, such as ray tracers.

When using this chip, you'll note a big increase in floating point speed, 10 to 30 times; the problem is in finding any software that will use it!

Our opinion? By itself, the CMI board's acceleration isn't worth the $299, especially with a $99 competitor doing just as well. If you want a Blitter badly, this is the way to get one; however, you will be equally served by software Blitters. The fastROM expansion is just too unpredictable for us to have any opinion on. As for the 68881, if you're running DynaCADD or your software specifically says that it takes advantage of a 68881, then we'd seriously consider this option. If not, you will gain nothing and spend some serious money on the chip.

FAST Technologies Turbo16
The third board to arrive at our lab was the $299 Turbo16. This is a board as small as the JATO board; all the chips fit underneath a 16-MHz 68000, surface-mounted. There are a number of chips all packed in there.

We weren't expecting much from the Turbo16, since the last board that size we'd seen was the modestly priced J.A.T.O. Installation involved a socket for the 68000 (again) and we ran the one required lead to a 16-MHz signal source (it's not a big deal).

Our first clue that we were wrong to prejudge the board was in opening and closing windows. They popped open and shut as though a Blitter were active or TurboST was loaded. (We then loaded TurboST on top of the Turbo16 and things became too fast to believe!)

A few benchmarks confirmed that there was between 30 and 50% speed improvement!

Nor was this a special-case speed improvement. Our normal use of the ST is for ST programs (such as Tempus, Microsoft Write and Calamus) and in Spectre development, where we spend forever in assemblers and linkers. Tempus, which is already quick, became almost unusably quick. Spectre assembly/link time dropped by a third. We even saw a 30% increase while in Mac mode under Spectre emulation!

In short, here was an accelerator that made a fairly big difference.

To this day, Turbo16 boards have lived in our STs; it looks like a permanent installation. We can give a no-problems report after two months of use. (One minor bug that affected Calamus and Tempus somewhat has since been fixed; our second unit shows this.)

The Turbo16 uses a new idea, a "high speed memory cache". On board the Turbo16 is 32K of high speed static RAM chips. A cache works like this: whenever the CPU needs to read from memory, the Turbo16 first makes it ask the high speed cache if the cache has that memory value stored there. If so, the cache supplies it to the CPU and the CPU moves right along at 16 MHz, not bothering with ST memory and with the 8-MHz video speed limit. If the memory value is not there, the value is loaded from main memory at 8 MHz and also stored in the cache, for next time. Next time through, the CPU gets it at 16 MHz.

The idea is this: Most programs spend a lot of time in loops, doing calculations, screen memory moves or whatever. If the loop's instructions are in the cache, there is no need for the 68000 to slow down to 8 MHz; instead, it runs at 16 MHz the whole time. 32K of cache memory is enough to capture most big loops and give big performance.
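To make the loop argument concrete, here is a minimal sketch of a cache in Python. It is a toy: the direct-mapped layout, the one-word slot size and the 4K loop are our assumptions for illustration, not a description of how the Turbo16 is actually wired. Only the 32K size comes from the board itself.

```python
# Toy direct-mapped cache model. The organization, slot size and access
# pattern are illustrative assumptions, not the Turbo16's actual design.

CACHE_BYTES = 32 * 1024      # 32K of cache, as on the Turbo16
WORD_BYTES = 2               # the 68000 fetches 16-bit words
NUM_SLOTS = CACHE_BYTES // WORD_BYTES


class ToyCache:
    def __init__(self):
        self.slots = [None] * NUM_SLOTS   # each slot remembers one address
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return True on a cache hit, False on a miss (which fills the slot)."""
        slot = (address // WORD_BYTES) % NUM_SLOTS
        if self.slots[slot] == address:
            self.hits += 1        # served from the fast static RAM
            return True
        self.misses += 1          # fetched from ST memory, then remembered
        self.slots[slot] = address
        return False


def run_loop(cache, start, length_bytes, passes):
    """Fetch every word of a loop body `passes` times."""
    for _ in range(passes):
        for address in range(start, start + length_bytes, WORD_BYTES):
            cache.read(address)


cache = ToyCache()
run_loop(cache, start=0x10000, length_bytes=4096, passes=100)
hit_rate = cache.hits / (cache.hits + cache.misses)
print(f"hits={cache.hits} misses={cache.misses} hit rate={hit_rate:.2%}")
```

In this model only the first pass through the loop misses; every later pass is served entirely from the cache, which is the behavior the paragraph above describes.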

Now, of course, your mileage will still vary. We can write programs that will deliberately upset the cache and then the performance of the Turbo16 will be ST-like. But in the real world, this doesn't seem to happen; most commercial software we run shows, on average, a 1/3 increase in speed.
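A back-of-the-envelope model shows why the gain depends so heavily on how often the cache hits. The sketch below assumes, crudely, that a cache hit takes half as long as a trip to ST memory (16 MHz versus 8 MHz) and that nothing else matters. Real bus timing is more complicated, and the function name and cost figures are ours, so treat the output as an illustration rather than a prediction.

```python
# Crude model: a cache hit costs half as long as an access to ST memory.
# The costs are assumptions for illustration, not measured Turbo16 timings.

def relative_speed(hit_rate, hit_cost=0.5, miss_cost=1.0):
    """Speed relative to a stock 8-MHz ST for a given cache hit rate."""
    average_cost = hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost
    return miss_cost / average_cost


for hit_rate in (0.0, 0.5, 1.0):
    print(f"hit rate {hit_rate:.0%}: {relative_speed(hit_rate):.2f}x")
```

With no hits at all (a program that deliberately upsets the cache), the model gives stock ST speed; with every access hitting, it gives the full doubling.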


We have had no compatibility problems with Turbo16 or with any of the boards, for that matter; the one small bug Turbo16 had with a couple of programs has since been fixed, and it wasn't serious.

Benchmarks
As a general rule, you should always take benchmarks, particularly those in ads, with several large grains of salt. A benchmark can be written for nearly all of these boards that will show large speed increases; the problem is that you don't run benchmarks on your ST all day.

We tried to pick a variety of software that would reflect what an average ST user does to give benchmark timings. We also tried to equalize "everything else", such as disk, Blitter and so forth during the benchmarks.

We then also ran the Quick ST Benchmarks. These test a variety of ST operations. However, Quick ST Benchmarks are sensitive to things like moving the mouse, so add another brick or so of salt to these results. We also don't have source code to find out exactly what a "CPU Register" test is.

In general, though, the tests reflect what you'd expect. Nevertheless, the three-grains-of-salt rule applies. Don't expect to get these numbers unless you're running on equipment identical to ours. Benchmarks have been the subject of endless debate in computer circles and we don't see any sign of it coming to an end this century.

68881 FPU: Expect a ten to thirty times increase in speed in whatever floating point operations you do. If you don't specifically call the 68881 in your code, expect nothing.

Our test system consisted of a Mega ST4 with TOS 1.4, Mono Monitor and one double-sided floppy drive. The fast-ROM option of CMI was not enabled and, for the Quick ST Benchmark results shown in Figure 1, the Blitter was off. Figure 2 shows the same benchmarks with the Blitter on and Figure 3 shows results from some "Real World" tests.
 
 
Figure 1. Quick ST Benchmarks, Blitter off

              8 MHz   CMI     JATO    T16
CPU Memory    100%    100%    100%    135%
CPU Register  100%    100%    100%    204%
CPU Divide    100%    182%    182%    203%
CPU Shifts    100%    179%    179%    207%
DMA Read      100%    181%    181%    181%
GEMDOS I/O    100%    100%    98%     100%
BIOS Text     100%    106%    121%    149%
BIOS String   100%    105%    118%    141%
BIOS Scroll   100%    100%    106%    113%
GEM Draw      100%    104%    116%    150%

Analysis: We see the T16 in general ahead on these benchmarks. Disk speed is unchanged (DMA Read / GEMDOS I/O); since that's dependent upon the disk, rather than the CPU, that's to be expected. The improvements in the BIOS (i.e., ROM) output tests reflect the expected 1/8th increase in the J.A.T.O. board from the RAM timing change. They also reflect the Turbo16 cache improvements.
 
 
Figure 2. Quick ST Benchmarks, Blitter on

              8 MHz   CMI     JATO    T16
CPU Memory    100%    100%    100%    135%
CPU Register  100%    100%    100%    204%
CPU Divide    100%    182%    182%    203%
CPU Shifts    100%    179%    179%    207%
DMA Read      181%    181%    181%    181%
GEMDOS I/O    100%    100%    100%    100%
BIOS Text     110%    115%    128%    155%
BIOS String   105%    110%    122%    144%
BIOS Scroll   132%    134%    137%    140%
GEM Draw      133%    137%    145%    190%

Analysis: Once again, the CMI places third, the J.A.T.O. second and the Turbo16 first, although in general all figures are improved. Note that if the CMI board with Blitter were tested against another accelerator without Blitter, the figures would be skewed.
 
Figure 3. "Real World" Tests, Blitter on

                    8 MHz    CMI      J.A.T.O.  T16      Units
John Walker         44.59    38.01    37.93     24.31    Sec (100 iterations)
Ray-Trace                    +17%     +18%      +83%
HiSoft BASIC        1:37     1:33     1:33      1:03     Min (2403 lines @ 75K)
Compile                      +4%      +4%       +54%
MS Write Load 30K   9.35     9.31     9.26      8.96     Sec
                             +.5%     +1%       +4%
Search/rep.         1:19.7   1:15.9   1:15.8    0:58.6   Min ("e" with "x")
                             +5%      +5%       +36%
CAD-3D 2.01         6.55     5.89     5.64      4.14     Sec (draw Superview)
Stonehenge                   +11%     +16%      +58%
Torus               39.09    35.16    36.45     22.39    Sec (create)
                             +11%     +7%       +75%
Faucet              21.07    20.16    19.56     16.66    Sec (load and display)
                             +5%      +8%       +26%
ARC.TTP             1:43     1:42     1:41      1:10     Min (2 files @ 58K)
v.5.21b                      +1%      +2%       +47%
Calamus Print       1:01     0:57     0:57      0:35     Min
                             +7%      +7%       +74%
Average increase:            +7%      +8%       +48%

Analysis: The "real world" results pretty clearly show the J.A.T.O. and CMI boards even in pure acceleration while the Turbo16 cache makes a big difference. Clearly Calamus, the ray-tracer, and CAD-3D 2.01 are using loops which fit within the 32K cache and thus remain in high speed memory. Operations which were slowed by the disk were slowed across all boards (MS Write load, for instance), and in places the cache did not work that well (CAD-3D Faucet).
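For readers who want to check the percentages in Figure 3 or apply the same yardstick to their own timings, here is a short Python sketch. The formula (stock time divided by accelerated time, minus one) is our assumption; it is not spelled out with the table, but it reproduces the printed figures for the rows shown. The helper names are ours.

```python
def to_seconds(text):
    """Convert 'M:SS', 'M:SS.s' or plain seconds to a float."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(text)


def percent_increase(base, accelerated):
    """Speed increase, in percent, of `accelerated` time over `base` time."""
    return round((to_seconds(base) / to_seconds(accelerated) - 1.0) * 100)


# Timings copied from Figure 3: (stock 8 MHz, CMI, J.A.T.O., Turbo16)
rows = {
    "Ray-Trace": ("44.59", "38.01", "37.93", "24.31"),
    "HiSoft BASIC Compile": ("1:37", "1:33", "1:33", "1:03"),
    "Calamus Print": ("1:01", "0:57", "0:57", "0:35"),
}

for name, (base, *boards) in rows.items():
    gains = [percent_increase(base, t) for t in boards]
    print(name, gains)
```

Note that a time that drops by about 45% (44.59 to 24.31 seconds) shows up as an 83% speed increase, so be sure you know which of the two an ad is quoting.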

Conclusion
We recently decided to upgrade a group of STs, both Mega and non-Mega, to give the best performance; we were writing large programs and the performance was really becoming a problem. This article reflects the knowledge we gained doing that upgrade.

  • To accelerate the disk system, we went to TOS 1.4. Before that, we used FATSPEED.PRG to fix the FAT lookup problem in the earlier ROMs. We also put in fast hard disk mechanisms; anything below a 28-millisecond seek time is fine, and we strongly recommend RLL if you can get it at 1:1 interleave. (The OMTI 3527 controller is a real winner in the RLL competition; check with ICD for availability.)
  • To accelerate graphics, we installed TurboST. While we tried Quick ST, it is just not quite as fast as TurboST, although both were very good. As it turned out, this was a good move; the Blitter chips on some of the Megas would not function with the Zax in-circuit emulator in the lab, so we had to leave the Blitters off.
  • To accelerate the CPU, we installed the Turbo16 board from Fast Technology for an average speed increase of approximately 1/3.
We don't expect to see faster Ataris until the TT becomes available or until someone makes a 68030 with RAM expansion card for the ST.

As you can see, the entire question of accelerating the ST is a complex one and there are many options. The options we selected turned out not even to be the most expensive, and could have cost less; if we'd been on more of a budget, we'd have used Quick ST, for example. The Turbo16 board is far less expensive than the CMI board with optional chips installed (around $549). If the accelerator had been a budget item, the $99 JATO board would have been the clear choice.

Hard disk prices vary so much and are dropping so fast that it's nearly pointless to recommend one; the price will have changed by the time this is printed. We'll only recommend a 1:1 interleave RLL unit.

Married to the fabulous Sandy Small, START Contributing Editor Dave Small is the sire of a wonderful family, and of Spectre 128 and GCR. Doug Wheeler works with Dave at Gadgets by Small. Doug is a GEnie Sysop and widely known for his GDOS expertise. This is his first appearance in START.

PRODUCTS MENTIONED

TurboST 1.6, $34.99. SofTrek, P.O. Box 5257, Winter Park, FL 32793, (407) 657-4611.

Quick ST, $15. Darek Mihocka, Box 2624, Station B, Kitchener, Ontario N2H 6N2, Canada, (519) 747-9452 or on CompuServe as 73657,2714, GEnie as DAREKM, Delphi as DAREKM and BIX as darekm.

JATO Board, $99.95. John Russell Innovations, P.O. Box 5277, Pittsburg, CA 94565, (415) 458-9577.

Processor Accelerator Board, $299. Creative Microsystems, Inc., 19552 SW 90th Court, Tualatin, OR 97062; (503) 691-2552.

Turbo16 Board, $299. Fast Technology, 14 Lovejoy Rd., Andover, MA 01810, (508) 475-3810.