IBM images. Will Fastie.
A year's worth of columns has generated a steady flow of mail much, I think, to the postman's chagrin. The letters vary, of course. About half are just chit chat, or a listing of a favorite program, or a clever(?) one-liner. The rest are questions, most of which do come with a return envelope. I have received letters from all of the countries in Western Europe and Great Britain; from Egypt, Israel, and Saudi Arabia; from South Africa; from Canada; and from Chile, Venezuela, Argentina, and Brazil. The international respondents are usually apologetic about not sending a stamp, but they quite often send coins. Josh's collection, begun modestly as a result of my own infrequent travels, is growing.
Two questions are asked frequently. This column is about those two questions. One I can dispose of quickly, but the other consumes the bulk of this column and, depending on my mood during the holidays, may chew up some of next month's as well. First, the easy one.
The CompuCable Friction Feed Kit
Of all the products I have mentioned, none has aroused more interest than this one. It is a kit of parts that converts an Epson MX-80 printer (or IBM 80 CPS printer) from a tractor-driven, continuous-forms printer to one that can take cut sheets.
The first half of the question is "Who is the manufacturer and where are they?" Answer: They are CompuCable Corporation, and they are located at 1440 South State College Blvd., Suite 6-J, Anaheim, CA 92806. The phone number is (714) 635-7330.
The other half of the question is "How are you doing with yours, and do you still like it?" Answer: Just fine, thanks, and I do still like it. For a variety of reasons, I would prefer an Epson MX-80 F/T instead, but I think the CompuCable kit is a reasonable alternative for those of us who bought either the IBM or a pre-F/T Epson. I would like to be able to adjust the tractors for other paper widths, but I haven't really had the need.
I find single sheets a little difficult to feed. Practice is required until you get the hang of it. Envelopes do not exactly fit, since they are wider than the platen and thus overlap the pinwheels. To do envelopes, I have a special addressing program (see Listing 1). I insert the envelope, type in all the lines of the address, correct them if necessary, and initiate the print. The program then delays for several seconds so I can put extra pressure on the rollers. It is awkward, but again, infrequent.
The only problem I will report has nothing to do with cut sheets. The modified printer has a slight tendency to wrap continuous paper around the platen. This contingency is provided for in the design of the kit, but it happens once every hundred pages anyway. The problem appears to be that the kit is less effective at lifting the sprocket holes off the tractor pins than the original Epson equipment.
I still do recommend this product. It certainly has been worth having, and the minor inconveniences are far outweighed by the function for which it was intended.
Getting At BIOS, et al.
The hard question I am regularly asked is how system functions can be accessed from the higher level languages. The majority of writers ask about IBM Pascal, which unfortunately does not provide access to many of the functions that even Basic supports well. For example, there is no Pascal analog for the Basic statement CLS, which clears the screen. There is no way to make specific IBM DOS calls, or to gain access to the ROM BIOS.
Even with Basic, there are still features that cannot be had, and I have been asked about interfacing both Basic and Compiled Basic to the system. This month I will show you how to do it with Pascal, and at some later time we will take up Basic.
How The System Works
To understand the programs I am presenting this month, you must understand something about the way the PC works. Here is a nutshell explanation.
The PC is delivered with some programs in permanent memory, usually referred to as "read-only memory" or ROM. The ROM contains three sections of code: Cassette Basic, diagnostics, and the "basic input/output system" or BIOS. The BIOS is very important, because it provides the basic services that make the PC run. For example, it contains the program that decodes the keyboard and translates keyboardspeak into ASCII characters. More to the point for my readers, it contains a large chunk of code to manage the video displays, both monochrome and graphic. Access to these routines is achieved through the use of interrupts.
Usually we think of an interrupt as something caused by an external event-- for example, the depression of a key on the keyboard. The IBM PC (using Intel parts) allows the processor to ignore quiescent devices most of the time, attending to them only when they raise their hands and yell for help. The yell is the interrupt, which causes the CPU to stop what it was doing and transfer control to a program that knows what to do with that particular kind of interrupt. When that program is done, the processing of the original program can resume.
The 8088 includes an interrupt instruction (INT) which has precisely the same effect. In fact, the instruction can be used to simulate the behavior of a device even if the device is not present in the system. When used in a program, the instruction is similar in behavior to the more familiar subroutine call. Except for one thing.
The INT instruction requires that a single specific location in memory contain the address of the routine to which control is to be transferred. A table of addresses, one for each of the 256 interrupts supported by the processor, is maintained in RAM memory at absolute locations 00000H to 003FFH. The CALL instruction, in contrast, allows its address to be virtually anywhere, imposing no restrictions.
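The table layout can be sketched in a few lines of Python. This is only a model, not one of the column's listings: it assumes the standard 8088 arrangement of four bytes per entry (a 16-bit offset followed by a 16-bit segment), and the function names and the handler address F000:1234 are invented for illustration.

```python
import struct

ENTRY_SIZE = 4          # each entry: 2-byte offset, then 2-byte segment
VECTOR_COUNT = 256      # interrupts 00H through FFH


def vector_address(int_no):
    """Absolute address of the table entry for interrupt int_no."""
    if not 0 <= int_no < VECTOR_COUNT:
        raise ValueError("interrupt number must be 0..255")
    return int_no * ENTRY_SIZE


def read_vector(memory, int_no):
    """Return (segment, offset) stored in the entry for int_no."""
    base = vector_address(int_no)
    offset, segment = struct.unpack_from("<HH", memory, base)
    return segment, offset


# A stand-in for the first 1K of RAM, with a made-up handler address
# (F000:1234, purely illustrative) planted in the entry for INT 10H.
memory = bytearray(VECTOR_COUNT * ENTRY_SIZE)
struct.pack_into("<HH", memory, vector_address(0x10), 0x1234, 0xF000)

print(hex(vector_address(0x10)))     # entry for the BIOS video interrupt
print(hex(vector_address(0xFF)))     # the last entry starts at 3FCH
print([hex(v) for v in read_vector(memory, 0x10)])
```

Because 256 entries of four bytes each occupy exactly 400H bytes, the table ends at 003FFH, which agrees with the range given above.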
But so what? Why does this facility bring so much power to a program? In general, it does not. However, it is especially useful for operating systems and is extremely valuable when some of the programs to be called reside in ROM.
Consider this. Suppose IBM found a critical bug in the video driver code in the ROM BIOS. Straight away they fix it and start delivering machines with new versions of the ROM. But what about you and me? Because the video code is accessed through an interrupt, IBM could supply a program that would modify the contents of the position of that interrupt in the table to point instead to the fixed version in normal (RAM) memory. In fact, that is exactly what happened with the Basic bug. Cassette Basic also had the bug, which meant the bad code was in ROM. IBM distributed a new disk version of Basic which did not have the bug. How? By plunking a repaired version of the code in main memory and adjusting the interrupt table to point to it.
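The idea of repairing ROM code by repointing a table entry can be modeled roughly as follows. This is a Python sketch, not IBM's actual patch: the two "video" routines are hypothetical stand-ins, and a dictionary plays the part of the interrupt table.

```python
def rom_video(char):
    """Stand-in for the original routine in ROM (imagine it has a bug)."""
    return "ROM wrote " + char


def ram_video(char):
    """Stand-in for the repaired routine loaded into RAM."""
    return "RAM wrote " + char


# The table maps an interrupt number to whatever routine services it.
vector_table = {0x10: rom_video}


def interrupt(int_no, *args):
    """Transfer control through the table, as the INT instruction does."""
    return vector_table[int_no](*args)


before = interrupt(0x10, "A")

# The "patch program": repoint the entry. The callers are not changed at all.
vector_table[0x10] = ram_video
after = interrupt(0x10, "A")

print(before)
print(after)
```

The point of the model is that the caller names only the interrupt number, never the routine's address, so the routine can be swapped without touching any program that uses it.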
As it happens, all the fundamental capabilities of the PC are accessed through the interrupt system, including calls to IBM DOS and execution of Basic statements. So if you are trying to do something outside the scope of the programming language with which you are working, you need to be able to issue an interrupt, something that is not quite as simple as it sounds unless your language is assembly.
PSYSINT Is Born
What we need is a routine that can be called from Pascal to invoke an interrupt service routine. Pascal must provide the information which normally is placed in the CPU registers, and the routine must get that information into the proper registers. Since some interrupts will return information, the routine must provide a mechanism to pass data back to the caller.
Rather than write such a routine from scratch, I decided to adapt (with permission) the routine SYSINT from George Eberhardt's C86 Compiler for the PC. I have modified it primarily for clarity, although I have made it slightly less functional as well. The Pascal version of SYSINT, or PSYSINT, is shown in Listing 2.
Let's take a look at this and see just what it does. The first three instructions (lines 44 to 46) are required for the Pascal environment, and make some adjustments to the stack and frame pointers. The program then uses a "trick" to get the current instruction pointer location by calling a subroutine at the next location. This transfers control to the next location after pushing a return address (DUMMY) onto the stack. This address is immediately used to calculate the location of the INT instruction on line 58. The desired interrupt number is retrieved from the parameter list of the subroutine and inserted in the instruction.
I apologize for this technique. I should not write code like this, but it is considerably more obvious than George's perfect simulation of the INT instruction (see box). In general, self-modifying code is something to be avoided.
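The self-modifying step can be pictured with a small Python model. It is a sketch only: it relies on the fact that the 8088 encodes INT as the opcode byte CDH followed by one byte holding the interrupt number, and the two-byte array and the function name are invented stand-ins for the instruction inside PSYSINT.

```python
INT_OPCODE = 0xCD   # 8088 opcode for INT; the next byte is the interrupt number

# A two-byte "code" area standing in for the INT instruction inside the routine.
code = bytearray([INT_OPCODE, 0x00])


def patch_interrupt(code, int_no):
    """Overwrite the operand byte so the INT targets int_no."""
    if not 0 <= int_no <= 0xFF:
        raise ValueError("interrupt number must fit in one byte")
    code[1] = int_no


patch_interrupt(code, 0x10)
print(code.hex())   # the instruction bytes are now CD 10

patch_interrupt(code, 0x21)
print(code.hex())   # the same two bytes, now CD 21
```

The model also shows why the technique is fragile: there is only one copy of the instruction, so a second caller arriving between the patch and the INT would change the number out from under the first.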
Once the INT is set, the register values must be inserted into their proper registers. This is accomplished by the REGSIN routine, which PUSHes all the passed values onto the stack and then POPs them into their proper registers. Then the interrupt instruction is executed, and the requested service is obtained.
When the service routine is done, control is transferred back to the location following the INT instruction, at which point the process described above is reversed. The routine REGSOUT gets the values in each register and puts them in the record passed as an argument to PSYSINT.
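The whole sequence (registers in, interrupt, registers out) can be summarized in a Python model. This is an assumption-laden sketch rather than a translation of Listing 2: the parameter order, the dictionary-based register sets, interrupt number 60H, and the doubling "service" are all hypothetical.

```python
def psysint_model(int_no, sreg, rreg, services):
    """Model of the call: load registers, run the service, copy results back.

    sreg and rreg are dicts with keys AX, BX, CX, DX. The return value
    stands in for the processor flags.
    """
    cpu = dict(sreg)                 # REGSIN: caller's values go into the "registers"
    flags = services[int_no](cpu)    # the INT: the service may alter the registers
    rreg.update(cpu)                 # REGSOUT: register contents go back to the caller
    return flags


def fake_service(cpu):
    """Hypothetical service: doubles AX and reports no flags set."""
    cpu["AX"] = (cpu["AX"] * 2) & 0xFFFF
    return 0


sreg = {"AX": 0x0021, "BX": 0, "CX": 0, "DX": 0}
rreg = {"AX": 0, "BX": 0, "CX": 0, "DX": 0}
flags = psysint_model(0x60, sreg, rreg, {0x60: fake_service})

print(hex(rreg["AX"]), hex(sreg["AX"]), flags)
```

Keeping the outgoing and returned register sets separate means the caller's original values survive the call, at the cost of a second record.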
Great. Now we have this nifty routine, but how do we get at it from Pascal? That's what Listing 3 is all about. It contains three interface routines (CLS, LOCATE, and CHROUT) which demonstrate how to write Pascal for PSYSINT. There is also a short main program which tries the new features.
The first thing we must do is define a new record type for the registers. This version of PSYSINT supports only the four general purpose registers AX, BX, CX, and DX, but the type REGSET (for register set) also includes the low and high portions of each since many BIOS routines expect values in a half-register. You should notice a peculiar feature about this record: the low half of the register is declared before the high half. This is because the record must exactly match the memory configuration of the register set: the 8088 stores 16-bit values with their two 8-bit halves switched.
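The low-before-high ordering is easy to check with a small Python model of the record. This is a sketch, not the REGSET declaration from Listing 3: the class and method names are invented, and the record is modeled as eight raw bytes holding AX, BX, CX, and DX in that order.

```python
import struct


class RegSet:
    """Eight bytes laid out as AX, BX, CX, DX, each with its low byte first."""

    NAMES = ("AX", "BX", "CX", "DX")

    def __init__(self):
        self.raw = bytearray(8)

    def set_word(self, name, value):
        """Store a 16-bit value the way the 8088 does (little-endian)."""
        struct.pack_into("<H", self.raw, 2 * self.NAMES.index(name), value)

    def get_word(self, name):
        return struct.unpack_from("<H", self.raw, 2 * self.NAMES.index(name))[0]

    def set_half(self, name, value):
        """name is like 'AL' or 'AH': L is the first byte, H the second."""
        index = 2 * self.NAMES.index(name[0] + "X") + (1 if name[1] == "H" else 0)
        self.raw[index] = value


regs = RegSet()
regs.set_half("AH", 0x06)
regs.set_half("AL", 0x19)
print(hex(regs.get_word("AX")))   # the halves combine into one 16-bit value
print(regs.raw[:2].hex())         # in memory, the low byte comes first
```

If the two halves were declared in the other order, a value stored in AH would land in the low byte of AX, which is the kind of mistake that produces a wrong function code rather than an obvious crash.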
EXAMPLE then declares two register sets, SREG and RREG. We will use SREG to hold the register values needed to invoke a function, while RREG will hold any information returned from the routine. (A single register set would also serve if the original values were not needed after the call.) I also have described another WORD, called FLAGS. PSYSINT returns as its value all the processor flags, and we need a place to put them. Finally, PSYSINT is declared to the compiler, showing its three parameters and their types (again mentioning REGSET), the type of the return value, and the fact that the routine is external to this program, that is, not defined within this source file.
The three routines are self-explanatory if you have the IBM Technical Reference manual. All the calling sequences are documented in the ROM BIOS listing at the end of the book. A few comments are in order, however. First, CLS always assumes that the monochrome display in 80 X 25 mode is the display mode, and therefore the value 2 goes in AL. The right way to do this is to find out what mode is currently active and pass that value. The procedure would then be completely general. The mode is stored in absolute location 00449H (or segment 40H, offset 49H, if you prefer) so it can be obtained by CLS. The ADR OF, ADS OF, and pointer constructs can be used to accomplish this, avoiding further assembly language.
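The two ways of naming the mode byte's location are related by the 8088's address arithmetic: the segment is multiplied by 16 and the offset is added. A minimal Python check (the function name is invented for illustration):

```python
def absolute(segment, offset):
    """20-bit address: segment shifted left four bits, plus the offset."""
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(absolute(0x40, 0x49)))    # segment 40H, offset 49H
print(hex(absolute(0x00, 0x449)))   # the same byte, addressed from segment 0
```

Both calls yield 449H, so a pointer built from either segment and offset pair reaches the same byte.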
Second, the routine CHROUT always writes the character to the video display with the white-on-black attribute. There is a call to the BIOS to write the character without changing the existing attribute, but I have been having some trouble with that one, and I didn't have the time this month to deal with it. CHROUT could be modified to include a parameter for the attribute character.
Finally, I have been cavalier in my use of "magic numbers." We refer to constants like 200, 210, and 20 as magic numbers because there is nothing in the program to indicate what their values mean. In this case, the range 200 to 210 represents 11 graphics characters, chosen to show the purpose for CHROUT.
These examples should serve to demonstrate how PSYSINT can be used to build a library of system interface routines to meet almost any need. PSYSINT should serve you well, provided ultimate speed is not a requirement, as the routine does impose a small additional overhead upon BIOS, already burdened with considerable overhead itself. And one final note: PSYSINT is not suitable for multitasking or multiprogramming environments, because it is self-modifying and because it does not preserve the entire set of registers across the interrupt call. It must be modified to perform a complete simulation of the interrupt and to protect itself from reentrancy problems.
The Development Process
This is not the first time I have used the tools needed to build these programs. However, I got lost briefly in the development process (I didn't order the half-register pairs in the REGSET type properly at first) and I therefore spent considerably more time with the tools than I cared to. Here is a brief report.
I should first compliment IBM and Microsoft on the consistency of operation of all the language compilers, as well as the linker. These programs all behave in the same way, making it relatively easy to move back and forth between Pascal and the Macro Assembler, for instance. Once you have used one, you have used them all. The only difference is the various options which are specific to each language.
Pascal runs pretty well. Its problem is its size, which makes it necessary to switch disks to get at various pieces of the compiler. This gets old fast. I'll get some relief next week when my 320K disk drives arrive, because I'll be able to combine the two passes of the compiler on a single disk. That will simplify things considerably.
The Pascal manual is generally complete. The description of the procedure and function calling convention, the very thing I needed to understand to write PSYSINT, is terrible. There are a few paragraphs giving some information, and then there is a poorly contrived example.
There is a tremendous amount of information missing from this section, information which is actually scattered around the book. Worse, the index doesn't have entries that match my terminology for the information for which I was searching. The combination of the manual and some experiments finally taught me what I needed to know, but it took much too long.
The Macro Assembler is okay. I wish the defaults for page length and line width were rational, though. Why is it that so many programs don't allow for top and bottom margins, or a binding margin on the left? A line width of 80 is crazy considering how much space the assembler puts on the left side of each line. Well, it can be fixed as you can see.
The linker seems well-behaved enough. I had a documentation problem again, trying to relate the addresses on the map with the actual load addresses in memory. I was using the linker information in the Pascal manual for a while with no luck. I switched to the DOS manual, and finally found the one sentence that got me off the hook. The reason that map addresses don't match memory addresses has to do with the structure of a .EXE file, the type of executable file produced by LINK. .EXE files are relocated into memory when they are loaded, so the actual memory locations are not known at link time. Unfortunately, there is no tool to help with calculating the correct address from the offset given in the map: it is a manual process. The fact eluding me was the method of calculation.
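For readers facing the same arithmetic, here is one way the calculation is commonly done, sketched in Python. It may or may not match the sentence in the DOS manual mentioned above. It assumes the conventional DOS arrangement in which the program image is loaded immediately after the 256-byte (10H-paragraph) program segment prefix, so that every segment value in the map is raised by the segment at which the image starts. The function name and the sample numbers are invented.

```python
PSP_PARAGRAPHS = 0x10   # the program segment prefix occupies 256 bytes


def load_address(psp_segment, map_segment, map_offset):
    """Translate a segment:offset from the link map to its address in memory.

    Assumes the image is loaded directly above the program segment prefix.
    Offsets within a segment do not change; only the segment is relocated.
    """
    start_segment = psp_segment + PSP_PARAGRAPHS
    return (start_segment + map_segment) & 0xFFFF, map_offset


# Made-up numbers: a prefix at segment 04B5H and a map entry of 0012:0034.
segment, offset = load_address(0x04B5, 0x0012, 0x0034)
print("%04X:%04X" % (segment, offset))
```

Under these assumptions, the segment of the prefix is the value the debugger shows in DS and ES just after the program is loaded, so the whole translation needs only that one number and the map.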
The final tool, and by far the worst, is DEBUG, the IBM DOS debugger. The program is no help at all for any serious work. The most glaring deficiency is the lack of an assembler so that program changes can be made with source language statements. DEBUG does not handle expressions and does not display data in number systems other than hex. It is not a symbolic debugger, so it does not know anything by name. For example, it is not possible to set a breakpoint at PSYSINT: the absolute address of PSYSINT must be known. Breakpoints are not handled well at all, to the extent that if none are encountered and the program terminates, the breakpoints are left in place. If the debugger knows the program terminated normally, why can't it remove breakpoints? It certainly must know where they are. There are other problems as well.
In summary, the documentation, although complete, seriously complicated my work and caused me to spend much more time than I usually allot for a column. Otherwise the task was straightforward. A serious debugger is much needed.
BIOS Overhead and Video Displays
I mentioned above that the BIOS has a fair amount of overhead. This is both good news and bad news. Let me take a moment to explain my position.
BIOS is written quite conservatively. It appears that little is left to chance, the authors preferring to err on the side of caution. For example, most routines preserve the caller's copy of many registers, more than necessary in some cases. However, doing so isolates a routine and significantly reduces the chance of an error inside one routine doing damage outside its scope. For a piece of software intended for permanency in ROM at the end of a 13-month development project, such conservatism is admirable. Although problems have been found over the last year or so, there has been nothing so severe as to prevent an application from successful implementation. IBM deserves credit for this, especially because of the volume and complexity of code in the BIOS.
The bad news is that the conservatism does tend to create execution overhead, no place more evident than in the video output section. The problem is so severe that use of the BIOS routines for display output cannot keep up with a communications line operating at 9600 bps. Actually, the problem is more a design mistake, forcing the use of routines not well suited for the purpose.
A case in point: the video section provides routines to output a single character, or a single character multiple times. It does not provide a routine to output a string of different characters, obviously a frequent activity. To make this happen, a program must call the BIOS once for each character in the string.
The video section is very general. It is possible to call for video output while in any display mode; the program makes all the necessary determinations and acts accordingly. But at what price? Frankly, it is expensive. Sure, all the decisions that have to be made happen in a flash, but how many flashes make up a second? Worse, the display adapter cards behave in a peculiar way that requires the software to wait for certain instants before inserting the character into the memory of the display adapter. Once the instant arrives, there is time to insert several characters, but by BIOS design there is never more than one.
The net result of all this is some surprising slowness from the video section of the BIOS. It is possible to see the difference by ignoring the BIOS and just writing the characters to the display adapter memory. Programs written like that are visually fast, and it is possible to make display pages "snap" in and out. This technique happens to work on the monochrome display, but it creates snow on the display device when used with the color/graphics adapter.
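Writing directly to the adapter amounts to simple arithmetic, sketched here in Python. The model assumes the 80 X 25 text layout in which each screen position takes two bytes, the character followed by its attribute, with 07H as the normal white-on-black attribute. The byte array stands in for the adapter's memory (which begins at segment B000H on the monochrome adapter), and the function names are invented.

```python
COLUMNS, ROWS = 80, 25
NORMAL = 0x07                            # white-on-black attribute
video = bytearray(COLUMNS * ROWS * 2)    # stand-in for the adapter's memory


def cell_offset(row, col):
    """Byte offset of the character cell at (row, col), both counted from 0."""
    return (row * COLUMNS + col) * 2


def poke_string(video, row, col, text, attribute=NORMAL):
    """Store a whole string in one pass: character byte, then attribute byte."""
    offset = cell_offset(row, col)
    for ch in text:
        video[offset] = ord(ch)
        video[offset + 1] = attribute
        offset += 2
    return len(text)


poke_string(video, 1, 0, "HELLO")
print(cell_offset(1, 0))          # the second row starts 160 bytes in
print(video[160:170].hex())       # five character and attribute pairs
```

The sketch does no bounds checking and ignores the retrace timing discussed above, which is the part that matters on the color/graphics adapter. It shows only why a whole string can be placed with one loop instead of one BIOS call per character.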
Is there a solution? For the ambitious software developer, yes. Because the routines are all accessed through interrupts (and you thought this part of the column didn't have anything to do with the main theme, eh?), it is possible to replace the video section with code of your own choosing. It is a lot of work, but it is possible. I think this is another of the PC's strong suits.
Table: Listing 1.
Table: Listing 2.
Table: Listing 3.