Of Diagnostics & Debugging
BY JIM DUNION
A detailed look at the art of debugging the 68000, plus a dream list of what the best-dressed 68000 debugger may be wearing this season.
Working in the Department of Neurosurgery at Emory University and Grady Memorial Hospital in Atlanta showed me the similarity between the medical diagnostic process and the process of debugging a computer program. I was struck by the importance physicians place on information they see with their own eyes, and indeed, many of the recent advances in medical diagnosis come from instruments that provide images of our internal bodily processes.
Similarly, it makes sense to improve debugging tools by improving our ability to "see" what's going on inside our programs at any instant. One reason debugging is so difficult is that bugs usually involve some assumption that proves to be untrue. We intend to do one thing, but in reality we do something else. This mindset makes it hard to locate bugs, because we're so sure that the bug couldn't possibly be in a certain part of the code. One way of countering this in debugging is to put as much information as possible on the screen and let our visual system recognize when things don't look right. You've probably noticed how much easier it is to recognize the right answer on a multiple choice test than it is to generate the right answer in a fill-in-the-blank test.
BUG, BUG, WHERE'S THE BUG?
You're writing a program, and when you test it, it doesn't work. What now? At the risk of seeming simplistic, there are two things to do: Determine the bug's symptoms, and discover where the bug occurs.
As a general rule, fixing a software bug is usually easy; finding it is the problem. Some of the basic tools we use to find it are:
- A breakpoint mechanism: A way of setting a point in memory so that, if the program tries to execute that particular instruction, control transfers to the debugging system.
- A register display routine: To figure out what's going wrong in a program, the first step is to examine the processor registers.
- A memory display routine: Next in importance to the processor state is the current state of memory. We obviously need an easy way to examine exactly what's out in memory.
- A single-step mechanism: We want to make the processor execute a single instruction at a time, returning control to us after each instruction. Then we can "watch" precisely what is happening in the program, and see when it goes wrong.
- A trace mechanism: An elaboration of the single-step mechanism. Here we want to run the program in an interpretive mode, automatically making certain tests, recording machine states, and so on.
- Register-Set Architecture.
- Addressing Modes.
- Instruction Set.
- Special Event Handling.
The first thing we notice in looking at the 68000's register-set architecture is how many registers there are: seventeen 32-bit registers in addition to a 32-bit program counter and a 16-bit status register. The data registers D0-D7 are the true general purpose registers of the 68000. The next group, address registers A0-A6, is used mainly for address operations. There are some restrictions governing which instructions work with the address registers. Also, some operations on these registers don't affect status register bits. All in all, even though the address registers seem at first to be just like the data registers, they are, in fact, quite different.
The status register is divided into two bytes, the system byte and the user byte. Together, these two bytes contain a total of 10 bits of status information. Of these 10 bits, two are especially meaningful for debugging purposes: the S bit and the T bit. The S bit determines which of two possible states the 68000 is in, either Supervisor or User. Several privileged instructions can be executed in Supervisor mode, but they will generate an error in User mode. The S bit also determines which of two possible stack pointers will be used.
The T bit, or Trace bit, is very important to debugging. When this bit is on, an exception (i.e., internal interrupt) is generated after each instruction. In effect, this is a built-in, single-step mechanism.
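To make the status register layout concrete, here is a minimal sketch of how a register display routine might unpack it. It is modeled in Python purely for illustration; the function name decode_sr is hypothetical and not taken from any debugger. The bit positions follow the 68000 status register layout.

```python
# Unpack a 16-bit 68000 status register value.
# System byte = bits 15-8, user byte = bits 7-0.
# decode_sr is an illustrative name, not part of any real tool.
def decode_sr(sr):
    return {
        "T": (sr >> 15) & 1,       # trace mode
        "S": (sr >> 13) & 1,       # supervisor state
        "IPL": (sr >> 8) & 0b111,  # interrupt mask bits I2-I0
        "X": (sr >> 4) & 1,        # extend
        "N": (sr >> 3) & 1,        # negative
        "Z": (sr >> 2) & 1,        # zero
        "V": (sr >> 1) & 1,        # overflow
        "C": sr & 1,               # carry
    }

flags = decode_sr(0x2704)
print(flags["S"], flags["IPL"], flags["Z"])
```

Counting the fields gives the 10 bits mentioned above: T, S, three interrupt mask bits, and the five condition codes.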
The 68000 has 14 available addressing modes, most of which can be used with any instruction. The main classes of addressing modes include:
- Register Direct Addressing.
- Absolute Data Addressing.
- Program Counter Relative Addressing.
- Register Indirect Addressing.
- Immediate Data Addressing.
- Implied Addressing.
One feature of the 68000's addressing capability is the ability to write position independent code, i.e., object code that can be moved around and still execute without change. The 68000 allows this by providing an addressing mode that is relative to the program counter. Several other processors allow relative branches, but in the 68000 even jumps and subroutine calls can be made relative to the program counter.
This is not to say that there aren't some quirks in 68000 addressing. The 68000 designers also put in some features to encourage reentrant code. For instance, you can read data relative to the program counter, but you can't alter data. Obviously, this is to protect the program from overwriting itself. Self-modifying code is considered outmoded these days. Data can be referenced, relative to a base address set up in a register, to encourage separation of program data and code. Though most instructions work with most addressing modes, there are exceptions.
The 68000 instruction set has only 56 instructions. But the ability of these instructions to work in many of the addressing modes and on several different data sizes makes them seem more numerous. For instance, the CLR instruction (which is a quick way of setting a location to 0) can be written several ways:
CLR    (A3)    Clears the word pointed to by A3
CLR.B  (A3)    Clears the byte pointed to by A3
CLR.W  (A3)    Clears the word pointed to by A3
CLR.L  (A3)    Clears the long word pointed to by A3
Care must be taken in sizing the data field on which you wish to operate, especially when the destination operand is a data register. If you are operating on less than the full-sized register, the remaining portion is unaffected. In practical terms, this means the register's upper bits might not contain what you think. There is a little bit of a "gotcha" here, in that byte operations work on the bottom 8 bits of a register, but the byte at a given memory address is the top 8 bits of the word stored at that address.
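The "remaining portion is unaffected" rule is easy to model. The sketch below, in Python for illustration only, mimics a sized write to a data register; the names SIZE_MASK and write_data_register are my own, and the model deliberately covers data registers only.

```python
# Model of a sized write to a 32-bit data register.
# Only the low 8, 16 or 32 bits change; the rest are preserved.
SIZE_MASK = {"B": 0xFF, "W": 0xFFFF, "L": 0xFFFFFFFF}

def write_data_register(old, value, size):
    mask = SIZE_MASK[size]
    kept = old & ~mask & 0xFFFFFFFF   # upper bits survive the write
    return kept | (value & mask)

d0 = 0x12345678
d0 = write_data_register(d0, 0, "B")  # behaves like CLR.B D0
print(hex(d0))  # 0x12345600
```

A debugger that displays all 32 bits of every data register makes this kind of leftover upper-word value easy to spot.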
Another feature, sign extension, comes from having instructions that work on variable size data. In the CMPA instruction (CoMPare Address, works only with word and long sizes), if you choose the word size, the source operand is sign-extended from bit 15 to the full 32 bits and then the comparison is made. This can lead to some non-intuitive situations. For instance, a word-sized CMPA that compares the value $FFFF with A1 would set the Z condition code if A1 contained -1 ($FFFFFFFF), and would not set Z if A1 contained $FFFF. It is always a good idea to use long versions of instructions when placing addresses into address registers.
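The word-sized comparison against $FFFF can be checked with a few lines of arithmetic. This is a sketch in Python, not 68000 code, and the helper names sign_extend16 and cmpa_w_sets_z are hypothetical.

```python
# Model of the Z flag after a word-sized CMPA.
def sign_extend16(value):
    value &= 0xFFFF
    # Copy bit 15 into bits 16-31
    return (value | 0xFFFF0000) if (value & 0x8000) else value

def cmpa_w_sets_z(source_word, address_register):
    # The source is sign-extended, then compared with all 32 bits
    return sign_extend16(source_word) == (address_register & 0xFFFFFFFF)

print(cmpa_w_sets_z(0xFFFF, 0xFFFFFFFF))  # True
print(cmpa_w_sets_z(0xFFFF, 0x0000FFFF))  # False
```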
Several instructions are specifically included for high-level language support. Some of these can aid the debugging process. When taking a "snapshot" of the processor state during a program execution, there is always the question of which registers to save before the snapshot subroutine executes. The MOVEM instruction copies a specified set of registers to memory or back again. This instruction is very flexible, and able to save or restore any arbitrary group of registers. The machine language instruction actually contains a 16-bit register mask where each bit set to "1" indicates that the corresponding register should be saved or restored. Be careful when using this instruction to move 16-bit words from memory to address registers; they will be sign extended to 32 bits.
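If you ever have to read a MOVEM mask in a disassembly or a memory dump, it helps to know how the bits map to registers. The sketch below, in Python for illustration, builds the mask from a list of register names; movem_mask is a hypothetical helper. It assumes the usual layout, in which bit 0 stands for D0 and bit 15 for A7, with the order reversed when the destination uses predecrement addressing.

```python
# Build the 16-bit register mask carried by a MOVEM instruction.
REGISTERS = [f"D{i}" for i in range(8)] + [f"A{i}" for i in range(8)]

def movem_mask(names, predecrement=False):
    mask = 0
    for name in names:
        bit = REGISTERS.index(name)   # D0=0 ... D7=7, A0=8 ... A7=15
        if predecrement:
            bit = 15 - bit            # order is reversed for -(An)
        mask |= 1 << bit
    return mask

print(hex(movem_mask(["D0", "D1", "A0"])))                     # 0x103
print(hex(movem_mask(["D0", "D1", "A0"], predecrement=True)))  # 0xc080
```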
The DBcc (Decrement and Branch on condition) is a useful loop control command. Remember that the condition specified is the one that makes the program exit the loop, rather than stay in the loop. If the condition is not met, the counter will be decremented and tested for -1. If your loops are off by 1, this would be a good place to check.
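The off-by-one trap is easier to see in a model. Here is a sketch in Python that counts how many times a loop body runs when the loop is closed by DBcc; the function name dbcc_loop_count and the condition callback are assumptions of mine, made for illustration.

```python
# Count the executions of a loop body closed by a DBcc instruction.
# condition(iteration) models "cc"; the default (never true) acts
# like DBF, which many assemblers also accept as DBRA.
def dbcc_loop_count(initial_count, condition=lambda iteration: False):
    counter = initial_count & 0xFFFF
    iterations = 0
    while True:
        iterations += 1                   # the loop body executes
        if condition(iterations):         # cc true: leave the loop
            break
        counter = (counter - 1) & 0xFFFF  # decrement the low word
        if counter == 0xFFFF:             # counter reached -1: leave
            break
    return iterations

print(dbcc_loop_count(3))  # 4
```

A counter loaded with 3 runs the body four times, so to loop N times you load N-1.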
An instruction is provided for multiprocess communication. In time-sharing situations, there is a classic problem known as the "deadly embrace," which can occur when two processes that are interrupt-driven both try to exert control at the same time. If a section of code first tries to read a status value, then takes control based upon the value's state, there could be trouble. What happens, for instance, if an interrupt occurs after the reading of the value, but before it is set to a new value? It depends. To help avoid such pitfalls, the TAS (Test and Set) instruction allows you to read, test and set a value all in one instruction. This allows the code to set a semaphore.
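Here is a rough model of what TAS does to a semaphore byte, written in Python for illustration. The function names tas and try_acquire are hypothetical. Python cannot reproduce the key property, which is that the processor performs the read and the write as one indivisible bus operation; the sketch only shows the logic.

```python
# Model of TAS on a byte in memory.
def tas(memory, address):
    old = memory[address]
    n = bool(old & 0x80)          # N reflects the old high bit
    z = old == 0                  # Z is set if the old byte was zero
    memory[address] = old | 0x80  # the high bit is always set afterward
    return n, z

def try_acquire(memory, address):
    n, _ = tas(memory, address)
    return not n                  # N clear means the semaphore was free

semaphore = bytearray(1)
print(try_acquire(semaphore, 0))  # True
print(try_acquire(semaphore, 0))  # False
```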
If you've come to the 68000 from a less powerful processor, there will be numerous pleasant surprises for you. Examples are the bit-testing and setting instructions (BCHG, BCLR and BTST), the multiply and divide instructions, the exchange register instruction, some of the conditional branch instructions, and the data movement instructions.
The 68000 contains conditional branching instructions that test individual status register bits. There are also conditional branch instructions that would take several instructions on other processors, including BLT, BGE, BLS, BLE and BGT. The most complicated of these are "Branch if Less than or Equal" (BLE) and "Branch if Greater Than" (BGT). BLE will jump if the Z bit is set, or if exactly one of the N and V bits is set (N set and V clear, or N clear and V set). BGT is the exact opposite.
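The signed branch conditions can be checked against ordinary arithmetic. The sketch below is in Python for illustration, and the names cmp_flags, ble and bgt are mine. It computes N, Z and V the way a CMP on 8-bit values would, then applies the branch tests.

```python
# Flags after "CMP src,dst", which computes dst - src.
def cmp_flags(src, dst, bits=8):
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (dst - src) & mask
    n = bool(result & sign)
    z = result == 0
    # Overflow: the operands differ in sign and the result's sign
    # differs from the destination's sign
    v = bool((src ^ dst) & (result ^ dst) & sign)
    return n, z, v

def ble(n, z, v):
    return z or (n != v)      # Z set, or exactly one of N and V set

def bgt(n, z, v):
    return not ble(n, z, v)

# dst = -128 ($80), src = 1: N is clear but V is set, so BLE branches
print(ble(*cmp_flags(0x01, 0x80)))  # True
```

The last line is the case that trips people up: the result looks positive, but the overflow bit tells the branch that the true answer was negative.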
The Bit TeST instruction, BTST, is a weird little instruction used for testing if a specified bit is set. The weird thing is that the instruction's action depends upon whether the destination is a memory location or a data register. If the destination is a memory location, the low order bit is specified as bit 0, and the high order bit as bit 7. Numbers larger than 7 are taken modulo 8. Memory is addressed, then, by bytes. If, however, a data register is the destination, then bit numbers range from 0 to 31, allowing all the register's bits to be tested. If the number is larger than 31, it is taken modulo 32.
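The two behaviors of BTST come down to a different modulus on the bit number. This Python sketch is for illustration only, with hypothetical function names. Each function returns the state of the Z flag, which BTST sets when the tested bit is zero.

```python
# Model of the Z flag after BTST for each kind of destination.
def btst_memory_z(byte_value, bit_number):
    bit = bit_number % 8      # memory operands are one byte wide
    return ((byte_value >> bit) & 1) == 0

def btst_register_z(register_value, bit_number):
    bit = bit_number % 32     # data registers are 32 bits wide
    return ((register_value >> bit) & 1) == 0

print(btst_memory_z(0x01, 8))          # False: bit 8 wraps to bit 0
print(btst_register_z(0x00000001, 8))  # True: bit 8 really is bit 8
```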
Take care with the other bit-oriented instructions also. The 68000 uses memory-mapped I/O, so it is tempting to use an instruction like BSET to set a bit in a peripheral status register. What really happens in a BSET instruction, however, is a read-alter-write sequence. Some peripherals are set up to become active whenever their address appears on the address bus. Thus, some subtle bugs can occur when you try to activate individual bits. A better approach is to use a MOVE instruction to set the register all at once.
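To see how a read-alter-write can go wrong, consider a made-up peripheral in which reads of an address return a status value while writes to the same address load a control value. The Peripheral class below is entirely hypothetical and is modeled in Python for illustration; real devices differ in their details.

```python
# Hypothetical device: one address, reads give status, writes set control.
class Peripheral:
    def __init__(self):
        self.status = 0b10000001
        self.control = 0b00000100
        self.reads = 0

    def read(self):
        self.reads += 1           # the device sees every access
        return self.status

    def write(self, value):
        self.control = value & 0xFF

def bset(device, bit):
    # Read-alter-write, as BSET does: the value read is the status
    device.write(device.read() | (1 << bit))

def move(device, value):
    device.write(value)           # a single write, with no read

a = Peripheral()
bset(a, 1)
print(bin(a.control), a.reads)    # 0b10000011 1

b = Peripheral()
shadow = 0b00000100               # the program's own copy of the control value
move(b, shadow | (1 << 1))
print(bin(b.control), b.reads)    # 0b110 0
```

In the first case the control register ends up holding status bits, and the device has been read once behind your back. Keeping a shadow copy in RAM and writing it with MOVE avoids both problems.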
SPECIAL EVENT HANDLING
In the 68000, interrupts and other special events are known as exceptions. Exceptions caused by external events are known as interrupts; those caused by internal events are traps. The 68000 designers included several features to aid detection of program bugs. Specific hardware traps detect the following conditions:
- Word Access with an Odd Address.
- Illegal Instructions.
- Unimplemented Instructions.
- Illegal Memory Access (Bus Error).
- Divide by Zero.
- Overflow Condition Code (Separate Instruction TRAPV).
- Register Out of Bounds (CHK Instruction).
- Spurious Interrupt.
Finally, the CHecK register against bounds (CHK) instruction checks array bounds by verifying that a data register contains a valid subscript. A trap occurs if the register contents are negative or greater than a limit.
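The bounds test that CHK performs can be sketched in a few lines. This is Python for illustration; the exception class ChkTrap and the function names are hypothetical stand-ins for the processor's trap mechanism. The model assumes the word-sized, signed comparison used by the 68000.

```python
# Model of the CHK bounds test on the low word of a data register.
class ChkTrap(Exception):
    """Stands in for the CHK exception."""

def to_signed16(value):
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def chk(upper_bound, register_value):
    index = to_signed16(register_value)
    limit = to_signed16(upper_bound)
    if index < 0 or index > limit:
        raise ChkTrap(f"index {index} is outside 0..{limit}")
    return index

print(chk(9, 9))        # 9: the limit itself is a legal subscript
try:
    chk(9, 0xFFFF)      # $FFFF is -1 as a signed word
except ChkTrap as trap:
    print("trap:", trap)
```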
BUILDING A BETTER DEBUGGER
Having examined 68000 features that can affect the debugging process, we return to the problem of using Atari ST features to aid in debugging. Color graphics and animation come to mind immediately. One example is to switch screens between the normal program display screen and a special debugger screen, which works because the ST has registers that determine where the video RAM (i.e., what the screen displays) is located. Visuals can be done in monochrome or color. I'm partial to color, and sacrifice resolution in order to use color to signify special events (e.g., a data value changing). The real challenge is to find ways to use the machine's graphic capability to display what the processor is doing. Numerous hardware instruments already do this, including logic analyzers, signature analyzers, and performance analyzers.
Let's look at what we might call a program execution space display. Suppose we create a graphic display where a vertical line represents the memory space available to your system. Test the program in "trace" mode and it could display a colored pixel along the memory space line to indicate where the program counter is set. These pixels may even be tracked horizontally, creating a histogram of how many times a particular instruction has been executed. Just by watching such a display, we could get an intuitive "feel" for where the program is spending its time.
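The bookkeeping behind such a display is simple. The sketch below, in Python for illustration, buckets a list of program counter samples into screen rows and turns the counts into bar lengths. The function names and the sample trace are invented for the example; a real version would collect the samples from a trace exception handler and plot pixels rather than text.

```python
from collections import Counter

# Bucket program counter samples into `rows` screen lines.
def pc_histogram(pc_trace, memory_size, rows):
    bucket_size = -(-memory_size // rows)   # ceiling division
    counts = Counter(pc // bucket_size for pc in pc_trace)
    return [counts.get(row, 0) for row in range(rows)]

# Scale the counts to bar lengths for a display `width` units wide.
def render(histogram, width=40):
    peak = max(histogram) or 1
    return ["#" * (count * width // peak) for count in histogram]

trace = [0x000, 0x004, 0x008, 0x100, 0x104, 0x3FC]   # made-up samples
histogram = pc_histogram(trace, memory_size=0x400, rows=4)
print(histogram)  # [3, 2, 0, 1]
for line in render(histogram, width=12):
    print(line)
```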
Similarly, a series of icons could be created that stand for some of the routines the program might execute. The trace program could highlight the icon of each routine as the routine is entered, thereby displaying the program's rough flow. This might not tell us the details of what went wrong, but it could provide some clues of where we should look more closely.
PROCESSOR AND MEMORY
Once we have built a better display mechanism we still need ways of setting up a particular machine state and controlling the processor. Every debugger available has instructions for setting processor registers to specific values, and for depositing values to memory. However, some are easier than others. A powerful addition in this area is to allow symbolic references to variables from the debugging tool. Typically this means that the debugger must have access to the symbol table used by the assembler or linker or have a provision for defining symbols interactively.
A second way of improving our processor and state control is to allow conditional breakpoints. Simple, unconditional breakpoints are helpful. But ones with which you can say, for example, "Break if an instruction tries to write to location TEST" are much better. Other types of conditional breakpoints are ones that break on specific instructions, program branches to specific ranges, and data accesses within specific ranges.
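One way to think about conditional breakpoints is as a list of tests applied to every event the trace mechanism reports. The sketch below is in Python for illustration. The event records, their field names, and the address used for TEST are all assumptions made up for the example, not the format of any real debugger.

```python
# Each condition is a function that inspects one trace event.
def break_on_write_to(address):
    return lambda event: event["kind"] == "write" and event["address"] == address

def break_on_pc_in(low, high):
    return lambda event: low <= event["pc"] < high

def first_break(events, conditions):
    # Return the index of the first event that satisfies any condition
    for index, event in enumerate(events):
        if any(condition(event) for condition in conditions):
            return index
    return None

TEST = 0x2000   # hypothetical address of the symbol TEST
events = [
    {"pc": 0x1000, "kind": "read",  "address": 0x2000},
    {"pc": 0x1004, "kind": "write", "address": 0x3000},
    {"pc": 0x1008, "kind": "write", "address": 0x2000},
]
print(first_break(events, [break_on_write_to(TEST)]))  # 2
```

The price is speed: the tests run after every instruction, which is why this kind of breakpoint goes hand in hand with trace mode.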
A speed control mechanism for single-step mode is also quite useful. Sometimes we just want to watch the overall flow of the program, while at other times we want to watch the details of specific instructions as they are executed.
We should also take advantage of the function keys and Mr. Mouse. These controls can trigger special kinds of debugger displays or processor control. The special function keys can be used for:
- Returning to the program under test.
- Single stepping the program.
- Switching screens between the program and debugger.
- Activating special data displays.
- Scrolling a symbol table display.
- Changing the representation of memory window displays.
- Saving the current state to a disk file.
- Activating trace mode.
- Turning on a buffer mechanism to store instructions that have been executed.
ONE STEP BEYOND
All of the features discussed so far can be found on existing debuggers if you look hard enough (though no one debugger has all of them). So what's the next step? Use your imagination. Are there other ways to visually represent the processor's activity? You bet there are. And I haven't even mentioned the possibilities of adding sound. The 68000 is a very powerful processor and the Atari ST combines this processor with excellent graphic capability.
We've just begun to tap the possibilities of this combination in all areas of programming, including debugging. It still takes too long to go from an idea to a working program. Too much time is spent exorcising bugs. Don't you think it's time our programming tools were a little more sophisticated? It wouldn't surprise me a bit to see a whole new class of debugging programs emerge soon for the Atari ST. And not a minute too soon.
(Editor's note: Jim Dunion is currently putting the finishing touches on STDDT, his debugger for the Atari ST. It will be interesting to see if Jim can fit all the features mentioned in this article into STDDT.)
ASCII Chart - A table showing each keycode's ASCII value. Some debuggers provide this as a convenience for low-level I/O debugging.
Assembler - A program that translates mnemonics meaningful to a programmer to executable object code.
In-line Assembler - A provision allowing a debugger user to deposit values in memory by using mnemonics without having to return to the assembler. Sometimes called an immediate assembler.
Backtracking - The ability to "go backwards" in time and undo the effects of preceding instructions. Requires a buffer to store the preceding instructions and some processor or memory state information.
Breakpoint - A point where execution of the tested program stops and control returns to the debugger.
Conditional Breakpoints - The ability to specify a set of conditions that determine whether a breakpoint will be triggered.
Sticky Breakpoints - Breakpoints that remain in place even after they have been triggered. They must be explicitly cleared.
Calculator - A provision to do some degree of arithmetic or expression evaluation directly in the debugger (e.g. calculating an effective address).
Decimal Arithmetic - Arithmetic capabilities in base 10.
Hex Arithmetic - Hexadecimal (base 16) capability.
Compare Capability - The ability to compare a given memory range to another memory range. Done properly, such a feature will show at what point the comparison fails, if it does.
Communication Ports - The ability to use an external communication port for communicating with the debugger. This is usually important for debugging applications that run on systems where there is limited ability to control the screen memory.
Debugger Isolation - Provides some degree of isolation between the debugger and the system under test. This is usually seen in systems with additional memory cards or In-Circuit Emulators.
Disassembler (Unassembler) - A program that converts object code back into assembly language mnemonics. Most debuggers have a provision for displaying a portion of memory in disassembly format.
Fill Memory - The ability to set a range of memory to a specified value or pattern.
Firewalling - A preventative debugging technique where modules are isolated from each other and interact only by affecting a group of variables.
Hardware Assisted Debugger - A debugging system that is provided with some additional hardware. Examples might be extra RAM, a hardware switch that generates an interrupt, or an In-Circuit Emulator system.
Help Screens - Display screens internal to the debugger that explain its functions.
In-Circuit Emulators (ICEs) - A hardware assisted debugger that is in fact a complete external computer. Usually includes a cable and integrated-circuit plug-in device that replaces the target processor. In effect, the external computer system "emulates" the microprocessor under test.
Interrupts - Various conditions, or exceptions (as they are called on the 68000), that cause a hardware interrupt to be generated. These cause jumps through a vector table to interrupt processing code.
Linkers - Programs that load object code modules and resolve external symbol references.
Logic Analyzers - An instrument that monitors the busses of microprocessor systems, as well as allowing probes of other locations inside the system. A display much like an oscilloscope is created, showing the logic state over time for the line or point being monitored.
Map Files - Intermediate files produced by some assemblers that detail local symbols, local routines, external symbols and routines that are referenced, etc.
Move Memory - The ability to move a block of memory from one location to another.
Non-Maskable Interrupt (NMI) - An interrupt that can't be ignored by the processor. In the 68000, an interrupt request at priority level 7 cannot be masked.
NMI Switch - A "breakout" type of switch that is wired to generate a Non-Maskable (Highest Priority) interrupt. This is usually intended to return control to the debugger after the program under test has bombed.
Overlays - Additional portions of executable code that are brought in and "overlay" the code currently in memory. This is one technique for creating programs that are bigger than the physical memory size.
Patching - The ability to make local temporary changes to object code for quick testing.
Code Insertion - The ability to patch a section of new code without affecting the existing code.
Performance Analyzer - A device or program that captures processor execution information. Typically used to determine where the processor is spending its time, how many times particular routines are called, or specific timing information about a code fragment.
Protected RAM - RAM that is provided with some debuggers where the debugging system can maintain information that is protected from the program under test.
Reduced Speed Execution - Ability to run the program under test in a reduced speed interpretive mode where the action of the processor is slowed down so the user can roughly follow what is happening.
Screen Toggle - Ability to switch back and forth between a user-program screen display and the debug display.
Search Capability - The ability to search through memory for a given value or pattern. Usually updates a display to show the next memory range where a match is found.
Signature Analyzers - A testing device that creates a visual display representing what the processor is doing. Particular programs turn out to have repeatable, easily recognizable patterns.
Single Step - Ability to cause the processor to execute a single instruction and then return control to the debugger.
Single Step Past Calls - Ability to place a breakpoint beyond a subroutine call so that the processor executes a subroutine and then returns to the debugger after returning from the subroutine.
Sleeping Debugging Instructions - Diagnostic instructions or routines that are normally inactive, and which "wake up" and execute when a predefined abnormal condition occurs.
Source Level Debuggers - Debuggers that have some provision for reading source level code files and correlating those with the object code currently being debugged. A display is usually provided that shows the source language statement that contributed the instructions currently being executed. More sophisticated source level debuggers allow for breakpoints to be set at the source level.
Snap Shots - A static representation of the processor and memory state at any instant. Gives the debugger user a picture of what's going on in the processor at that instant.
Symbols - A sequence of characters that stands for either a memory location or a data value. The ability to use symbols makes debugging much easier.
Public Symbols - Symbols that are defined in general function libraries and are available to all program users.
Symbolic Debugging - The ability of the debugger to refer to a symbol table to make disassembly formats closer to assembly or other source languages.
Trace - The ability to monitor the processor and/or memory state after each instruction is executed. This is used for single stepping, conditional breakpoints, creating log files, and backtracking.
Watchpoints - Conditional types of breakpoints where certain conditions are monitored, and if they are satisfied, then control returns to the debugger.
Windows - A portion of a screen display (usually rectangular) that can be set up to monitor particular memory ranges, symbols, or other types of data structures.
Wolf Fence Method - A debugging technique where a bug is located by successively fencing it in smaller and smaller areas of code.
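The Wolf Fence Method is, in effect, a binary search for the bug. The sketch below, in Python for illustration, narrows down the first step at which something has gone wrong. The function name wolf_fence and the is_bad test are hypothetical; the sketch assumes that once the program state is bad, it stays bad at every later step.

```python
# Find the earliest step at which is_bad(step) is true, given one
# step known to be good and a later step known to be bad.
def wolf_fence(known_good, known_bad, is_bad):
    low, high = known_good, known_bad
    probes = 0
    while high - low > 1:
        middle = (low + high) // 2
        probes += 1
        if is_bad(middle):
            high = middle      # the wolf is in the lower half
        else:
            low = middle       # the wolf is in the upper half
    return high, probes

# Suppose a variable is corrupted at step 37 of a 100-step run
step, probes = wolf_fence(0, 100, lambda s: s >= 37)
print(step, probes)
```

In a debugger, each probe corresponds to setting a breakpoint halfway through the suspect region and checking whether the damage has already been done.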