Machine Language: Scanning the Stack
Jim Butterfield, Toronto
The stack is a group of locations from hexadecimal 0100 to 01FF that can be quickly and conveniently used by the 6502. In the PET, the stack range is limited to the area from 0140 to 01FA; but most of the time you don't need to know where the stack is working: you may just use it.
When you have something that needs keeping for a few moments, you can put it on the stack and call it back later. So long as you're neat, you don't even need to know where it will go – the processor keeps track of that with a special register called a Stack Pointer. It will put information away to the stack and bring it back later without any special information from you.
But you must be neat. The slogan "Leave these premises as clean as you found them" applies critically to the way you use the stack. If you put something in there, you must be sure to call it back or you'll be in trouble.
The stack is appropriately named. It's like a stack of dishes: the first thing that comes off will be the last thing that was put on. It's called LIFO (Last-in-first-out) storage.
If you have a value in the A register that you want to put aside for a moment or so, you can push it to the stack using the PHA (48) instruction. Now you can use the A register for something else, and when you're finished you can call back the original value by pulling it from the stack with PLA (68).
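Here is a minimal sketch of that save-and-restore pattern, written in assembler notation. The value $FF and the work location $0380 are arbitrary placeholders chosen for illustration, not addresses from any particular program.

```asm
        PHA             ; put the current A value aside on the stack
        LDA #$FF        ; use A for something else (placeholder value)
        STA $0380       ; store it in a hypothetical work location
        PLA             ; original A is back; the stack is as we found it
```

Keep in mind that PLA sets the N and Z flags according to the value it pulls, so any flags you were counting on before the PLA will have changed.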
Sometimes you might want to defer a decision. You've just done a comparison or some other activity, and the results are important – but you don't want to act on those results just yet. You can push the status word – all the various flags, such as Carry, Overflow, etc. – to the stack with PHP (08). Now you can tidy up your registers without worrying about losing those flags. They will come back as soon as you give PLP (28) and you can then proceed with the Branch commands that will test the condition you previously set up.
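A short sketch of a deferred decision might look like the following. The label FOUND and the idea of counting the match in X are hypothetical, added only to make the fragment complete.

```asm
        CMP #$41        ; is the character in A the letter "A"? Z is set if so
        PHP             ; put the flags aside
        LDX #$00        ; tidy up; this load changes Z and N...
        LDY #$00        ; ...and so does this one
        PLP             ; the flags from the CMP are back
        BEQ FOUND       ; so this branch tests the original comparison
        RTS             ; not an "A"
FOUND   INX             ; hypothetical handling: note the match in X
        RTS
```

Without the PHP and PLP, the BEQ would be testing the result of LDY #$00 and would always branch.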
When you call a subroutine with a JSR command, the stack is called into play automatically. The return address, minus one, is placed on the stack. Later, when the RTS is given, that address is called back from the stack and program execution resumes at the instruction following the JSR.
An example here might be worth while. If you are at location hex 1234, and give the instruction JSR $4455, the address 1236 will be placed on the stack. That's not your return point – you'll return to 1237 since the JSR command is 3 bytes long – but the RTS instruction will sort everything out correctly. Here's a little more detail: when the address 1236 is placed on the stack it will use two locations. The high-order part (12) goes onto the stack first, followed by the low-order portion (36).
When the 6502 receives an interrupt – and on the PET this happens 60 times a second – the current machine language instruction completes; the address of the next instruction is pushed to the stack; and finally, the processor Status Word is pushed to the stack. Then the processor starts to handle the interrupt by going to a new location and executing instructions there. When it's finished, it gives a Return from Interrupt instruction (RTI) which restores the original Status Word and instruction address. The original program picks up exactly where it left off when it was interrupted.
You can see that three locations are used in the stack this time: two for the return address and one for the status word. They go onto the stack in that order: address-high, address-low, and status.
It's a little like a JSR followed by a PHP, since we store address and status word. Note, however, that the address is the exact return address; with a JSR the address is one less than the return address.
An example: if the processor is executing a three-byte instruction at hex 1234 and an interrupt is signalled, address 1237 is pushed to the stack, followed by the status word. Later, when RTI is executed, the status word is restored and execution resumes at address 1237.
Finally, the BRK instruction (hex 00) causes an interrupt type of action, with this difference: the address placed on the stack is two locations beyond the address of the BRK instruction itself. This is odd, since the BRK command is only one byte long. As a result, if we use an RTI to continue executing the code following the BRK, we'll skip one byte. For example, a BRK at hex 1234 pushes address 1236, so the byte at 1235 is never executed.
The following table summarizes the instructions that handle the stack.
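| Instructions | Number of bytes stored or recalled |
| PHA / PLA | 1 (the A register) |
| PHP / PLP | 1 (the Status Word) |
| JSR / RTS | 2 (return address minus one) |
| Interrupt / RTI | 3 (return address and Status Word) |
| BRK / RTI | 3 (BRK address plus two, and Status Word) |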
Here's where it gets interesting. "Ordinary" programming assumes that you use the companion instruction to restore the stack. That is, if you used a PHA you should use a PLA to bring the information back. If you used a JSR, you should use an RTS.
But by breaking this unwritten law, we can do some pretty fancy things. We must be careful, of course.
The Fun Begins
Suppose you are writing a subroutine. Normally, you'll want to return control to the calling point by giving the RTS command. On occasion, however, you don't want to go back; perhaps there's an error in the data so that the calling routine couldn't continue.
We can handle this. Just pop the return address from the stack with two PLA commands, and you'll never go back.
A little more detail on the tricks we can use here. Suppose you have a main routine at A, which calls a subroutine at B. Subroutine B, in turn, calls subroutine C several times. Subroutine C, which might for example be digging out a parameter for the SAVE command, decides it doesn't want to go back to subroutine B for some reason; perhaps there are no more parameters left (e.g., SAVE "PGM" instead of SAVE "PGM", 1,2). In this case, it wants to go straight back to the main routine A.
When subroutine C is called, the stack will contain four values: two for the return address to A, and two for the return address to B. If subroutine C executes: PLA PLA RTS, it will throw away the return to B and go straight back to A.
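As a rough sketch, subroutine C might be laid out like this. The labels SUBC and BAIL and the flag location $0380 are hypothetical; a real routine would make its decision in its own way.

```asm
SUBC    LDA $0380       ; hypothetical flag: is there anything left to fetch?
        BEQ BAIL        ; zero means no more parameters
        RTS             ; normal exit: back to subroutine B
BAIL    PLA             ; discard the low byte of the return address to B
        PLA             ; discard the high byte
        RTS             ; uses the next address on the stack: back to A
```

This works only if subroutine B has pushed nothing else to the stack between being called and calling C. If B did a PHA of its own, the two PLA instructions would remove the wrong bytes and the RTS would go astray.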
Want to find out if you're in Decimal mode? It's unusual in the PET, but the 6502 processor can switch to a special mode for addition and subtraction. There's a flag in the Status word that signals this. You could try a sample addition and see whether the result is calculated in decimal or not: CLC, LDA #$05, ADC #$05. The result will be hexadecimal 0A if you're in binary mode, and hexadecimal 10 if you are in decimal mode.
There's a more straightforward way. Push the Status Word to the stack, and pull it back to the A register - execute PHP, PLA. You can now examine the bits of the Status Word at your leisure. Decimal mode is flagged in bit 3; you could mask it with AND #$08, for example.
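Put together, the test could be sketched as follows. The label DECMOD and the decision to clear Decimal mode when it is found are assumptions made for the sake of the example.

```asm
        PHP             ; push the Status Word
        PLA             ; pull it back into A
        AND #$08        ; keep only bit 3, the Decimal flag
        BNE DECMOD      ; non-zero: Decimal mode is on
        RTS             ; binary mode; nothing to do
DECMOD  CLD             ; hypothetical response: switch back to binary
        RTS
```

The stack comes out even: one byte pushed by PHP, one byte pulled by PLA.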
The Computed Jump
You can jump to any single location you choose by using the JMP instruction. There are times, however, when you want to jump to one of several locations depending on some value you have calculated. For example, you might be writing a system which would jump to one routine if it detected an A (add) character; another routine for D (delete); a third for C (change); and so on.
This could be done, of course, with a series of compare, branch and jump instructions; but if the list is long, the whole thing becomes tedious and inefficient.
You can set up the equivalent of a very powerful computed jump by clever use of the stack. The principle is to manufacture an address; push it to the stack with PHA ... PHA; and then give RTS.
This seems puzzling at first. How can you return to a place you never came from? It works this way: by pushing the address to the stack, you simulate a non-existent subroutine call. The stack doesn't care. If you issue an RTS instruction, the stack will deliver up that address, and that's where you will go. The stack ends up unchanged: it has pushed two values and delivered them back.
Remember that the RTS instruction expects the address to be one lower than the real return address. If you want to go to address hex 3456, you must push the values 34 and 55.
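For that single fixed destination, the pushes might look like this. It is only a sketch of the principle; with one fixed address a plain JMP would of course do the job.

```asm
        LDA #$34        ; high byte of the destination
        PHA
        LDA #$55        ; low byte, less one ($56 - 1)
        PHA
        RTS             ; "returns" to $3456
```

Watch for a low byte of zero: subtracting one borrows from the high byte, so to reach hex 3500 you would push 34 and FF.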
A quick example may help illustrate this powerful technique. Suppose the X register contains a value from 0 to 5. Depending on this value, we wish to jump to one of six different locations. We have built the destination addresses into a set of address tables, each with six entries. The low order part of the addresses are in a table starting at hex 2320 and the high order part of the addresses are in a second table starting at hex 2326. We've carefully remembered to subtract one from each address, and the table looks like the following:
2320 41 72 A3 C4 E5 F6
2326 24 25 27 29 2B 2C
If X contains zero, we want to jump to hex 2442; if one, we go to 2573; and so on. Let's do it.
BD 26 23   LDA $2326,X   ; high order first
48         PHA
BD 20 23   LDA $2320,X   ; low order last
48         PHA
60         RTS           ; go there
It's easy, it's fast, it's compact, and it's one of the most powerful tricks in the repertoire of the 6502 programmer. Microsoft Basic uses it to get to the various routines - PRINT, LET, FOR ... etc. The Machine Language Monitor uses it to interpret its commands - .M, .R, and so on.
The Stack Pointer
There are a couple of commands that tell you where the stack is working, or allow you to control where it is. You won't need to use them very often, but a little detail is worth while.
The stack works in a downwards direction. As you push things, the stack pointer gets lower. As you pull them back, the pointer goes back up. If the stack pointer gets down to its bottom value, pointing at address hex 0100, it will wrap around to 01FF if you try pushing more things in; but your program will be in serious trouble long before you reach this point.
The stack pointer indicates the next location that will be used. If the pointer has a value of 92, you know that the next value that you push will go into address 0192; or the next value that you pull will come from address 0193.
The command TXS - Transfer X to Stack Pointer, hex 9A - is most often used to reset the stack completely. This cancels everything: subroutines, interrupts, the whole works. On most PETs, you should set the pointer to hex FA with: LDX #$FA, TXS.
The command TSX – Transfer Stack Pointer to X, hex BA – is used for a couple of things.
You can check to see if you have too many things on the stack by coding TSX, CPX #$40 ... use whatever pointer limit you think is reasonable, but hex 40 is about as low as you should ever allow it to get.
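One way to package that check is as a small subroutine that reports back through the Carry flag. The label FULL and the choice of Carry as the signal are assumptions for this sketch.

```asm
        TSX             ; copy the stack pointer to X
        CPX #$40        ; compare with the chosen limit
        BCC FULL        ; carry clear: pointer is below $40
        CLC             ; report "room available" with carry clear
        RTS
FULL    SEC             ; report "stack nearly full" with carry set
        RTS
```

The caller can follow the JSR with BCS to its own error handling. Remember that calling this routine itself uses two stack locations for the return address.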
The stack is in memory, of course; so you can look through the stack directly by examining the contents of locations 0100 to 01FF. You'll need to know where to start, of course, and TSX comes in handy here. If you give TSX followed by LDA $0100,X you will load the location to which the stack points. That may not be too useful, since it's the next location to be filled, and isn't part of the "active" stack. By incrementing X, however, or by looking higher with an instruction such as LDA $0101,X you can get to whatever part of the stack interests you.
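As an illustration, a subroutine can use this to find out where it was called from without disturbing the stack. The label WHERE is hypothetical, and the sketch assumes nothing has been pushed since the JSR that called it.

```asm
WHERE   TSX             ; X = stack pointer
        LDA $0101,X     ; low byte of our return address (less one)
        LDY $0102,X     ; high byte of our return address
        RTS             ; stack untouched; A and Y describe the caller
```

The low byte sits at the lower address because it was pushed last. The address recovered this way points at the last byte of the calling JSR instruction, one less than the true return point.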
Most of the time, the stack will take care of itself. Occasionally, however, you'll find that digging a little deeper into the mechanics of the stack can make it possible to do some very effective coding.