Classic Computer Magazine Archive COMPUTE! ISSUE 123 / NOVEMBER 1990 / PAGE G-20

MACHINE LANGUAGE
MORE BIT ANALYSIS

JIM BUTTERFIELD

This month, we'll show how the BIT instruction may be used to perform certain tests. Here's our project: Given a 6502 opcode, we want to find out how long the instruction is.

A pattern in the opcodes allows us to guess the length: If an instruction (in hexadecimal) ends with D, it's a length-3 opcode. But some patterns are not that simple. Opcode $20 (JSR) has length 3, code $30 (BMI) has length 2, and code $40 (RTI) has length 1. The test will need to be constructed carefully.

Standard disassemblers use a lookup table to determine an instruction's length.

The code that follows is more compact, and it shows a new way to use the BIT instruction.

Normally, a programmer would examine specific bits by masking them with AND and then performing a comparison. To continue testing, the original value would need to be loaded again so that a new mask could be applied. The BIT instruction has a built-in AND test that doesn't disturb the values being tested. It's more efficient.
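
To make the difference concrete, here is a minimal sketch in Python that models the two approaches. It is not part of the original listing; the function names bit and and_then_test are made up for illustration, and only the flag behavior of the absolute-mode BIT instruction is modeled.

```python
def bit(a, m):
    """Model 6502 BIT: flags come from A AND M, but A is left unchanged."""
    z = (a & m) == 0      # Z is set when A and the mask share no bits
    n = bool(m & 0x80)    # N copies bit 7 of the memory operand
    v = bool(m & 0x40)    # V copies bit 6 of the memory operand
    return a, z, n, v

def and_then_test(a, m):
    """Model AND: the same Z result, but the accumulator is overwritten."""
    a = a & m
    return a, a == 0

opcode = 0xA9
print(bit(opcode, 0x08))            # accumulator still holds 0xA9 (169)
print(and_then_test(opcode, 0x08))  # accumulator now holds 0x08
```

After the AND version, the opcode would have to be reloaded before the next mask could be tried; after BIT it is still sitting in A.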

The following program runs on all Commodore 8-bit computers. Assume that the opcode to be analyzed is in the A register. It won't be disturbed during our analysis program; it will still be there when we've finished, and the length value will be in the X register.

First, test specifically for the one instruction that defies the pattern: JSR, opcode $20, with a length of 3.

2045  LDX   #$03   ;may be length 3
2047  CMP   #$20   ;test for $20
2049  BEQ   $2069  ;yes, so we're done

Address $2069 represents the end of our analysis. As you can see above, we've preloaded X with 3—the right value—so we can branch directly to our completion address. Preloading X makes for smooth coding.

Next, we test the opcode in A against a fixed mask of $9F stored at address $2081. (Wouldn't it be nice to have immediate-mode addressing available for the BIT instruction?)

If none of the bits match, the Z flag will be set. Mask $9F has six bits set. The only instructions that will set the Z flag are opcodes $00 (BRK), $40 (RTI), and $60 (RTS). Value $20 would also set it, but we've already handled that case. When any of these length-1 codes is detected, the program goes to $2069.

204B  LDX   #$01   ;may be length 1
204D  BIT   $2081  ;test against $9F
2050  BEQ   $2069  ;exit if it is
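
The claim about mask $9F is easy to check away from the machine. This short Python sketch (an illustration, not part of the original program) runs all 256 byte values against the mask and keeps those that would leave Z set.

```python
MASK = 0x9F  # the byte stored at $2081

# Values for which BIT $2081 leaves Z set: no bit in common with the mask.
z_set = [op for op in range(256) if op & MASK == 0]
print([f"${op:02X}" for op in z_set])  # ['$00', '$20', '$40', '$60']
```

Only bits 5 and 6 are free to vary, which is why exactly four values get through, and why $20 had to be dealt with first.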

Now we test against a mask of $08 stored at $2082. Only a single bit is set in this number. Which opcodes will it extract? If you wrote the opcode in hexadecimal, you'd see that this coding will identify all instructions whose last digit (in hex) is less than 8. Opcodes like $A2 (LDX), $30 (BMI), $85 (STA), and dozens of others will take this exit with length 2.

2052  LDX   #$02   ;may be length 2
2054  BIT   $2082  ;test against $08
2057  BEQ   $2069  ;exit if it is
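
As a quick check on the single-bit mask, this Python sketch (illustrative only) lists the low hex digits that leave Z set and counts how many of the 256 byte values would take the branch at this step. The count includes the four values already caught by the earlier tests.

```python
MASK = 0x08  # the byte stored at $2082

# Low hex digits with bit 3 clear, so BIT leaves Z set.
low_digits = [d for d in range(16) if d & MASK == 0]

# Every byte value that would take the BEQ if it reached this test.
caught = [op for op in range(256) if op & MASK == 0]

print(low_digits)   # [0, 1, 2, 3, 4, 5, 6, 7]
print(len(caught))  # 128
```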

About half of the possible opcodes now have been identified. Next, we extract the codes whose hex representation ends in 8 or A. We accomplish this by using a mask of $05, which is stored at $2083.

2059  LDX   #$01   ;may be length 1
205B  BIT   $2083  ;test against $05
205E  BEQ   $2069  ;exit if it is
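
To see why $05 picks out 8 and A, it helps to try the mask on just the low digits that are still in play. The Python sketch below is an illustration only; the names exits and stays are made up for the example.

```python
MASK = 0x05  # the byte stored at $2083

remaining = range(0x8, 0x10)  # low digits that survived the $08 test

exits = [f"{d:X}" for d in remaining if d & MASK == 0]
stays = [f"{d:X}" for d in remaining if d & MASK != 0]

print(exits)  # ['8', 'A']
print(stays)  # ['9', 'B', 'C', 'D', 'E', 'F']
```

Bit 3 is already known to be set at this point, so the mask only needs to look at bits 0 and 2.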

All that's left are opcodes ending in (hex) 9, B, C, D, E, and F. Those ending in B and F are not legitimate instructions. The remaining opcodes are length 3, with one important exception. An even first digit (in hex) followed by 9 will be a length-2 instruction. For example, LDA immediate is coded as $A9. We can test for this combination with a mask of $16.

2060  LDX   #$02   ;may be length 2
2062  BIT   $2084  ;test against $16
2065  BEQ   $2069  ;exit if it is
2067  LDX   #$03   ;else set length 3
2069  (analysis is complete)
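
Putting the five tests together, the whole routine can be modeled in a few lines of Python. This is a sketch for checking the logic, not a replacement for the machine language; the function name opcode_length is made up for the example, and the test values are the opcodes mentioned in this column plus $AD (LDA absolute).

```python
def opcode_length(op):
    """Mirror the routine's tests in the same order; return the value left in X."""
    if op == 0x20:          # CMP #$20: JSR, the one exception
        return 3
    if op & 0x9F == 0:      # BIT $2081: $00, $40, $60
        return 1
    if op & 0x08 == 0:      # BIT $2082: low digit 0 to 7
        return 2
    if op & 0x05 == 0:      # BIT $2083: low digit 8 or A
        return 1
    if op & 0x16 == 0:      # BIT $2084: even high digit, low digit 9
        return 2
    return 3                # everything that fell through

for op in (0x20, 0x30, 0x40, 0xA2, 0x85, 0xA9, 0xAD):
    print(f"${op:02X} -> length {opcode_length(op)}")
```

Like the original, the model makes no attempt to reject undefined opcodes; it simply reports the length the pattern implies.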

The BIT instruction came through with stunning elegance and efficiency, but it takes time and care to get the masks correct and in their most efficient order.

In the accompanying BASIC program, I've added a hex input routine to precede the above code and a brief output routine to follow it.

QR 100 DATA 160, 0, 185, 133, 32, 32, 210, 255, 200, 201
MJ 110 DATA 32, 208, 245, 32, 228, 255, 201, 71, 176, 249, 201
RF 120 DATA 48, 144, 245, 32, 210, 255, 56, 233, 48, 201, 10
GJ 130 DATA 144, 2, 233, 7, 10, 10, 10, 10, 141, 0, 37
EK 140 DATA 32, 228, 255, 201, 71, 176, 249, 201, 48, 144, 245
JR 150 DATA 32, 210, 255, 56, 233, 48, 201, 10, 144, 2, 233, 7
PB 160 DATA 13, 0, 37, 162, 3, 201, 32, 240, 30, 162, 1
GP 170 DATA 44, 129, 32, 240, 23, 162, 2, 44, 130, 32, 240, 16
BX 180 DATA 162, 1, 44, 131, 32, 240, 9, 162, 2, 44, 132, 32
JF 190 DATA 240, 2, 162, 3, 160, 0, 185, 138, 32, 32, 210, 255
KP 200 DATA 200, 201, 58, 208, 245, 138, 9, 48
PS 210 DATA 32, 210, 255, 169, 13, 76, 210, 255
XG 220 DATA 159, 8, 5, 22, 72, 69, 88, 63, 32
HD 230 DATA 61, 76, 69, 78, 58
PP 300 FOR J = 8192 TO 8334
DS 310 READ X: T = T+X
SA 320 POKE J, X : NEXT J
KX 330 IF T<> 16245 THEN STOP
BH 340 SYS 8192
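
Before typing the loader in, you can rehearse lines 300 to 330 on another machine. The Python sketch below is an illustration, assuming the 143 values transcribed here, which are the set that adds up to the 16245 tested in line 330. The dictionary named memory is a stand-in for the computer's RAM.

```python
DATA = [
    160, 0, 185, 133, 32, 32, 210, 255, 200, 201,
    32, 208, 245, 32, 228, 255, 201, 71, 176, 249, 201,
    48, 144, 245, 32, 210, 255, 56, 233, 48, 201, 10,
    144, 2, 233, 7, 10, 10, 10, 10, 141, 0, 37,
    32, 228, 255, 201, 71, 176, 249, 201, 48, 144, 245,
    32, 210, 255, 56, 233, 48, 201, 10, 144, 2, 233, 7,
    13, 0, 37, 162, 3, 201, 32, 240, 30, 162, 1,
    44, 129, 32, 240, 23, 162, 2, 44, 130, 32, 240, 16,
    162, 1, 44, 131, 32, 240, 9, 162, 2, 44, 132, 32,
    240, 2, 162, 3, 160, 0, 185, 138, 32, 32, 210, 255,
    200, 201, 58, 208, 245, 138, 9, 48,
    32, 210, 255, 169, 13, 76, 210, 255,
    159, 8, 5, 22, 72, 69, 88, 63, 32,
    61, 76, 69, 78, 58,
]
BASE = 8192  # $2000, the first address POKEd by line 300

memory = {}  # stand-in for RAM: address -> byte
total = 0
for offset, value in enumerate(DATA):
    memory[BASE + offset] = value  # POKE J, X
    total += value                 # T = T + X

print(len(DATA), total)  # 143 16245
print([f"${memory[a]:02X}" for a in range(0x2081, 0x2085)])  # the four masks
```

The last line prints the bytes at $2081 through $2084, which should be the masks $9F, $08, $05, and $16 used by the analysis code.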