Anti-Hesitation Programming: A Tutorial On Arrays

M. R. Smith
Calgary, Alberta, Canada

Editor's Note: The delays discussed and corrected in this article are a problem common to Microsoft BASICs (Apple, PET/CBM, OSI, etc.). Because the Atari has a different variable storage format, no hesitation is observed using the structure of Program 1. Atari BASIC, though, is similar to Microsoft with respect to GOTO—it searches the program for the target from top to bottom. And the time-saving effect of relocating REMs can be seen on the Atari.—RTM

Have you ever had a series of hesitations or pauses occurring at the start of your BASIC program? They are particularly obvious when using loops or subroutines. The first time into a FOR…NEXT loop, the program seems to hiccup and pause. Thoughts of the dreaded infinite loop occur, but then the program seems to recover. The second time into the loop, the response is so fast that the screen smokes. What causes this alteration in behaviour?

To demonstrate the effect, enter and run Program 1:

 1 REM PROGRAM #1
20 PRINT "LINE 20": DIM A(500), B(500), C(500)
30 PRINT "LINE 30"
40 FOR H = 1 TO 5:I = 1
50 J = 1: K = 1: PRINT "LINE 50"
60 NEXT H
70 FOR I = 1 TO 5: PRINT "LINE 70"
80 L = 1: M = 1: P = 1: PRINT "LINE 80"
90 NEXT I : STOP

You'll notice a pause between line 20 and line 30. More pauses occur before lines 50 and 80. However, the next four times that the program gets to these lines, there is no pause.

On adding just one statement, line 10, to this program, you'll notice a real difference.

 1 REM PROGRAM #2
10 H = 0 : I = 0 : J = 0 : K = 0 : REM INITIALIZE VARIABLES
20 PRINT "LINE 20" : DIM A(500), B(500), C(500)
30 PRINT "LINE 30"
40 FOR H = 1 TO 5 : I = 1
50 J = 1 : K = 1 : PRINT "LINE 50"
60 NEXT H
70 FOR I = 1 TO 5 : PRINT "LINE 70"
80 L = 1 : M = 1 : P = 1 : PRINT "LINE 80"
90 NEXT I : STOP

In this version, the pause before line 50 has disappeared. This change occurs because the simple variables H, I, J and K are named in line 10 of the program. This means that these variables are used before any of the arrays A(500), B(500) and C(500) are made.

To explain why all this occurs, you have to understand how a BASIC interpreter stores things in the computer memory. In the middle of a program (say line 90 of Program 1), memory is split up like this:

----------- BOTTOM
PROGRAM
-----------
SIMPLE VARIABLES
-----------
ARRAYS
-----------
UNUSED
-----------
CHARACTER ARRAYS
--------TOP

For each variable, array or string used in the program, there is a definite place reserved in memory.

Before we ran the program, things looked a lot simpler.

------BOTTOM
PROGRAM
------
UNUSED
-----TOP

After line 20 in Program 1, things were different yet again.

------BOTTOM
PROGRAM
------
A(500)
B(500)       ARRAYS
C(500)
------
UNUSED
---------TOP

The first pause in the program, before line 30, occurred while the arrays were being set up. The second pause occurred when the variables H and I were used for the first time. After line 40, the memory allocation was like this.

---------BOTTOM
PROGRAM
---------
H
I        SIMPLE VARIABLES
---------
A(500)
B(500)   ARRAYS
C(500)
---------
UNUSED
---------TOP

To make room for the variable H, the BASIC interpreter first had to move the arrays A(), B() and C() higher up in memory. Then it had to move these arrays again to find room for variable I. During line 50, the arrays needed to be moved twice more: first for variable J and then to place variable K. All this movement caused the second pause. The more there is to move and the more variables there are to place, the longer the pause will be.
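
To get a feel for how much has to move, here is a rough estimate written in BASIC. It assumes five bytes for each real array element and seven bytes of header for each one-dimensional array, which, as far as I know, is the layout used by Applesoft and Commodore BASIC. Other Microsoft BASICs may differ, so treat both figures as assumptions. The variable names are chosen only for this sketch.

10 REM ROUGH ESTIMATE OF BYTES MOVED FOR EACH NEW SIMPLE VARIABLE
20 N = 3: REM NUMBER OF ARRAYS IN PROGRAM 1
30 E = 501: REM DIM A(500) HOLDS ELEMENTS 0 TO 500
40 B = 5: REM BYTES PER REAL ELEMENT (ASSUMED)
50 H = 7: REM HEADER BYTES PER ARRAY (ASSUMED)
60 PRINT N * (E * B + H); " BYTES MOVED"

Under these assumptions the answer is 7536, so roughly 7.5K bytes are shifted each time one new simple variable appears after the arrays have been made.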

The second time around in the FOR…NEXT loop, places for the variables H to K were already available in memory, so no more pauses occurred. The pauses, however, started again when the arrays had to be moved three times to provide room for the variables L, M and P in line 80.

In BASIC, each time a simple variable is used for the first time, all the arrays then in existence have to be moved up in memory. This causes a pause in the execution of the program. If a large number of variables is introduced, these pauses can accumulate into a sizeable delay. To avoid the pauses, we have to initialize (that means establish) all simple variables before we introduce any arrays.
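
If your machine has a clock that BASIC can read, you can measure the pause rather than just watch for it. The sketch below is for PET/CBM machines only, where the reserved variable TI counts sixtieths of a second (jiffies). The Apple has no such clock, so there you must fall back on your watch. The variable names T, W, X, Y and Z are chosen only for this example, and the three arrays need about 7.5K of free memory.

10 REM TIMING SKETCH FOR PET/CBM ONLY (USES THE TI CLOCK)
20 T = 0: REM CREATE T BEFORE ANY ARRAYS EXIST
30 DIM A(500), B(500), C(500)
40 T = TI
50 W = 1: X = 1: Y = 1: Z = 1: REM FOUR NEW VARIABLES, FOUR MOVES
60 PRINT "FIRST USE:"; TI - T; "JIFFIES"
70 T = TI
80 W = 1: X = 1: Y = 1: Z = 1: REM ALREADY EXIST, NOTHING MOVES
90 PRINT "SECOND USE:"; TI - T; "JIFFIES"

Line 20 matters: if T were first used in line 40, making room for it would itself move the arrays and blur the measurement. The first figure should come out noticeably larger than the second, though the exact numbers depend on the machine.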

To understand how this improves things, consider the memory after line 20 in Program 2. It looked like this:

-------BOTTOM
PROGRAM
-------
H
I      SIMPLE VARIABLES
J
K
-------
A(500)
B(500)  ARRAYS
C(500)
-------
UNUSED
-------TOP

This is very different from the appearance of the memory after line 20 of Program 1. When the program reaches line 40, the variables H to K will already have been fitted into memory, so the arrays will not need to be moved and the pause before line 50 vanishes. At line 80, new variables (L, M and P) will again have to be placed in memory, which means a pause while all the arrays move over. You can see the advantage of predefining all the simple variables before the arrays.

Systematic Initialization

Taking a systematic approach to the initialization of variables in a program can prevent a lot of problems. Program 2, rewritten for systematic initialization, might look something like this:

 1 REM PROGRAM #2 NEW
10 GOSUB 60000: REM DO INITIALIZATION
20 PRINT "LINE 20"
30 PRINT "LINE 30"
40 FOR H = 1 TO 5: I = 1
50 J = 1: K = 1: PRINT "LINE 50"
60 NEXT H
70 FOR I = 1 TO 5: PRINT "LINE 70"
80 L = 1: M = 1: P = 1: PRINT "LINE 80"
90 NEXT I: STOP
59990 REM
60000 REM INITIALIZE SIMPLE VARIABLES
60010 REM VARIABLES A - E
60020 H = 0: I = 0: J = 0: REM VARIABLES F - J
60030 K = 0: L = 0: M = 0: REM VARIABLES K - O
60040 P = 0: REM VARIABLES P - T
60050 REM VARIABLES U - Z
60100 REM INITIALIZE ARRAYS
60110 DIM A(500), B(500), C(500)
60200 RETURN

This does seem to overdo things for such a short program, but this approach does have advantages for long programs.

1) Use a subroutine for initialization.

There is an obscure advantage to doing initialization in a subroutine, rather than putting the equivalent of lines 60000 - 60200 at the beginning of the program. It lies in the way that the BASIC interpreter handles GOSUB and GOTO commands. When one of these commands occurs, most BASIC interpreters skip to the beginning of the program. They then look at every line number (including those of REM statements) until they find the line number wanted. Suppose that statements which are used only once in a program are placed at its start. The interpreter then wastes time passing over these lines each time it searches for the line number it wants. Placing these lines at the end of the program is a simple way of speeding up your programs. The saving is greatest when a GOTO command is issued from the middle of a FOR…NEXT loop near the end of the program.

The effect can be demonstrated by using the following program.

 1 REM PROGRAM #3
10 PRINT "LINE 10"
20 REM
30 REM
…..keep inserting statements until you have about 40 REM's
430 REM
440 FOR J = 1 TO 2500
450 GOTO 470
460 GOTO 480
470 GOTO 460
480 NEXT J
490 PRINT "490"

This is a short timing loop involving three inter-linked GOTO statements. Measure the time it takes for the program to go between the two PRINT statements using the second hand of your watch. Now remove the REM statements and place them at the end of your program. Time again and notice the difference.

On my APPLE, the timing was 28 seconds with the REM's at the beginning compared to eight seconds with the REM's at the end. Quite a time saving. Shifting the initialization statements of your program can have the same effect. This also works the other way. If you have a subroutine that you use often, then place that at the beginning of the program. That way the BASIC interpreter can find it quickly.
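
Both ideas can be combined in one layout. The sketch below is only an illustration, with line numbers and the variables S and X invented for the purpose: the often-used subroutine sits at the top where the search finds it at once, and the once-only initialization sits at the very end.

10 GOTO 1000: REM JUMP OVER THE SUBROUTINE
20 REM OFTEN-USED SUBROUTINE, NEAR THE TOP
30 S = S + X: RETURN
1000 GOSUB 60000: REM ONCE-ONLY INITIALIZATION
1010 FOR X = 1 TO 100: GOSUB 20: NEXT X
1020 PRINT S: END
60000 REM INITIALIZATION, AT THE END
60010 S = 0: X = 0: RETURN

The program adds up the numbers 1 to 100 and prints 5050. Each of the 100 calls to GOSUB 20 has only one line to pass before it finds its target, while the single GOSUB 60000 pays for the long search just once.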

2) List the variables in groups.

The main advantage of grouping the variables (A to E, F to J, etc.) on separate lines is that it becomes easy to determine if a variable has already been used.

It is not as easy as you might think to determine whether or not a variable has already been used in a program. Consider a long program which uses variable YES at its beginning, and variable YEAR near its end. Many BASIC interpreters consider (since these two variables have the same two starting letters) that they must both be equal to the variable YE. This means that, although you intended the two variables to be different, they are actually being treated as the same variable by the interpreter. Spotting a conflict like this can absorb a lot of time. However, if you put all variables in one location, then you are more likely to spot possible conflicts in names.
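
A three-line test shows whether your own interpreter behaves this way. It uses the two names from the example above.

10 YES = 1
20 YEAR = 1982
30 PRINT YES

On an interpreter that keeps only the first two characters of a name, line 20 overwrites the value set in line 10 and the program prints 1982. If your BASIC treats longer names as distinct, it prints 1 instead.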

Declaring (initializing) all the variables at the beginning of a program can decrease the number of strange pauses in the middle of a program. It also decreases the chance of accidentally getting two independent variables with the same name.