Classic Computer Magazine Archive START VOL. 5 NO. 6 / FEBRUARY/MARCH 1991

PROGRAMMING IN C

C FOR SPEED!

Techniques To Transform Your Turtle Code

BY JOHN ALLEN

The C programming language is famous for its ability to produce speedy and efficient code, but it still can't beat assembly, right? Wrong! If you optimize your code using these techniques, you can get assembly-fast results without losing the power of C.

I have three categories for measuring the speed of a computer program: the first one is the programming time, the time it takes you to write the code; the second is the time it takes to modify the program if the user's requirements change; and the third is the actual time required for the program to execute. Many of the programmer's decisions speed up one category at the expense of another. For example, you can save time by not commenting the program, but that slows down subsequent modifications. If, on the other hand, you spend enormous quantities of time documenting a program and squeezing every cycle out of it, you may not finish before your competitor does. Obviously, you must balances your efforts.

Competing With Assembly
Because they dictate every detail of execution, the very best assembly-language programmers will always produce faster-than-C code. However, good programming techniques combined with exploitation of the power of the C language enables you to write code that is easy to understand and modify, yet runs as fast as stuff written by all but the most experienced assembly programmers. They may have the edge on execution time, but you'll probably spend much less time in the development stage than they do.

There are two phases of code optimization. The first is to eliminate unnecessary run-time cycles by using efficient algorithms. The second is to accomplish each of the necessary operations in as little time as possible without sacrificing too much development time or code clarity.

Read your favorite C
reference book and
familiarize yourself with
the #define macros.


Watch Your Algorithms
The algorithmic methods of code optimization can be used in any language, and you probably already use them to some extent. Here's an example:
   for (i=0; i<(1000; i++){
      a=b+c;
      x[i]= a*i;
      }

The values stored in the variables b and c do not change, therefore the value stored in the variable a does not change after the first iteration. Consequently, the variable a only needs to be set once:
   a = b+c;
   for (i=0; i<1000; i+ +){
   x[i]=a*i;
   }

Be careful not to use unnecessary variables. For example:
   j=0;
   for (i=0; <1000; i++){
      x[i]=i;
      y[i]=j++;
      }

Why increment two variables each loop iteration when one will do? Try this instead:
   for (i=0;i<1000; i++)
      x[i]=i;
      y[i]=i
      }

Another technique is to eliminate common sub-expressions, for example:
   a =(x*y)+1;
   b = 2*(x*y)+3;
   c = a+b+(x*y);

See how x and y are multiplied each time. Looking up the value of a variable is faster than re-multiplying!
   Temp1 = x*y;
   a = templ+1;
   b = 2*temp1+3;
   c = a+b+temp1;

Also, beware of using variables which are really constants.
   a = 1;
   b = c+a;
   d = a+1;

Use a constant instead.
   a = 1;
   b = c+l;
   d=2;

This rule also applies for the following:
   a = 1+2;

which, of course should he
   a = 3;

Most compilers will catch this simple case, but you might as well do it yourself, just in case the compiler doesn't.

Another thing to look for is code like this:
   #define MAXSIZE 5
   a = MAXSIZE+b-1;
 
It would be faster to do:
   #define MAXSIZE               5
   #define MAXSIZEMINUSONE 4
   a = MAXSIZEMINUSONE+b;

However, this makes it harder to modify later, because if MAXSIZE is changed you also must change MAXSIZEMINUSONE. It would probably not he a good trade-off if that line is executed only once, but if it is inside a loop that exeutes 10,000 times, it can really make a difference.

Unrolling Loops
If you are trying to squeeze every last ounce of speed, you can unroll your loops, but this may cost you development or modification time if the number of times the loop is executed has to change. Also, it is more difficult to do this when the loop is controlled by user input. Here is an example of unrolling:
   for (i=0; i<3; i++)
      a[i] =i

becomes
   a[0]=0;
   a[1]= 1;
   a[2] = 2;

This eliminates the increment for each loop iteration, and substitutes an explicit operation for the memory accesses on the variable i.

If unrolling of loops is not practical, perhaps you can combine two or more loops like this:
   for (i=0; i<l000; i++)
      a[i] =i
   for (i=0; i<1000; i++)
      b[i]=i;
into a single loop:
   for (i=0; <1000; i++)
      a[i] = b[i]=i;

You just eliminated one increment and one test instruction for each loop iteration,

All of the optimizations I have shown so far are programming techniques that, for the most part, work in any language. Now we will explore two C-specific techniques, which may also he adapted, with some effort, to other languages.

Using Macros
In many cases a piece of code is broken up into functions to eliminate typing duplicate code and to make things easily modifiable and readable. However, the use of macros will do the same thing, and, since they arc resolved at compile time, they require absolutely no run-time overhead! There is one thing to remember about using #define macros. When passing arguments to macros, whatever you give as an argument will be literally replaced when the macro is expanded. For example:
   #define squared(x) (x*x)
   a = 2;
   b = squared(a++);
is replaced with
   a = 2;
   b = a++*a++;

This may not be what you intended!

The most common use of macros is single-line macros, but C does not limit you to that. Lines that end with the backslash character (\) are combined with the next line, then both the backslash character and the following line feed are deleted during pre-processing:
   #define test_macro(a,b,c,array)  \
               inti;                                \
               for (i=0; i<c; i++)           \
                  a+ = array[i,b];            \

will become
#define test_macro(a,b,c,array) int i for (i=0; i<c; i++) = array[i,b];

Remember that macros are substituted directly into the source file by the pre-processor. You cannot call a macro from itself, or from another macro that it calls, because recursion is not possible. Read your favorite C reference book a couple of times to familiarize yourself with #define macros. Once you understand how macro substitution works, you can avoid the run-time overhead of some function calls.

Shifty Bits
Another way to speed up execution is to take a trick from assembly programmers and fake your powers-of-two division and multiplication:
   5*2 = 10 is, in binary, 0101* 0010 = 1010;
   12/2 = 6 is, in binary, 1100 / 0010 = 0110.

Notice that 10 in binary (1010) is just five shifted to the left by one bit (010l). Also, six is just 12 shifted to the right by one bit. A shift operation is very fast compared to mathematical operations such as multiplication and division. of course, this only works for powers of two, but it :an still help considerably. Here is a straightforward way to divide or multiply integers by two:
   #define divide_2(a)  (a>>1)
   #define mult_2(a)     (a<<1)

The first technique is to
use efficient algorithms.

For division or multiplication by four it would be:
   #define divide_4(a)      (a>>2)
   #define mult_4(a)         (a<<2)

In a more general fashion, you can write macros to divide or multiply by powers of two, where the first argument is the number and the second is the divisor (2. 4, 8, etc.):
   #define divide_2p(a,p)    (a>>p)
   #define mulL_2p(a,p)     (a<<p)

If you divide an odd integer such as five by two using mathematical division, you will get the real number 2.5. If you use the shifting method you will get the integer two; the remainder is lost. This is the same as the DIV operation in Pascal. Remember that shifting numbers in lieu of multiplication or division only works for integers, not for floating point numbers.

Power Tools
These tools can he very powerful when used properly but like all tools, they are only as efficient as the person using them. Being efficient takes some skill and a lot of practice - so practice! This is certainly not a complete discussion on how to optimize programs; the art of programming is too complex to allow such a thing in one article. But that same complexity leaves a lot of room for improvement, and I hope that the presentation of these techniques encourages you to further develop your own skills.

John Allen started programming in C when he got an original 520ST with TOS-on-disk back in 1985.