Re-engineering loops for efficiency (ii)

There are a couple more ways of improving loop efficiencies:

Loop unrolling

Every iteration of a loop requires modification and testing of the loop variable. This over- head can be reduced if the number of iterations of the loop is small, say less than five. This is called unrolling the loop. The previous example in loop fusion, can be unrolled, further reducing overhead:

int i, a[1000], b[1000];
for (i=0; i<1000; i=i+2)
{
    a[i] = 0;
    b[i] = 0;
    a[i+1] = 0;
    b[i+1] = 0; 
}

The most extreme case is to eliminate the loop completely and work as sequential line code. For example:

for (i=1; i<=3; i=i+1) {
    x = x + sqrt(i);
}

would arguably be better expressed as:

x = x + sqrt(1) + sqrt(2) + sqrt(3);

removing loop-independent expressions

Expressions in loops that perform a calculation independent of the loop should be evalu- ated outside the loop. This is sometimes known as loop streamlining:

for (i=1; i<=1000; i=i+1)
    sum = sum + pow(x,4.0);

should be replaced by

powX = pow(x,4.0);
for (i=1; i<=1000; i=i+1)
    sum = sum + powX;

We remove the expression pow(x,4.0) because its calculation is independent of the loop and would otherwise be calculated 1000 times. Consider the following nested loop:

for (i=1; i<=100; i=i+1)
    for (j=1; j<=100; j=j+1)
        for (k=1; k<=100; k=k+1)

The nested structure loops 100 × 100 × 100 = 1,000,000 times, meaning that a statement such as pow(x,4.0) would be executed 1 million times. For a nested loop struc- ture, the deeper a loop is nested, the higher the dividend with respect to efficiency. Always optimize inner loops first. The same principle can be applied to nested loops, by moving independent expres- sions from an inner loop outward. For example, some repeated calculations make use of the subscript as part of the calculation:

for (i=1; i<=100; i=i+1)
    for (j=1; j<=100; j=j+1)
        A[i][j] = B[i][j] + c/i;

The calculation c/i is a repeat calculation and should be removed from the inner loop. A better way to code this is

for (i=1; i<=100; i=i+1){
    ci = c / i;
    for (j=1; j<=100; j=j+1)
        A[i][j] = B[i][j] + ci;
}

Now the value c/i is calculated 100 times instead of 10,000 times.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s