# Re-engineering loops for efficiency (ii)

There are a couple more ways of improving loop efficiency:

### Loop unrolling

Every iteration of a loop requires modification and testing of the loop variable. This overhead can be reduced by repeating the loop body several times within each iteration, so that fewer iterations are needed; if the number of iterations is small, say fewer than five, the loop can even be removed altogether. This is called unrolling the loop. The previous example from loop fusion can be unrolled, further reducing overhead:

```
int i, a[1000], b[1000];
for (i=0; i<1000; i=i+2)
{
    a[i] = 0;
    b[i] = 0;
    a[i+1] = 0;
    b[i+1] = 0;
}
```
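The unrolled loop above relies on the array length being a multiple of two. As a minimal sketch of how the same idea might be written for an arbitrary length, the function below adds a cleanup step for a leftover element. The function name `zero_both_unrolled` and the parameter `n` are illustrative assumptions, not part of the original example.

```c
#include <stddef.h>

/* Zero the first n elements of a and b, unrolled by a factor of two.
   zero_both_unrolled and n are hypothetical names used for illustration. */
void zero_both_unrolled(int a[], int b[], size_t n)
{
    size_t i;

    /* Main loop: two elements per iteration, so the loop variable is
       tested and updated about n/2 times instead of n times. */
    for (i = 0; i + 1 < n; i = i + 2) {
        a[i] = 0;
        b[i] = 0;
        a[i + 1] = 0;
        b[i + 1] = 0;
    }

    /* Cleanup: one element is left over when n is odd. */
    if (i < n) {
        a[i] = 0;
        b[i] = 0;
    }
}
```

When the length is known to be even, as with the 1000-element arrays above, the cleanup step is never taken and the function behaves like the original unrolled loop.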

The most extreme case is to eliminate the loop completely and write the statements as straight-line code. For example:

```
for (i=1; i<=3; i=i+1) {
    x = x + sqrt(i);
}
```

would arguably be better expressed as:

```
x = x + sqrt(1) + sqrt(2) + sqrt(3);
```

### Removing loop-independent expressions

Expressions inside a loop whose value does not depend on the loop should be evaluated outside it. This is sometimes known as loop streamlining. For example:

```
for (i=1; i<=1000; i=i+1)
    sum = sum + pow(x,4.0);
```

should be replaced by

```
powX = pow(x,4.0);
for (i=1; i<=1000; i=i+1)
    sum = sum + powX;
```
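To make the saving concrete, the sketch below counts how often the loop-independent expression is evaluated in each version. It assumes a hypothetical helper, `fourth_power`, standing in for `pow(x,4.0)` so that the calls can be counted; the counter `fourth_power_calls` and the two function names are likewise illustrative.

```c
/* Counts how many times fourth_power has been called (illustrative). */
long fourth_power_calls = 0;

/* Hypothetical stand-in for pow(x, 4.0) that records each call. */
double fourth_power(double x)
{
    fourth_power_calls = fourth_power_calls + 1;
    return x * x * x * x;
}

/* Loop-independent expression left inside the loop: 1000 evaluations. */
double sum_in_loop(double x)
{
    double sum = 0.0;
    int i;
    for (i = 1; i <= 1000; i = i + 1)
        sum = sum + fourth_power(x);
    return sum;
}

/* Expression evaluated once before the loop: 1 evaluation. */
double sum_hoisted(double x)
{
    double sum = 0.0;
    double powX = fourth_power(x);
    int i;
    for (i = 1; i <= 1000; i = i + 1)
        sum = sum + powX;
    return sum;
}
```

Resetting `fourth_power_calls` to zero before each call and reading it afterwards shows 1000 evaluations for `sum_in_loop` and a single evaluation for `sum_hoisted`, while both return the same sum.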

We move the expression pow(x,4.0) out of the loop because its value does not depend on the loop variable and would otherwise be recalculated 1000 times.

Consider the following nested loop:

```
for (i=1; i<=100; i=i+1)
    for (j=1; j<=100; j=j+1)
        for (k=1; k<=100; k=k+1)
            /* loop body */;
```

The nested structure loops 100 × 100 × 100 = 1,000,000 times, meaning that a statement such as pow(x,4.0) placed in the innermost loop would be executed 1 million times. In a nested loop structure, the deeper a loop is nested, the greater the payoff from making it more efficient. Always optimize inner loops first.

The same principle can be applied within nested loops, by moving expressions that are independent of an inner loop outward. For example, some repeated calculations make use of a subscript as part of the calculation, but not the subscript of the innermost loop:

```
for (i=1; i<=100; i=i+1)
    for (j=1; j<=100; j=j+1)
        A[i][j] = B[i][j] + c/i;
```

The calculation c/i does not depend on j, yet it is repeated on every pass of the inner loop, so it should be moved out of the inner loop. A better way to code this is:

```
for (i=1; i<=100; i=i+1) {
    ci = c / i;
    for (j=1; j<=100; j=j+1)
        A[i][j] = B[i][j] + ci;
}
```

Now the value c/i is calculated 100 times instead of 10,000 times.
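As a self-contained sketch, the two versions can be wrapped in functions and compared. The text does not give the types of A, B and c, so the sketch assumes they are `double`; the constant `N` and the function names are also illustrative. The arrays are declared with `N + 1` rows and columns so that the 1-based subscripts used above stay in bounds.

```c
#define N 100  /* loop bound from the example; the name N is an assumption */

/* Original form: c / i is recomputed for every j (N * N divisions). */
void add_c_over_i_naive(double A[N + 1][N + 1], double B[N + 1][N + 1],
                        double c)
{
    int i, j;
    for (i = 1; i <= N; i = i + 1)
        for (j = 1; j <= N; j = j + 1)
            A[i][j] = B[i][j] + c / i;
}

/* Hoisted form: c / i is computed once per value of i (N divisions). */
void add_c_over_i_hoisted(double A[N + 1][N + 1], double B[N + 1][N + 1],
                          double c)
{
    int i, j;
    double ci;
    for (i = 1; i <= N; i = i + 1) {
        ci = c / i;                  /* depends on i only */
        for (j = 1; j <= N; j = j + 1)
            A[i][j] = B[i][j] + ci;
    }
}
```

Both functions should fill A with the same values for the same B and c; only the number of divisions differs.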