How efficient is your code?

How often do you think about how efficient your code is? Coding on an average computer is easy because of all the resources at hand: lots of disk space, lots of memory. The average programmer probably doesn’t think much about efficiency until they have to work on software for mobile devices and the like. Sure, solid-state memory is quite good on these devices, but let’s face it, battery life isn’t, and that’s the crux of many apps – too complicated and they will churn over the processor, which in turn will eat up the battery, more so if the app uses the camera a lot. Imagine stepping back to the early 1970s and having to develop code for the Voyager deep space probes.

The hardware on both Voyagers had 69.63 kilobytes of memory. The processors in the Flight Data System could manage 81,000 operations a second. To put that into context, the A13 Bionic chip inside many of Apple’s newest phones can manage 5 trillion operations per second. One device is over 21 billion kilometres from Earth; the other doesn’t move anywhere (or really do much of anything).

Fortran re-engineering: subprograms

In old Fortran programs, subprograms had parameters, but they were never specified as incoming or outgoing parameters. Here is an example subroutine which converts decimal hours to hours, minutes and seconds. An input of 4.45 produces 4.45 hours = 4 h 26 m 60.00 s (okay, so it isn’t perfect).

      subroutine convrt(dtime,hours,mins,secs)
      integer hours,mins
      real dsecs,dtime,secs

      dsecs = dtime*3600.0
      hours = int(dtime)
      secs = dsecs - 3600.0*hours
      mins = int(secs/60.0)
      secs = secs - 60.0*mins
      end

The parameters to the subroutine are merely declared by type. There is no indication as to whether they are input to the subroutine (in), output (out), or both (inout). In modern versions of Fortran, this is achieved using the intent specifier. It is valid not to use intent; it’s just not good programming practice. The modernized code would look like this:

subroutine convrt(dtime,hours,mins,secs)
   integer, intent(out) :: hours,mins
   real, intent(in) :: dtime
   real, intent(out) :: secs
   real :: dsecs

   dsecs = dtime*3600.0
   hours = int(dtime)
   secs = dsecs - 3600.0*hours
   mins = int(secs/60.0)
   secs = secs - 60.0*mins
end subroutine convrt

Note also that the termination of the subroutine has been modified as well. It is also possible to specify the parameters another (maybe less common) way, using separate intent statements:

subroutine convrt(dtime,hours,mins,secs)
   integer :: hours,mins
   real :: dsecs,dtime,secs
   intent(in) :: dtime
   intent(out) :: hours,mins,secs

   dsecs = dtime*3600.0
   hours = int(dtime)
   secs = dsecs - 3600.0*hours
   mins = int(secs/60.0)
   secs = secs - 60.0*mins
end subroutine convrt


How to convert a decimal fraction to binary

In the last post we looked at why 1/10 cannot be represented precisely in digital form. It has to do with its binary representation, so how do we produce that? It’s not that hard really. The fraction 1/10 becomes the decimal 0.1.

To convert a fraction to binary, start with the fraction in question and multiply it by 2, noting the resulting integer and fractional parts. Continue multiplying by 2 until you get a resulting fractional part equal to zero. Then just write out the integer parts from the results of each multiplication, in order. So 0.1 becomes:

0.1 × 2 = 0 + 0.2
0.2 × 2 = 0 + 0.4
0.4 × 2 = 0 + 0.8
0.8 × 2 = 1 + 0.6
0.6 × 2 = 1 + 0.2
0.2 × 2 = 0 + 0.4
0.4 × 2 = 0 + 0.8
0.8 × 2 = 1 + 0.6

So you can see the pattern here, right? The value of 0.1 as a binary number becomes 0.00011001… You can see that the 1001 part recurs ad infinitum, meaning that when converted back to a real number, the value will be imprecise. Let’s try another number: 0.6875

0.6875 × 2 = 1 + 0.375
0.375 × 2 = 0 + 0.75
0.75 × 2 = 1 + 0.5
0.5 × 2 = 1 + 0

So the binary value of the decimal number 0.6875 is 0.1011, exactly. It’s when an infinitely repeating binary fraction has to be truncated and converted back to decimal that the imprecision appears. To convert a binary fraction to decimal, start from the rightmost digit with a total of 0. Add the current digit to the total and divide the result by 2. Continue leftwards until there are no more digits left. Let’s process the binary number 0.00011001:

0.5 × (1 + 0) = 0.5
0.5 × (0 + 0.5) = 0.25
0.5 × (0 + 0.25) = 0.125
0.5 × (1 + 0.125) = 0.5625
0.5 × (1 + 0.5625) = 0.78125
0.5 × (0 + 0.78125) = 0.390625
0.5 × (0 + 0.390625) = 0.1953125
0.5 × (0 + 0.1953125) = 0.09765625

So the end result is that the value 0.1 is returned as 0.09765625, which is close. Obviously, the more 1001 repetitions you convert, the closer it gets.


Fortran re-engineering: arithmetic if

The arithmetic-if, if you haven’t already encountered it, is one of the precursors of modern decision statements. It is essentially a goto in disguise: a three-way branch to labelled statements depending on whether the value of a real or integer expression is (i) less than zero, (ii) equal to zero, or (iii) greater than zero. Two of the labels may be the same, to construct a two-way branch. Here is an example:

      IF (x-y) 10, 20, 30
   10 CALL YBIG
      GO TO 100
   20 CALL EQUAL
      GO TO 100
   30 CALL XBIG
      GO TO 100

Here’s how it works:

  • If x–y is less than zero, the IF branches to label 10.
  • If x–y is equal to zero, the IF branches to label 20.
  • If x–y is greater than zero, the IF branches to label 30.

At the end of each segment, the program branches to label 100. The arithmetic-if is considered an obsolete structure, so it’s best to convert it to a block if and remove all traces of the labels. It was declared obsolescent as far back as Fortran 90, yet lingered in the standard until it was finally deleted in Fortran 2018. The arithmetic-if can easily be converted to a modern structure:

if (x-y < 0) then
   call Ybig
elseif (x-y == 0) then
   call Equal
else
   call Xbig
end if


Fortran re-engineering: loops

Loops in old Fortran were normally constructed using goto statements of one form or another. An early form of Fortran loop might look like this:

      DO 100 i = 1, 200
	 a(i) = real(i)
  100 b(i) = real(i)**2.0

The 100 is just a label, and the loop body includes the statement next to the label 100, which effectively creates a “goto” back to the next iteration of the loop. In Fortran 77 this changed slightly, to a do-continue:

      do 100 i = 1, 200
         a(i) = real(i)
         b(i) = real(i)**2.0
  100 continue

In Fortran 90 this became the loop we see today:

   do i = 1, 200
      a(i) = real(i)
      b(i) = real(i)**2.0
   end do

In reality such loops are some of the easiest constructs to re-engineer.


Floating-point precision in Fortran

Fortran is a numeric beast, which is exactly why it’s still popular amongst scientific programmers. Languages like C are still somewhat constrained when it comes to numerical precision – in C, a long double typically only gets you 80-bit extended precision, if that.

Fortran 90 and beyond provides a kind parameter which can be used to specify the precision of both reals and integers. For reals, the intrinsic function selected_real_kind() returns a kind number matching a requested decimal precision and exponent range. Here is an example:

integer, parameter :: dp = selected_real_kind(15, 307)
real (kind=dp) :: x, y

This produces a parameter, dp, used to define a double precision real with 15 significant digits and an exponent range of 307. Single and quad precision can also be defined in a similar way:

integer, parameter :: sp = selected_real_kind(6, 37)
integer, parameter :: qp = selected_real_kind(33, 4931)

In Fortran 2008, the named constants real32, real64, and real128 from the iso_fortran_env module can be used instead of selected_real_kind(). How does this work in practice? Consider the following piece of Fortran code to calculate the repeating decimal 1.0/7.0. The answer should be 0.142857 142857… Note that the floating-point literals have to be tagged with a specific kind (e.g. 1.0_dp), otherwise the outcome can be erroneous.

integer, parameter :: dp = selected_real_kind(15, 307)
real (kind=dp) :: d
d = 1.0_dp / 7.0_dp

What happens when we run it in all three precisions?

   SP = 0.14285714924335479736328125000000000
   DP = 0.14285714285714284921269268124888185
   QP = 0.14285714285714285714285714285714285

In each case, the leading digits are correct up to the precision of the kind; a few digits beyond that may happen to be correct, and the rest are noise. Now compare this to C’s long double, which is typically 80-bit extended precision. Here’s the C code:

long double d;
d = 1.0L / 7.0L;

Here’s the result:

0.1428571428571428571409210675491330277964


Is Fortran a weed, or an invasive species?

Alan Perlis once said that “Fortran is not a flower, but a weed. It is hardy, occasionally blooms, and grows in every garden”. There was a time when this statement was quite true, and while the use of Fortran has lessened over the years, it is still quite a relevant programming language. I don’t think Perlis called Fortran a weed in any derogatory manner; rather, he was touting its ability to spread quickly, as it did in the period from its inception to probably the mid 1980s. Fortran still provides a good vehicle for people to learn how to program, and excels at mathematical calculations.

The world of numeric errors

One of the greatest sources of logic errors is the use of floating-point numbers. More often than not, these errors occur during logical comparisons, because programmers forget that floating-point arithmetic is imprecise. Among the errors which can occur in floating-point computations are rounding errors (e.g. 2.0/3.0 yields 0.666667), absorption errors, overflow, underflow, cancellation errors and invalid operations.

Arithmetic using the floating-point number system has two important properties that differ from those of arithmetic using real numbers. First, floating-point arithmetic is not associative: the result can depend on how the operations are grouped. In general, for floating-point numbers x, y, and z:

(x + y) + z != x + (y + z)
(x * y) * z != x * (y * z)
(x * y) / z != x * (y / z)

Second, floating-point arithmetic is not distributive. This means that in general:

x * (y + z) != (x * y) + (x * z)

In short, the order in which operations are carried out can change the output of a floating point calculation. So consider the following example in Fortran:

program numericerror
   integer, parameter :: dp = selected_real_kind(15, 307)
   real (kind=dp) :: result1, result2, x, y, z

   x = 2.49334_dp
   y = 3.712358_dp
   z = 1.377416_dp
   result1 = (x * y) / z
   result2 = x * (y / z)

   print *, result1
   print *, result2

end program numericerror

When this program runs, one would assume that the value of result1 would be the same as result2. But this is not the case. Here are the two values:

6.7199529377617218
6.7199529377617226


Why 1/10 is not a precise real number

You may think that 1/10 is quite a precise number – I mean, it’s 0.1 isn’t it? Easy to convert, one-tenth, right? Wrong. Only fractions whose denominator is a power of 2 can be represented precisely in binary; all others are imprecise. This is understandable for numbers such as 1/3, because the answer is 0.3333333… recurring ad infinitum. The more 3’s you add to the end, the more accurate the approximation becomes, but it is still an approximation. It is less easy to see with 1/10, because the answer here is simply 0.1. The problem lies NOT with the numbers themselves, but in the conversion to binary. Yes, all numbers have to be converted to binary in the inner workings of the CPU, because that’s how they are processed.

Unfortunately, most decimal fractions cannot be represented exactly as binary fractions. A consequence is that, in general, the decimal floating-point numbers we enter are only approximated by the binary floating-point numbers actually stored in the machine. No matter how many base-2 digits you’re willing to use, the decimal value 0.1 cannot be represented exactly as a base-2 fraction. In base 2, 1/10 is the infinitely repeating fraction:

0.0001100110011001100110011001100110011001100110011…

You can see the repeating pattern of 1001. Stop at any finite number of bits, and you get an approximation. This is why we see things like:

0.10000000000000001 or 0.0999999999999998

The consequence of this is that summing ten values of 0.1 may not exactly yield 1.0. Consider the following example in Fortran (using double precision reals):

program one_tenth
implicit none

   integer, parameter :: dp = selected_real_kind(15, 307)
   real (kind=dp) :: frac, sum
   integer :: i

   frac = 1.0_dp / 10.0_dp
   print *, frac

   sum = 0.0_dp
   do i = 1,10
      sum = sum + frac
   end do

   print *, sum
end program one_tenth

Now here is the output produced:

0.10000000000000001
0.99999999999999989

Case in point.


Fortran re-engineering: data statements

The Fortran data statement is a means of initializing variables with values. For example, the statements below assign the value 3.7 to the variable x.

real :: x
data x/3.7/

It is a very handy method of initializing arrays:

integer, dimension(10) :: aray
data aray/1,2,3,4,5,6,7,8,9,10/

There is nothing inherently wrong with the data statement, but it can easily be replaced with an array initializing statement of the form:

aray = (/1,2,3,4,5,6,7,8,9,10/)

It is not inherently a legacy feature though; since Fortran 2003 the same initialization can also be written with brackets, as aray = [1,2,3,4,5,6,7,8,9,10].