Fortran Reengineering: arctan (iv)

The last piece of reengineering deals with the remaining two arithmetic-if statements, which include labels 12, 15 and 16. This piece of code is basically a loop, which tests the condition at the end of the loop (so like a do-while loop in C). The original code snipped is shown below (arctan3.f95). All the arithmetic if statements to is terminate the loop if (term-0.0005) is < 0, AND (-term-0.00005) is < 0. Any other combination jumps the program back up to label 12, effecting the next iteration of the loop.

   12 arct=arct+term
      presxp=prevxp+2.0
      term=-prevxp/presxp*y*term
      prevxp=presxp
      if(term-0.00005) 15,12,12
   15 if(-term-0.00005) 16,12,12
   16 return

So this can be reengineered by adding a generic do loop. The only changes made to the statement in the loop are of an aesthetic kind. The two arithmetic if statements are replaced with a single if statement that decides whether the loop should be exited.

   do
      arct = arct + term
      presxp = prevxp + 2.0
      term = -prevxp / presxp * y * term
      prevxp = presxp
      if (term-0.00005 < 0 .and. -term-0.00005 < 0) then
         exit
      end if
   end do

The program is now reengineered. The only thing remaining is to clean up the code, and add some I/O usability to the main part of the program. Basically add in a user prompt, and format the output, giving it units. The reengineered program (arctan4.f95) is shown below.

program testreeng
   implicit none
   real :: x, y
   write(*,*) "Value of arctan to calculate? "
   read(*,*) x
   y = arctanF(x)
   write(*,10) "Arctan(",x,") = ", y, " rad"
10 format(a7,f7.2,a4,f7.4,a4)
   stop

contains

function arctanF(x) result(arct)
   implicit none
   real, intent(in) :: x
   real :: arct
   real :: term, prevxp, presxp, y

   if (x < 0) then
      return
   end if
   arct = 0.0
   if (x-1.0 > 0) then
      term = -1.0/x
      arct = 1.57079
   else
      term = x
   end if
   prevxp = 1.0
   y = term**2.0
   do
      arct = arct + term
      presxp = prevxp + 2.0
      term = -prevxp / presxp * y * term
      prevxp = presxp
      if (term-0.00005 < 0 .and. -term-0.00005 < 0) then
         exit
      end if
   end do
end function arctanF

end program testreeng

Here is a sample run of the program:

 Value of arctan to calculate?
45
Arctan(  45.00) =  1.5486 rad

Fortran Reengineering: arctan (iii)

The major legacy problems associated with the program are four arithmetic-if statements, and a goto statement. So reengineering will involve methodically working through those. From the analysis we determined that the first thing that can be changed is the arithmetic-if associated with labels 2 and 3. The original code to be modified (arctan1.f95) looks like this:

      if(x) 2,3,3
    2 stop
    3 arct=0.0

This code basically says that if x is less than 0, then jump to label 2, otherwise if x is equal to, or greater than 0, then jump to label 3. All the jump does is effectuate an if-else statement. When x is less than 0, the program effectively terminates. So this code can be reimagined as:

   if (x < 0) then
      return
   end if
   arct = 0.0

So when x is less than zero, the program terminates, otherwise it continues, initializing the return value arct. Labels 2 and 3 can be removed. One down, three to go. Next the arithmetic-if associated with labels 5 and 10 can be tackled. The original code to be modified (arctan2.f95) looks like this:

      if(x-1.0) 10,10,5
    5 term=-1.0/x
      arct=1.57079
      go to 11
   10 term=x
   11 prevxp=1.0

Now the code says that if (x-1.0) is less than or equal to 0, jump to label 10, otherwise if greater than 0, jump to label 5. So. For the latter case, when (x-1.0) is greater than 0, it performs the next two statements, then goes to label 11 – all this goto statement does is jump past the portion of code that deals with the case when (x-1.0) is <= 0. The program flow ends up at label 11, regardless of the route taken. So the arithmetic-if can just be changed into an if then/else statement by changing things around. Here is what the re-engineered code looks like:

   if (x-1.0 > 0) then
      term = -1.0/x
      arct = 1.57079
   else
      term = x
   end if
   prevxp = 1.0

This has removed the arithmetic-if and the goto statement in one sweep. The function arctanF() is shown below (remaining legacy code in red). Note how indenting/spacing is modified as the re-engineering occurs.

function arctanF(x) result(arct)
   implicit none
   real, intent(in) :: x
   real :: arct

   real :: term, prevxp, presxp, y

   if (x < 0) then
      return
   end if
   arct = 0.0
   if (x-1.0 > 0) then
      term = -1.0/x
      arct = 1.57079
   else
      term = x
   end if
   prevxp = 1.0
   y = term**2.0

   12 arct=arct+term
      presxp=prevxp+2.0
      term=-prevxp/presxp*y*term
      prevxp=presxp
      if(term-0.00005) 15,12,12
   15 if(-term-0.00005) 16,12,12
   16 return

end function arctanF

Fortran Reengineering: arctan (ii)

The first thing normally done is to convert the code from UPPERCASE to lowercase, and save the code to be reengineered into a new file, arctan0.f95. Now we can perform some refactoring. This normally relates to simple things such as explicitly declaring variables, and adding “implicit none” to the program to avoid implicit variables. The formatting of the program can remain “as is”, and things like spacing and indents can be modified as code is re-engineered. In the first task, we add a contains clause, which sites the function within the program. We also modify the ends of both the main program and function arctanF() to F95 standards. Note that !<…> specifies the remaining code which is omitted for the sake of brevity.

program testreeng
   !<...>
contains

function arctanF(x)
   !<...>
end function arctanF

end program testreeng

Now we modify the function to include a result clause, and an intent specification for the parameter x. The intent(in) clause specifies that the variable x is used to read a real number into the function. The result(arct) clause defines arct as the variable used to hold the value returned from the function.

program testreeng
   !<...>
contains

function arctanF(x) result(arct)
   real, intent(in) :: x
   real :: arct
   !<...>
end function arctanF

end program testreeng

Now we can indicate that “implicit none” should be invoked in both main program and function.

program testreeng
   implicit none
   !<...>
contains

function arctanF(x) result(arct)
   implicit none
   real, intent(in) :: x
   real :: arct
   !<...>
end function arctanF

end program testreeng

Finally the variables in both the main program, and function are explicitly declared.

program testreeng
   implicit none
   real :: x, y
   !<...>
contains

function arctanF(x) result(arct)
   implicit none
   real, intent(in) :: x
   real :: arct
   real :: term, prevxp, presxp, y
   !<...>
end function arctanF

end program testreeng

Once these changes are made, the file can be compiled, and run using “fortran -Wall“. There should be no change to the output, but the warnings have now disappeared.

NOTE: Use a good process for tracking changes. The best way is to modify a version of the file, then when the changes in that file are working, copy it to a new file, and make the next series of changes to the new file. For example, arctan0.f95, arctan1.f95, etc.

Fortran Re-engineering: arctan (i)

As a simple reengineering problem, let’s consider a simple Fortran II program which approximates the arctan() function (there is a built-in function, but this is used to illustrate re-engineering). The two equations below represent the equations used by the program to calculate arctan(x). When 0≤x≤1 equation (1) is used, otherwise if x>1, equation (2) is used.

Below is the Fortran II code (converted to lowercase for readability). It contains a series of arithmetic if statements, and a go to which invariably leads to a certain level of spaghetti code.

      program testreeng
      read(*,*) x
      y=arctanF(x)
      write(*,*) y
      stop
      end

      function arctanF(x)
      if(x) 2,3,3
    2 stop
    3 arctanF=0.0
      if(x-1.0) 10,10,5
    5 term=-1.0/x
      arctanF=1.57079
      go to 11
   10 term=x
   11 prevxp=1.0
      y=term**2.0
   12 arctanF=arctanF+term
      presxp=prevxp+2.0
      term=-prevxp/presxp*y*term
      prevxp=presxp
      if(term-0.00005) 15,12,12
   15 if(-term-0.00005) 16,12,12
   16 return
      end

The first thing to consider is an analysis of the code. Always read the code, don’t be tempted to just jump into the coding, it will only result in problems somewhere down the line. Most of the re-engineering can actually be done at the analysis level. The best way to do this is print out the code, and annotate features that are legacy, and possible remediations, as well as program improvements. Below is an example of how the code above was analyzed.

Reengineering of arctan function written in Fortran II.
The analysis of the legacy code.

Once the program has been reviewed and analyzed, one can proceed with the process of re-engineering the code. If you have not done so already, I suggest reading “A Brief Guide to Fortran Re-engineering“. The code is stored in a file named arctan.for to mark it as a legacy Fortran program. This program will compile and run, it will just produce a series of warnings (always compile with -Wall), of the form:

    9 |       if(x) 2,3,3
      |                                                                        1
Warning: Fortran 2018 deleted feature: Arithmetic IF statement at (1)

Fortran compilers like GNU Fortran are very clear about warning the programmer about features that have been deprecated in new versions of the compiler, although compilers are backwards-compatible (within reason, because some old code contains specific references to hardware devices that may not exist anymore).

Coding Cobol: reading a file of integers

Cobol can be a bit trickier than other programming languages. Part of the issue stems from input from files. It is very stringent about how it deals with data within files. This is partially because input to Cobol program is normally in the form of financial data which has to be very explicit. Nobody wants a program to misinterpret $13987 as $139.87. The program below reads in a list of integers from a file, stores them in an array, and prints them out again. Here is what the input file data.txt looks like:

   56
45009
45001
45000
50619
 9997
   12
  345
 7856
 5768

The program reads in integers which are 5 digits in length. Notice how the input file prepends blanks to fill out integers less than 5 digits in length. This is a normal process for Cobol. Here is the first part of the program, which sets the file control for the input file data.txt, and associated a record with it (input-file). It also declares any variables needed, e.g. an array index, EOF tag, and an array to hold the numbers.

identification division.
program-id. readints.

environment division.
input-output section.
file-control.
select input-file assign to "data.txt"
   organization is line sequential.

data division.
file section.
fd input-file.
01 number-info.
   05 num     pic 9(5).

working-storage section.
77 i          pic 99 value 1.
77 n          pic 99.
77 feof       pic A(1).
01 arr.
   02 array1  pic 9(5) occurs 100 times.

Below is the functioning portion of the program. It consists of the “main” program, and four paragraphs.

  • The main portion of the program opens the file, calls the paragraph read-num, assigns the total number of elements read (n), calls print-nums, and finally closes the file and terminates the program.
  • The paragraph read-num reads a number from the file. If it is the end-of-file, it sets the EOF switch feof to “Y”. If it reads in a number, the paragraph store-element is invoked.
  • The paragraph store-element stores the current number in the array, array1, and increments the array index i.
  • The paragraph print-nums displays all the elements in the array.
procedure division.
   open input input-file.
   perform read-num until feof='Y'.
   compute n = i - 1.
   perform print-nums.
   close input-file.
   stop run.

read-num.
   read input-file
        at end move 'Y' to feof
        not at end perform store-element
   end-read.

store-element.
   move num to array1(i).
   compute i = i + 1.

print-nums.
   perform varying i from 1 by 1 until i > n
      display array1(i)
   end-perform.

There are numerous ways of reading files, here is another version of read-num which doesn’t use a flag.

read-num.
   perform forever
      read input-file
        at end exit perform
        not at end perform store-element
      end-read
   end-perform.

Why do some “technical” books cost so much?

I like books, and recently found some good computer science books I’d like to read. There is just one problem – price. This is an inherent flaw with a series of publishers, but Springer does stand out. Now I understand that publishing books isn’t free. In most technical book contracts, authors get 5-10% royalties, for writing the book (more I guess if they are famous), and the publishers foot the bill for actually putting the book together. So the book I was interested in is “Milestones in Analog and Digital Computing“, by Herbert Bruderer, a 2000 page epitome. Probably some interesting things to read in there. But US$389.00 for the e-Book, and US$499.00 for the hardcover is highway robbery. Literally.

Now people will say that it is expensive writing a book like this. Yeah, but this is the third edition, which is not the same as writing the original book. It was translated from the original in German, “Meilensteine der Rechentechnik“, which sells for US$168. I get translating books costs money too, but by the third edition, you aren’t translating the whole 2000 pages. Maybe I should have done better learning German all those years ago. The problem is who is going to part with US$389.00 for a book? Springer does this quite a lot. Maybe libraries pay their exorbitant prices. I would never buy this book, because there are plenty of other books I can buy for that sort of money.

It shouldn’t surprise me. When I published my book on C programming, “Confessions of a Coding Monkey: Learning the Subtle Art of Programming“, it sold for over C$100 in the bookstore. Students of course always thought I was getting rich on it. But my royalties were 10% on net receipts. That means the price the publisher sells the book to the bookstore. They sold it for something like C$60, so the royalty on one book was C$6. Not really that much. Then of course you have to pay tax on that amount, so on a textbook you buy for $100, the author is lucky if they net $4-5. The bookstore makes C$40+ for doing nothing but stacking a shelf.

Programming isn’t software development

The difference between programming and software development is like the difference between gardening and farming. Farming isn’t just gardening on a larger scale. The techniques using for gardening cannot be used for farming. Any farmer who tries to plant their crops using nothing more than a shovel, hoe, and wheelbarrow would not succeed. Farming needs more powerful tools – a farmer needs a tractor. Gardening isn’t just farming on a small scale. The methods used for farming cannot be used for gardening. A tractor can’t be used to plant a backyard vegetable garden. Yet even though there are differences between farming and gardening, there are fundamental principles that don’t change. Regardless of the size of the effort, plants still need to be provided with adequate nourishment, water, and sunlight. Soil preparation is the same regardless of the size of the plot.

Software development isn’t just programming done over a longer period of time. Different techniques are needed for “small programming” versus “large programming”. Some techniques appropriate for small programming projects aren’t adequate for large projects. Conversely, some of the techniques necessary for large projects are too awkward for small projects. You wouldn’t try and plant a 100 acre farm with wheat using just a hoe and a bag of seed. Nor would you try and plow a 100 square foot backyard garden with a tractor. Unfortunately the same intuition does not hold for programming. Some people have one technique which they use regardless of the size of the project.

When designing a series of basic software tools for image processing in say Julia, it is easy to code them on-the-fly. There is no great planning because the math underpinning them is simple, as is implementing them. For a larger project however, for example developing an app for performing some sort of photograph manipulation on an iPhone, programming-on-the-fly won’t work. Big programs require software development. The problem is often specifying the boundary between what is programming and what is software development. Is a 2,000 line program considered software development? Probably not these days.

Into the depths – infinite recursion

Most languages don’t place any restrictions on how many recursive calls are used, i.e. the depth of recursion. The exception to the rule is Python whose recursion limit is 1000, but it can be changed – needless to say that if you are using recursion, Python is not the best language anyways. In many respects the only thing limiting the use of recursion in languages such as C are limits placed on the system, e.g. the constraints of stack space.

Of course when designing a recursive algorithm, it is a good idea for the recursive function to contain some piece of code that does not depend on recursion, allowing the function to terminate at some point. For example, the following Pascal function contains a line of code that includes no recursive call (on the 3rd line of code). To compute 2.0^4 means calling the function as pow(2.0,4), which makes the following sequence of recursive calls: pow(2.0,3), pow(2.0,2), pow(2.0,1), and pow(2.0,0). The last call terminates at line 3 in the function.

function pow(x:real; n:integer): real;
begin
   if n=0 then pow:=1.0
   else if n>0 then pow:=x*pow(x,n-1)
   else pow:=1/pow(x,-n)
end;

If recursive function has no such terminating clause it is possible that it will go on forever (or at least until resources run out and it crashes). Such a function exhibits what is known as infinite recursion. If the terminating clause of the function pow() were omitted, then the function would become infinitely recursive.

function pow(x:real; n:integer): real;
begin
   if n>0 then pow:=x*pow(x,n-1)
   else pow:=1/pow(x,-n)
end;

Calculating pow(2.0,4), would result in the code eventually crashing. Once the value of n reaches 0, it will continue to invoke the else clause infinitum.

Computer Science Pioneers: Konrad Zuse

When we think of computing pioneers, we invariably think of people like Turing, Dijkstra, or Backus. It’s easy to forget the computing pioneers who weren’t as well known. One such unsung hero is Konrad Zuse (1910-1995). Konrad Zuse was born in Berlin on June 22, 1910. He is recognized for inventing the first fully-functional family of programmable computers (Z1, Z2, Z3 & Z4) which were based on binary logic and floating point arithmetic.

A young Konrad Zuse

Konrad Zuse studied civil engineering at the Berlin’s Technische Hochschule Charlottenburg (now the Technical University of Berlin), graduating in 1935. As civil engineers have to perform copious amounts of calculations, Zuse wondered if this could be done in some automated manner. As existing machines were ill-suited to the task, he decided to find a new way. He had no knowledge of calculating machines, only knowledge that punch-card machines existed. After graduating, Zuse started working as a stress analyzer for the airplane manufacturer Henschel Flugzeugwerke. He stayed a year before resigning in order to build automated calculating machines. Zuse started developing program controlled binary calculators in 1936 in his parents living room. An English translation [2] of Zuse’s first patent application from 1936 shows the scope of the work done by Zuse, even to the point of alluding to arrays and parallel processing.

… I decided to take the binary system because I had the impression that the electromagnetic relay is very well suited for computers [as a way to express] a binary digit.

Schultz, B, Elmauer, E., “Konrad Zuse – An Interview with the Inventor of the World’s First Fully Functional Programmable Digital Computer”, The History of Computing, (Computerworld) pp.35-42 (1981)

By 1938 he had completed the first fully operational mechanical binary computer – the Z1 [1]. The Z1 featured mechanical binary logic and flip-flop memory. It was constructed of thin metal sheets, and programmed by means of a punch tape and punch tape reader. The machine contained a control unit with supervised the execution of instructions, as well as the arithmetic unit, the memory, and the I/O devices. The Z1was capable of floating-point addition and subtraction, with control logic capable of multiplications (by repeated additions) and division (by repeated subtractions) Its original name was “V1” for VersuchsModell 1 (meaning Experimental Model 1), however the name was changed to “Z1” after WW2 to distinguish it from the V1 flying bombs used by the Germans. The original Z1 did not survive the bombing raids on Berlin, a replica was built in 1988-1989 under Zuse’s supervision and now resides at the Museum of Technology in Berlin.

The inner workings of the Z1.

In 1940 he completed work on the Z2, an integer processor built using relays. Zuse’s work was interrupted by the advent of the war, when he was drafted into the army. After six months he was released from military service and went to work for the German Aircraft Research Institute in Berlin. It was here he developed the S1 and S2 machines, likely the first digital computing machines used for factory process control. The S-1 was used to calculate the wing corrections (fluttering) needed for the HS-293 unmanned aircraft developed by the Henschel Aircraft Company. The S-1 would measure inaccuracies in inexpensive wings and rudders built for the HS-293.

Z2’s successor, the Z3 was developed from 1939-1941. It was a machine made entirely of relays, and the world’s first programmable digital computer. It had 1400 electromagnetic relays in memory, 600 to control arithmetic, and a further 600 for other purposes. It featured a binary floating-point numeric system with a word length of 22 bits (and a memory capacity of 64 words). It could perform a multiplication in 3-5 seconds. It was unfortunately destroyed in an air raid in 1944.

The Z3 (a reconstruction) – Deutsches Museum.

In 1942 he started work on the Z4. During the war his workshop was damaged numerous times by Allied bombing raids, forcing Zuse to move the Z4 around Berlin. With all his early machines destroyed, Zuse took the Z4 and left Berlin in March 1945, shortly before the Red Army closed in on Berlin, moving to the Bavarian village of Hinterstein. After the war, Zuse formed a computer company, Zuse KG (Zuse Ingenieurbüro und Apparatebau). It supplied its first computer to Prof. E. Stiefel of the Federal Technical Institute in Zurich in 1948. In 1951 the Z4 was leased to the Technical University of Zurich (ETH), where it was used until 1953. Leitz Optical Works bought the next version, the Z5.

Front view of the operating console of the Z4, with relays in the backgroundDeutsches Museum.

In 1945, Zuse also began work on a language which could be used to express computations. He named this language Plankalkül, which means “program calculus”. Although the work was done in 1945, it was not published until 1972.

It is hard to fathom the genius of Zuse. Unlike his contemporaries, Zuse worked mostly in isolation, with no government funding, no academic support, no real market. He was a risk taker, leaving a job to pursue the dream of creating a computing machine. He then hand-cut 30,000 intricate metal strips and assembled them into a functioning device. Zuse was in essence just someone whose desire was to develop a machine to eliminate the drudgery of copious calculations. How many people today would have the same bravado?

REFS
[1] Rojas R., “Konrad Zuse’s Legacy- The Architecture of the Z1 and Z3”, IEEE Annals of the History of Computing, 19(2), pp.5-16 (1997).
[2] Randell B. (ed.). The Origins of Digital Computers, Springer-Verlag, (1973).
[3] Schultz, B., “It Began in Berlin… In his Parents’ Living Room”, The History of Computing (Computerworld) pp.42-44 (1981)
[4] Konrad Zuse – the first relay computer, History of Computers.

Coding Fortran: nested functions

Many programming languages offer the ability to nest functions, e.g. Pascal, Ada, and even Fortran (even GNU C supposedly extends C to allow for nested functions). There are distinct benefits to the process, including the ability to encapsulate functionality, and passing fewer parameters. Fortran, (Fortran-90), supports a single level of nested subroutines and functions, using the contains clause. Here is an example of a Fortran 95 program containing a subroutine with two nested functions (the code exists sequentially in one file, it is just broken up here for clarity. The program uses Heron’s formula to calculate the area of a triangle. First there is the code for the main program. It reads in three values which represent the three sides (lengths) of a triangle. Then it calls the subroutine heron().

program nested
   implicit none
   real :: a, b, c

   write(*,*) 'Three sides of the triangle: '
   read(*,*) a, b, c
   write(*,*) 'Input sides are ', a, b, c
   call heron(a,b,c)

end program nested

Next is the subroutine heron(). It contains two nested functions: triangleTest() and area(). The function triangleTest() determines if the three values entered by the user actually constitute a valid triangle, and if they do then the function area() can be called to perform the area computation.

subroutine heron(a,b,c)
  implicit none
  real, intent(in) :: a, b, c
  real :: areaTriangle

   if (triangleTest()) then
      areaTriangle = area()
      write(*,*) 'Triangle area is ', areaTriangle
   else
      write(*,*) 'Not a triangle'
   end if

contains

logical function triangleTest()
   logical :: t1, t2
   t1 = ((a .gt. 0.0) .AND. (b .gt. 0.0) .AND. (c .gt. 0.0))
   t2 = ((a+b .gt. c) .AND. (a+c .gt. b) .AND. (b+c .gt. a))
   triangleTest = t1 .AND. t2
end function triangleTest

real function area()
   real :: s
   s = (a + b + c) / 2.0
   area = sqrt(s*(s-a)*(s-b)*(s-c))
end function area

end subroutine heron

Not the the variables a, b, and c do not have to be passed to the nested functions, as they are in the scope of heron(), so have access to the variables.