The last piece of reengineering deals with the remaining two arithmetic-if statements, which include labels 12, 15 and 16. This piece of code is basically a loop, which tests the condition at the end of the loop (so like a do-while loop in C). The original code snipped is shown below (arctan3.f95). All the arithmetic if statements to is terminate the loop if (term-0.0005) is < 0, AND (-term-0.00005) is < 0. Any other combination jumps the program back up to label 12, effecting the next iteration of the loop.
So this can be reengineered by adding a generic do loop. The only changes made to the statement in the loop are of an aesthetic kind. The two arithmetic if statements are replaced with a single if statement that decides whether the loop should be exited.
do
arct = arct + term
presxp = prevxp + 2.0
term = -prevxp / presxp * y * term
prevxp = presxp
if (term-0.00005 < 0 .and. -term-0.00005 < 0) then
exit
end if
end do
The program is now reengineered. The only thing remaining is to clean up the code, and add some I/O usability to the main part of the program. Basically add in a user prompt, and format the output, giving it units. The reengineered program (arctan4.f95) is shown below.
program testreeng
implicit none
real :: x, y
write(*,*) "Value of arctan to calculate? "
read(*,*) x
y = arctanF(x)
write(*,10) "Arctan(",x,") = ", y, " rad"
10 format(a7,f7.2,a4,f7.4,a4)
stop
contains
function arctanF(x) result(arct)
implicit none
real, intent(in) :: x
real :: arct
real :: term, prevxp, presxp, y
if (x < 0) then
return
end if
arct = 0.0
if (x-1.0 > 0) then
term = -1.0/x
arct = 1.57079
else
term = x
end if
prevxp = 1.0
y = term**2.0
do
arct = arct + term
presxp = prevxp + 2.0
term = -prevxp / presxp * y * term
prevxp = presxp
if (term-0.00005 < 0 .and. -term-0.00005 < 0) then
exit
end if
end do
end function arctanF
end program testreeng
Here is a sample run of the program:
Value of arctan to calculate?
45
Arctan( 45.00) = 1.5486 rad
The major legacy problems associated with the program are four arithmetic-if statements, and a goto statement. So reengineering will involve methodically working through those. From the analysis we determined that the first thing that can be changed is the arithmetic-if associated with labels 2 and 3. The original code to be modified (arctan1.f95) looks like this:
if(x) 2,3,3
2 stop
3 arct=0.0
This code basically says that if x is less than 0, then jump to label 2, otherwise if x is equal to, or greater than 0, then jump to label 3. All the jump does is effectuate an if-else statement. When x is less than 0, the program effectively terminates. So this code can be reimagined as:
if (x < 0) then
return
end if
arct = 0.0
So when x is less than zero, the program terminates, otherwise it continues, initializing the return value arct. Labels 2 and 3 can be removed. One down, three to go. Next the arithmetic-if associated with labels 5 and 10 can be tackled. The original code to be modified (arctan2.f95) looks like this:
if(x-1.0) 10,10,5
5 term=-1.0/x
arct=1.57079
go to 11
10 term=x
11 prevxp=1.0
Now the code says that if (x-1.0) is less than or equal to 0, jump to label 10, otherwise if greater than 0, jump to label 5. So. For the latter case, when (x-1.0) is greater than 0, it performs the next two statements, then goes to label 11 – all this goto statement does is jump past the portion of code that deals with the case when (x-1.0) is <= 0. The program flow ends up at label 11, regardless of the route taken. So the arithmetic-if can just be changed into an if then/else statement by changing things around. Here is what the re-engineered code looks like:
if (x-1.0 > 0) then
term = -1.0/x
arct = 1.57079
else
term = x
end if
prevxp = 1.0
This has removed the arithmetic-ifand the goto statement in one sweep. The function arctanF() is shown below (remaining legacy code in red). Note how indenting/spacing is modified as the re-engineering occurs.
function arctanF(x) result(arct)
implicit none
real, intent(in) :: x
real :: arct
real :: term, prevxp, presxp, y
if (x < 0) then
return
end if
arct = 0.0
if (x-1.0 > 0) then
term = -1.0/x
arct = 1.57079
else
term = x
end if
prevxp = 1.0
y = term**2.0
12 arct=arct+term
presxp=prevxp+2.0
term=-prevxp/presxp*y*term
prevxp=presxp
if(term-0.00005) 15,12,12
15 if(-term-0.00005) 16,12,12
16 return
end function arctanF
The first thing normally done is to convert the code from UPPERCASE to lowercase, and save the code to be reengineered into a new file, arctan0.f95. Now we can perform some refactoring. This normally relates to simple things such as explicitly declaring variables, and adding “implicit none” to the program to avoid implicit variables. The formatting of the program can remain “as is”, and things like spacing and indents can be modified as code is re-engineered. In the first task, we add a contains clause, which sites the function within the program. We also modify the ends of both the main program and function arctanF() to F95 standards. Note that !<…> specifies the remaining code which is omitted for the sake of brevity.
program testreeng
!<...>
contains
function arctanF(x)
!<...>
end function arctanF
end program testreeng
Now we modify the function to include a result clause, and an intent specification for the parameter x. The intent(in) clause specifies that the variable x is used to read a real number into the function. The result(arct) clause defines arct as the variable used to hold the value returned from the function.
program testreeng
!<...>
contains
function arctanF(x) result(arct)
real, intent(in) :: x
real :: arct
!<...>
end function arctanF
end program testreeng
Now we can indicate that “implicit none” should be invoked in both main program and function.
program testreeng
implicit none
!<...>
contains
function arctanF(x) result(arct)
implicit none
real, intent(in) :: x
real :: arct
!<...>
end function arctanF
end program testreeng
Finally the variables in both the main program, and function are explicitly declared.
program testreeng
implicit none
real :: x, y
!<...>
contains
function arctanF(x) result(arct)
implicit none
real, intent(in) :: x
real :: arct
real :: term, prevxp, presxp, y
!<...>
end function arctanF
end program testreeng
Once these changes are made, the file can be compiled, and run using “fortran -Wall“. There should be no change to the output, but the warnings have now disappeared.
NOTE: Use a good process for tracking changes. The best way is to modify a version of the file, then when the changes in that file are working, copy it to a new file, and make the next series of changes to the new file. For example, arctan0.f95, arctan1.f95, etc.
As a simple reengineering problem, let’s consider a simple Fortran II program which approximates the arctan() function (there is a built-in function, but this is used to illustrate re-engineering). The two equations below represent the equations used by the program to calculate arctan(x). When 0≤x≤1 equation (1) is used, otherwise if x>1, equation (2) is used.
Below is the Fortran II code (converted to lowercase for readability). It contains a series of arithmetic if statements, and a go to which invariably leads to a certain level of spaghetti code.
program testreeng
read(*,*) x
y=arctanF(x)
write(*,*) y
stop
end
function arctanF(x)
if(x) 2,3,3
2 stop
3 arctanF=0.0
if(x-1.0) 10,10,5
5 term=-1.0/x
arctanF=1.57079
go to 11
10 term=x
11 prevxp=1.0
y=term**2.0
12 arctanF=arctanF+term
presxp=prevxp+2.0
term=-prevxp/presxp*y*term
prevxp=presxp
if(term-0.00005) 15,12,12
15 if(-term-0.00005) 16,12,12
16 return
end
The first thing to consider is an analysis of the code. Always read the code, don’t be tempted to just jump into the coding, it will only result in problems somewhere down the line. Most of the re-engineering can actually be done at the analysis level. The best way to do this is print out the code, and annotate features that are legacy, and possible remediations, as well as program improvements. Below is an example of how the code above was analyzed.
The analysis of the legacy code.
Once the program has been reviewed and analyzed, one can proceed with the process of re-engineering the code. If you have not done so already, I suggest reading “A Brief Guide to Fortran Re-engineering“. The code is stored in a file named arctan.for to mark it as a legacy Fortran program. This program will compile and run, it will just produce a series of warnings (always compile with -Wall), of the form:
9 | if(x) 2,3,3
| 1
Warning: Fortran 2018 deleted feature: Arithmetic IF statement at (1)
Fortran compilers like GNU Fortran are very clear about warning the programmer about features that have been deprecated in new versions of the compiler, although compilers are backwards-compatible (within reason, because some old code contains specific references to hardware devices that may not exist anymore).
Cobol can be a bit trickier than other programming languages. Part of the issue stems from input from files. It is very stringent about how it deals with data within files. This is partially because input to Cobol program is normally in the form of financial data which has to be very explicit. Nobody wants a program to misinterpret $13987 as $139.87. The program below reads in a list of integers from a file, stores them in an array, and prints them out again. Here is what the input file data.txt looks like:
56
45009
45001
45000
50619
9997
12
345
7856
5768
The program reads in integers which are 5 digits in length. Notice how the input file prepends blanks to fill out integers less than 5 digits in length. This is a normal process for Cobol. Here is the first part of the program, which sets the file control for the input file data.txt, and associated a record with it (input-file). It also declares any variables needed, e.g. an array index, EOF tag, and an array to hold the numbers.
identification division.
program-id. readints.
environment division.
input-output section.
file-control.
select input-file assign to "data.txt"
organization is line sequential.
data division.
file section.
fd input-file.
01 number-info.
05 num pic 9(5).
working-storage section.
77 i pic 99 value 1.
77 n pic 99.
77 feof pic A(1).
01 arr.
02 array1 pic 9(5) occurs 100 times.
Below is the functioning portion of the program. It consists of the “main” program, and four paragraphs.
The main portion of the program opens the file, calls the paragraph read-num, assigns the total number of elements read (n), calls print-nums, and finally closes the file and terminates the program.
The paragraph read-num reads a number from the file. If it is the end-of-file, it sets the EOF switch feof to “Y”. If it reads in a number, the paragraph store-element is invoked.
The paragraph store-element stores the current number in the array, array1, and increments the array index i.
The paragraph print-nums displays all the elements in the array.
procedure division.
openinput input-file.
perform read-num until feof='Y'.
compute n = i - 1.
perform print-nums.
close input-file.
stoprun.
read-num.
read input-file
atend move 'Y' to feof
notatend perform store-element
end-read.
store-element.
move num to array1(i).
compute i = i + 1.
print-nums.
performvarying i from 1 by 1 until i > n
display array1(i)
end-perform.
There are numerous ways of reading files, here is another version of read-num which doesn’t use a flag.
I like books, and recently found some good computer science books I’d like to read. There is just one problem – price. This is an inherent flaw with a series of publishers, but Springer does stand out. Now I understand that publishing books isn’t free. In most technical book contracts, authors get 5-10% royalties, for writing the book (more I guess if they are famous), and the publishers foot the bill for actually putting the book together. So the book I was interested in is “Milestones in Analog and Digital Computing“, by Herbert Bruderer, a 2000 page epitome. Probably some interesting things to read in there. But US$389.00 for the e-Book, and US$499.00 for the hardcover is highway robbery. Literally.
Now people will say that it is expensive writing a book like this. Yeah, but this is the third edition, which is not the same as writing the original book. It was translated from the original in German, “Meilensteine der Rechentechnik“, which sells for US$168. I get translating books costs money too, but by the third edition, you aren’t translating the whole 2000 pages. Maybe I should have done better learning German all those years ago. The problem is who is going to part with US$389.00 for a book? Springer does this quite a lot. Maybe libraries pay their exorbitant prices. I would never buy this book, because there are plenty of other books I can buy for that sort of money.
It shouldn’t surprise me. When I published my book on C programming, “Confessions of a Coding Monkey: Learning the Subtle Art of Programming“, it sold for over C$100 in the bookstore. Students of course always thought I was getting rich on it. But my royalties were 10% on net receipts. That means the price the publisher sells the book to the bookstore. They sold it for something like C$60, so the royalty on one book was C$6. Not really that much. Then of course you have to pay tax on that amount, so on a textbook you buy for $100, the author is lucky if they net $4-5. The bookstore makes C$40+ for doing nothing but stacking a shelf.
The difference between programming and software development is like the difference between gardening and farming. Farming isn’t just gardening on a larger scale. The techniques using for gardening cannot be used for farming. Any farmer who tries to plant their crops using nothing more than a shovel, hoe, and wheelbarrow would not succeed. Farming needs more powerful tools – a farmer needs a tractor. Gardening isn’t just farming on a small scale. The methods used for farming cannot be used for gardening. A tractor can’t be used to plant a backyard vegetable garden. Yet even though there are differences between farming and gardening, there are fundamental principles that don’t change. Regardless of the size of the effort, plants still need to be provided with adequate nourishment, water, and sunlight. Soil preparation is the same regardless of the size of the plot.
Software development isn’t just programming done over a longer period of time. Different techniques are needed for “small programming” versus “large programming”. Some techniques appropriate for small programming projects aren’t adequate for large projects. Conversely, some of the techniques necessary for large projects are too awkward for small projects. You wouldn’t try and plant a 100 acre farm with wheat using just a hoe and a bag of seed. Nor would you try and plow a 100 square foot backyard garden with a tractor. Unfortunately the same intuition does not hold for programming. Some people have one technique which they use regardless of the size of the project.
When designing a series of basic software tools for image processing in say Julia, it is easy to code them on-the-fly. There is no great planning because the math underpinning them is simple, as is implementing them. For a larger project however, for example developing an app for performing some sort of photograph manipulation on an iPhone, programming-on-the-fly won’t work. Big programs require software development. The problem is often specifying the boundary between what is programming and what is software development. Is a 2,000 line program considered software development? Probably not these days.
Most languages don’t place any restrictions on how many recursive calls are used, i.e. the depth of recursion. The exception to the rule is Python whose recursion limit is 1000, but it can be changed – needless to say that if you are using recursion, Python is not the best language anyways. In many respects the only thing limiting the use of recursion in languages such as C are limits placed on the system, e.g. the constraints of stack space.
Of course when designing a recursive algorithm, it is a good idea for the recursive function to contain some piece of code that does not depend on recursion, allowing the function to terminate at some point. For example, the following Pascal function contains a line of code that includes no recursive call (on the 3rd line of code). To compute 2.0^4 means calling the function as pow(2.0,4), which makes the following sequence of recursive calls: pow(2.0,3), pow(2.0,2), pow(2.0,1), and pow(2.0,0). The last call terminates at line 3 in the function.
function pow(x:real; n:integer): real;
begin
if n=0 then pow:=1.0
else if n>0 then pow:=x*pow(x,n-1)
else pow:=1/pow(x,-n)
end;
If recursive function has no such terminating clause it is possible that it will go on forever (or at least until resources run out and it crashes). Such a function exhibits what is known as infiniterecursion. If the terminating clause of the function pow() were omitted, then the function would become infinitely recursive.
function pow(x:real; n:integer): real;
begin
if n>0 then pow:=x*pow(x,n-1)
else pow:=1/pow(x,-n)
end;
Calculating pow(2.0,4), would result in the code eventually crashing. Once the value of n reaches 0, it will continue to invoke the else clause infinitum.
When we think of computing pioneers, we invariably think of people like Turing, Dijkstra, or Backus. It’s easy to forget the computing pioneers who weren’t as well known. One such unsung hero is Konrad Zuse (1910-1995). Konrad Zuse was born in Berlin on June 22, 1910. He is recognized for inventing the first fully-functional family of programmable computers (Z1, Z2, Z3 & Z4) which were based on binary logic and floating point arithmetic.
A young Konrad Zuse
Konrad Zuse studied civil engineering at the Berlin’s Technische Hochschule Charlottenburg (now the Technical University of Berlin), graduating in 1935. As civil engineers have to perform copious amounts of calculations, Zuse wondered if this could be done in some automated manner. As existing machines were ill-suited to the task, he decided to find a new way. He had no knowledge of calculating machines, only knowledge that punch-card machines existed. After graduating, Zuse started working as a stress analyzer for the airplane manufacturer Henschel Flugzeugwerke. He stayed a year before resigning in order to build automated calculating machines. Zuse started developing program controlled binary calculators in 1936 in his parents living room. An English translation [2] of Zuse’s first patent application from 1936 shows the scope of the work done by Zuse, even to the point of alluding to arrays and parallel processing.
… I decided to take the binary system because I had the impression that the electromagnetic relay is very well suited for computers [as a way to express] a binary digit.
Schultz, B, Elmauer, E., “Konrad Zuse – An Interview with the Inventor of the World’s First Fully Functional Programmable Digital Computer”, The History of Computing, (Computerworld) pp.35-42 (1981)
By 1938 he had completed the first fully operational mechanical binary computer – the Z1 [1]. The Z1 featured mechanical binary logic and flip-flop memory. It was constructed of thin metal sheets, and programmed by means of a punch tape and punch tape reader. The machine contained a control unit with supervised the execution of instructions, as well as the arithmetic unit, the memory, and the I/O devices. The Z1was capable of floating-point addition and subtraction, with control logic capable of multiplications (by repeated additions) and division (by repeated subtractions) Its original name was “V1” for VersuchsModell 1 (meaning Experimental Model 1), however the name was changed to “Z1” after WW2 to distinguish it from the V1 flying bombs used by the Germans. The original Z1 did not survive the bombing raids on Berlin, a replica was built in 1988-1989 under Zuse’s supervision and now resides at the Museum of Technology in Berlin.
The inner workings of the Z1.
In 1940 he completed work on the Z2, an integer processor built using relays. Zuse’s work was interrupted by the advent of the war, when he was drafted into the army. After six months he was released from military service and went to work for the German Aircraft Research Institute in Berlin. It was here he developed the S1 and S2 machines, likely the first digital computing machines used for factory process control. The S-1 was used to calculate the wing corrections (fluttering) needed for the HS-293 unmanned aircraft developed by the Henschel Aircraft Company. The S-1 would measure inaccuracies in inexpensive wings and rudders built for the HS-293.
Z2’s successor, the Z3 was developed from 1939-1941. It was a machine made entirely of relays, and the world’s first programmable digital computer. It had 1400 electromagnetic relays in memory, 600 to control arithmetic, and a further 600 for other purposes. It featured a binary floating-point numeric system with a word length of 22 bits (and a memory capacity of 64 words). It could perform a multiplication in 3-5 seconds. It was unfortunately destroyed in an air raid in 1944.
In 1942 he started work on the Z4. During the war his workshop was damaged numerous times by Allied bombing raids, forcing Zuse to move the Z4 around Berlin. With all his early machines destroyed, Zuse took the Z4 and left Berlin in March 1945, shortly before the Red Army closed in on Berlin, moving to the Bavarian village of Hinterstein. After the war, Zuse formed a computer company, Zuse KG (Zuse Ingenieurbüro und Apparatebau). It supplied its first computer to Prof. E. Stiefel of the Federal Technical Institute in Zurich in 1948. In 1951 the Z4 was leased to the Technical University of Zurich (ETH), where it was used until 1953. Leitz Optical Works bought the next version, the Z5.
Front view of the operating console of the Z4, with relays in the background – Deutsches Museum.
In 1945, Zuse also began work on a language which could be used to express computations. He named this language Plankalkül, which means “program calculus”. Although the work was done in 1945, it was not published until 1972.
It is hard to fathom the genius of Zuse. Unlike his contemporaries, Zuse worked mostly in isolation, with no government funding, no academic support, no real market. He was a risk taker, leaving a job to pursue the dream of creating a computing machine. He then hand-cut 30,000 intricate metal strips and assembled them into a functioning device. Zuse was in essence just someone whose desire was to develop a machine to eliminate the drudgery of copious calculations. How many people today would have the same bravado?
REFS [1] Rojas R., “Konrad Zuse’s Legacy- The Architecture of the Z1 and Z3”, IEEE Annals of the History of Computing, 19(2), pp.5-16 (1997). [2] Randell B. (ed.). The Origins of Digital Computers, Springer-Verlag, (1973). [3] Schultz, B., “It Began in Berlin… In his Parents’ Living Room”, The History of Computing (Computerworld) pp.42-44 (1981) [4] Konrad Zuse – the first relay computer, History of Computers.
Many programming languages offer the ability to nest functions, e.g. Pascal, Ada, and even Fortran (even GNU C supposedly extends C to allow for nested functions). There are distinct benefits to the process, including the ability to encapsulate functionality, and passing fewer parameters. Fortran, (Fortran-90), supports a single level of nested subroutines and functions, using the contains clause. Here is an example of a Fortran 95 program containing a subroutine with two nested functions (the code exists sequentially in one file, it is just broken up here for clarity. The program uses Heron’s formula to calculate the area of a triangle. First there is the code for the main program. It reads in three values which represent the three sides (lengths) of a triangle. Then it calls the subroutine heron().
program nested
implicit none
real :: a, b, c
write(*,*) 'Three sides of the triangle: '
read(*,*) a, b, c
write(*,*) 'Input sides are ', a, b, c
call heron(a,b,c)
end program nested
Next is the subroutine heron(). It contains two nested functions: triangleTest() and area(). The function triangleTest() determines if the three values entered by the user actually constitute a valid triangle, and if they do then the function area() can be called to perform the area computation.
subroutine heron(a,b,c)
implicit none
real, intent(in) :: a, b, c
real :: areaTriangle
if (triangleTest()) then
areaTriangle = area()
write(*,*) 'Triangle area is ', areaTriangle
else
write(*,*) 'Not a triangle'
end if
contains
logical function triangleTest()
logical :: t1, t2
t1 = ((a .gt. 0.0) .AND. (b .gt. 0.0) .AND. (c .gt. 0.0))
t2 = ((a+b .gt. c) .AND. (a+c .gt. b) .AND. (b+c .gt. a))
triangleTest = t1 .AND. t2
end function triangleTest
real function area()
real :: s
s = (a + b + c) / 2.0
area = sqrt(s*(s-a)*(s-b)*(s-c))
end function area
end subroutine heron
Not the the variables a, b, and c do not have to be passed to the nested functions, as they are in the scope of heron(), so have access to the variables.