Coding Cobol: Replacing “go to” with a loop

In the previous example we saw how the Cobol program used a go to in order to run a paragraph again. This can be avoided by creating a loop with an exit condition. Here is the code as it stands.

identification division.
program-id. toc2.
data division.
working-storage section.
01 n    pic 9(3).
01 pcnt pic 9V999.

procedure division.
p1.
   display "Enter a value 1->100: "
   accept n.
   if n > 100
      go to p1
   else
      perform p2
   end-if.
   display "Percentage = ", pcnt.
   stop run.
p2.
   compute pcnt = n / 100.0.

Here is the re-envisioned code:

identification division.
program-id. toc2.
data division.
working-storage section.
01 n     pic 9(3).
01 pcnt  pic 9V999.
77 found pic 9.

procedure division.
p1.
   move 0 to found.
   perform until found = 1
      display "Enter a value 1->100: "
      accept n
      if n <= 100
         move 1 to found
      end-if
   end-perform.
   perform p2.
   display "Percentage = ", pcnt.
   stop run.
p2.
   compute pcnt = n / 100.0.

Here an inline perform statement is used to run a loop until the variable found obtains the value 1 (true). Easy, right? Well, wait a minute. An inline loop may not work so well with other constructs, such as reading from a file. Sometimes it may be easier to move the loop body into another paragraph and perform that paragraph until the condition is met. For example, change the above code (just the procedure division) to:

procedure division.
p1.
   move 0 to found.
   perform r1 until found = 1.
   perform p2.
   display "Percentage = ", pcnt.
   stop run.
r1.
   display "Enter a value 1->100: "
   accept n
   if n <= 100
      move 1 to found
   end-if.
p2.
   compute pcnt = n / 100.0.

 

Coding Cobol: Replacing “go to” with perform

The use of go to statements in Cobol is probably frowned upon just as much as it is in other languages. Getting rid of them can be a bit of a bugbear because Cobol doesn’t necessarily have statements equivalent to those in other languages. The best way to fix them is to avoid them, but this isn’t always an option when re-engineering a program. One way of replacing them, in certain circumstances, is through the use of a perform statement.

Consider the following Cobol program with a go to statement that basically continues with the “next iteration” of paragraph p1 if the value of n is greater than 100. Otherwise, it calculates the percentage as a value between 0 and 1 (by performing paragraph p2).

identification division.
program-id. toc2.
data division.
working-storage section.
01 n    pic 9(3).
01 pcnt pic 9V999.

procedure division.
p1.
   display "Enter a value 1->100: "
   accept n.
   if n > 100
      go to p1
   else
      perform p2
   end-if.
   display "Percentage = ", pcnt.
   stop run.
p2.
   compute pcnt = n / 100.0.

If we replace the go to p1 with perform p1, the program functions as it should. This is because of the structure of the program. The program will cycle until n is no greater than 100, then it will perform paragraph p2, return to p1, print the value, and stop. Because stop run ends the program at that point, control never returns to any of the earlier, unfinished runs of p1. There are however situations where this may not work. The replacement is safest when the go to is the last statement the paragraph would execute. Consider if the procedure division of the program were to look different:

procedure division.
   perform inputp.
   perform calc.
   perform printp.
   stop run.

inputp.
   display "Enter a value 1->100: "
   accept n.
   if n > 100
      go to inputp
      display "dead code"
   end-if.
printp.
   display "Percentage = ", pcnt.
calc.
   compute pcnt = n / 100.0.

This code will work properly, and the phrase “dead code” will never be printed. If we now replace the “go to inputp” with “perform inputp”, the program no longer behaves the same way. Here is what happens when an invalid number is entered first:

Enter a value 1->100:
345
Enter a value 1->100:
37
dead code
Percentage = 0.370

The program still produces the right percentage, but now the phrase “dead code” is printed. After the user enters 345, the nested perform passes control to a second run of paragraph inputp. Because the user then enters 37, which is an acceptable value, that second run “terminates”, and control passes back to where it was previously: the first run of inputp, which still has the display statement to run (which it does). Now the program moves on to paragraph calc, and then printp.

Nothing in Cobol is ever *that* easy.

Fortran re-engineering: subprograms

In old Fortran programs, subprograms had parameters, but these were never specified as incoming or outgoing. Here is an example subroutine which converts decimal hours to hours, minutes and seconds. An input of 4.45 would produce 4.45 hours = 4 h 26 m 60.00 s rather than 4 h 27 m 0.00 s (okay, so it isn’t perfect: 4.45 has no exact binary representation, so the minutes calculation truncates a value just under 27).

      subroutine convrt(dtime,hours,mins,secs)
      integer hours,mins
      real dsecs,dtime,secs

      dsecs = dtime*3600.0
      hours = int(dtime)
      secs = dsecs - 3600.0*hours
      mins = int(secs/60.0)
      secs = secs - 60.0*mins
      end

The parameters to the subroutine are merely declared by type. There is no indication as to whether they are input to the subroutine (in), output (out), or both (inout). In modern versions of Fortran, this is achieved using the intent attribute. It is valid not to use intent; it’s just not good programming practice. The modernized code would look like this:

subroutine convrt(dtime,hours,mins,secs)
   integer, intent(out) :: hours,mins
   real, intent(in) :: dtime
   real, intent(out) :: secs
   real :: dsecs

   dsecs = dtime*3600.0
   hours = int(dtime)
   secs = dsecs - 3600.0*hours
   mins = int(secs/60.0)
   secs = secs - 60.0*mins
end subroutine convrt

Note that the termination of the subroutine has been modified as well, from a bare end to end subroutine convrt. It is also possible to specify the parameters another (maybe less common) way, using separate intent statements:

subroutine convrt(dtime,hours,mins,secs)
   integer :: hours,mins
   real :: dsecs,dtime,secs
   intent(in) :: dtime
   intent(out) :: hours,mins,secs

   dsecs = dtime*3600.0
   hours = int(dtime)
   secs = dsecs - 3600.0*hours
   mins = int(secs/60.0)
   secs = secs - 60.0*mins
end subroutine convrt
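
To see the intent declarations at work, the subroutine can be placed in a module and called from a small driver. What follows is only a sketch: the module name timeconv, the program name hms and the variant convrt_rounded are illustrative assumptions, not part of the original code. The variant rounds the time to whole hundredths of a second before splitting it up, which is one possible way to avoid the 26 m 60.00 s result mentioned earlier.

```fortran
module timeconv
   implicit none
contains

   ! The modernized routine from the text, unchanged.
   subroutine convrt(dtime,hours,mins,secs)
      integer, intent(out) :: hours,mins
      real, intent(in) :: dtime
      real, intent(out) :: secs
      real :: dsecs

      dsecs = dtime*3600.0
      hours = int(dtime)
      secs = dsecs - 3600.0*hours
      mins = int(secs/60.0)
      secs = secs - 60.0*mins
   end subroutine convrt

   ! Hypothetical variant (not from the original): work in whole
   ! hundredths of a second, so secs can never come out as 60.00.
   subroutine convrt_rounded(dtime,hours,mins,secs)
      real, intent(in) :: dtime
      integer, intent(out) :: hours,mins
      real, intent(out) :: secs
      integer :: csecs

      csecs = nint(dtime*360000.0)       ! total hundredths of a second
      hours = csecs/360000               ! integer division truncates
      csecs = mod(csecs,360000)          ! hundredths left after the hours
      mins = csecs/6000
      secs = real(mod(csecs,6000))/100.0
   end subroutine convrt_rounded

end module timeconv

program hms
   use timeconv
   implicit none
   integer :: h,m
   real :: s

   call convrt(4.45,h,m,s)
   print '(a,i0,a,i0,a,f5.2,a)', 'convrt:         ',h,' h ',m,' m ',s,' s'

   call convrt_rounded(4.45,h,m,s)
   print '(a,i0,a,i0,a,f5.2,a)', 'convrt_rounded: ',h,' h ',m,' m ',s,' s'
end program hms
```

An attempt to assign to dtime inside either routine is rejected at compile time, and with the explicit interface a module provides, compilers can also flag a call that passes a constant where an intent(out) argument is expected. For the 4.45 input, the rounded variant should report 4 h 27 m 0.00 s.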

 

Fortran re-engineering: loops

Loops in old Fortran were normally constructed using statement labels and goto statements of one form or another. An early form of Fortran loop might look like this:

      DO 100 i = 1, 200
         a(i) = real(i)
  100 b(i) = real(i)**2.0

The 100 is just a label, and the loop body includes the statement carrying the label 100. Basically it helps create a “goto” back to the next iteration of the loop. In Fortran 77 code this was more commonly written as a do-continue:

      do 100 i = 1, 200
         a(i) = real(i)
         b(i) = real(i)**2.0
  100 continue

In Fortran 90 this became the loop we see today:

   do i = 1, 200
      a(i) = real(i)
      b(i) = real(i)**2.0
   end do

In reality such loops are some of the easiest constructs to re-engineer.
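
Loops built by hand from an if and a goto, rather than a labelled do, take a little more thought. As a rough sketch (the summation task, the labels and the variable names are illustrative assumptions, not taken from any particular legacy program), such a loop and its do/exit equivalent might look like this:

```fortran
program loops
   implicit none
   integer :: i, total_old, total_new

   ! Legacy style: a loop assembled from labels, an if and a goto.
   total_old = 0
   i = 1
10 if (i > 10) goto 20
   total_old = total_old + i
   i = i + 1
   goto 10
20 continue

   ! Re-engineered: the same logic using do / exit / end do.
   total_new = 0
   i = 1
   do
      if (i > 10) exit
      total_new = total_new + i
      i = i + 1
   end do

   ! Both versions sum the integers 1 to 10.
   print *, total_old, total_new
end program loops
```

Both totals come out as 55. Here the exit condition sits in exactly the place the original if occupied, which keeps the two versions easy to compare line by line.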

 

Re-engineering versus Refactoring

When dealing with legacy software, it is important to understand what can be done with the software. Legacy software often consists of software that has been left to run for a long time without too many inherent changes, the “don’t fix what isn’t broken” strategy. As compilers for languages such as Fortran are largely backwards compatible, it is often possible to compile and run these old programs. Yet at some point it becomes necessary to deal with the old code. So how is this achieved? Is the code to be re-engineered or refactored?

Re-engineering means making fundamental changes to the code. Here are three core methods of re-engineering:

  1. Porting – programs are modified to work on a new hardware platform.
  2. Translation – programs are translated from a legacy language to a contemporary one.
  3. Migration – programs are converted from a legacy language to a newer dialect.

In essence this is no different to the work that would be done to an old building. It might be moved in its entirety to a new location, it might be completely rebuilt, or it might be made new, incorporating only the facade of the original building.

Refactoring, on the other hand, leaves things more intact. Refactoring involves changing a piece of software in such a manner that the external behaviour of the code remains unchanged, but its internal structure and architecture are enhanced. This is akin to modernizing the plumbing and electrical system of an old building. It still functions and looks the same way, but the infrastructure has been improved. Refactoring takes control of decaying code, improving the readability and maintainability of existing code. It is done to fix short-cuts, eliminate duplication and dead code, make the design and logic clear, and make better and clearer use of the programming language. It does not necessarily imply that the code is migrated to a new dialect of the language. Refactoring is often a part of the life-cycle of software, and may not be targeted specifically at legacy code.

Re-engineering and refactoring look very similar, and there are likely areas, such as migration, where they overlap. In reality the process of dealing with legacy code often begins with refactoring, and progresses to re-engineering. In situations where the code base is too complex, it might be worthwhile trying to improve efficiency first by improving algorithms. If this doesn’t work, however, re-engineering might be in the cards.

Here’s an example of the possibilities when dealing with a legacy piece of code, say in Fortran IV. Refactoring may involve processes such as:

  1. eliminating equivalence statements: these specify that two or more variables or arrays in a program unit share the same memory.
  2. eliminating common blocks: these provided shared, common storage for Fortran programs prior to F90.
  3. removing dead code: code that is never executed.
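
As an illustration of the second item, a common replacement for a common block is a module. The sketch below is a made-up example (the names sim_params, sim, grow, rate and nsteps are all hypothetical); the legacy declarations appear only as comments for comparison.

```fortran
! Legacy style, repeated in every routine that needs the data:
!       real rate
!       integer nsteps
!       common /params/ rate, nsteps

! Refactored: the shared data is declared once, in a module.
module sim_params
   implicit none
   real    :: rate   = 0.05
   integer :: nsteps = 10
end module sim_params

module sim
   implicit none
contains
   ! Compound growth of x0 over nsteps steps at the shared rate.
   function grow(x0) result(x)
      use sim_params, only: rate, nsteps
      real, intent(in) :: x0
      real :: x
      integer :: i

      x = x0
      do i = 1, nsteps
         x = x*(1.0 + rate)
      end do
   end function grow
end module sim

program demo
   use sim_params, only: nsteps
   use sim, only: grow
   implicit none

   nsteps = 5              ! change the shared value in one place
   print *, grow(100.0)    ! grow sees the updated nsteps
end program demo
```

Unlike a common block, the module carries the names and types of the shared variables with it, so each routine that uses it cannot silently declare the storage differently.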

[Figure: refactoring versus re-engineering diagram]

Re-engineering, on the other hand, could involve a port to a new platform, a translation to C, or a migration to Fortran 95.