The process of re-engineering code is not an exact science. The sequence of steps will be different for each program, and has more to do with the make-up of the code than with the algorithm itself. Is the code modular? Does the code contain a number of unstructured branches (i.e. goto statements)? The presence or absence of the latter may determine how challenging the re-engineering process will be. Is there a basic series of steps that could be used to re-engineer a program? The steps below are given in the context of re-engineering Fortran (from Fortran 77 or earlier), but could easily be adapted to other languages.
0. Analyze the program
The first thing to do, before even trying to compile the legacy program, is to analyze what it does. Is there documentation that came with it? Does the program have comments? A flow chart perhaps? If there is nothing but the program, you are going to have to do some detective work first. Walk through the code and see if you can create a flow chart. It helps to visualize what the program is doing, especially if it has a few unstructured jumps.
1. Try compiling the program
The assumption is that the program will run, but that may not always be the case. Compilers like gfortran are backwards compatible, to a point. Some programs were written for proprietary or rarely used compilers, and so may contain code idiosyncrasies. In other cases, the code may contain references to devices such as punched card readers, which have to be dealt with before the program will even compile. Once the program is running, test data can be derived. Rerunning this data after each change helps confirm that the re-engineered program still behaves like the original.
2. Convert to lowercase
The easiest thing to start with is converting the program from UPPERCASE to lowercase. This may not seem like a big deal, but it will make the program easier to re-engineer. Don’t change the indenting at this point: languages like Fortran tend to use labels, and if the program is re-indented, these could be harder to decipher. You will find easy ways to convert to lowercase in “Useful Snippets of Code”.
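As one possible approach (a minimal sketch, not the snippet referred to above), the conversion can be done with a small Fortran filter. The program name to_lower is hypothetical. It assumes ASCII source with lines of at most 132 characters, and it leaves quoted character literals untouched so that output text keeps its case. It does not recognise Hollerith constants, so any of those would need checking by hand.

```fortran
! Reads Fortran source on standard input and writes it to standard output
! with uppercase letters converted to lowercase. Only letters are changed,
! so the original indenting and label columns are preserved.
program to_lower
   implicit none
   character(len=132) :: line      ! assumed maximum line length
   character(len=1) :: quote       ! delimiter of the literal being skipped
   logical :: in_string
   integer :: ios, i, code

   do
      read (*, '(a)', iostat=ios) line
      if (ios /= 0) exit            ! end of file (or read error)

      in_string = .false.
      quote = ' '
      do i = 1, len_trim(line)
         if (in_string) then
            ! inside a literal: copy as-is until the closing delimiter
            if (line(i:i) == quote) in_string = .false.
         else if (line(i:i) == "'" .or. line(i:i) == '"') then
            in_string = .true.
            quote = line(i:i)
         else
            code = iachar(line(i:i))
            if (code >= iachar('A') .and. code <= iachar('Z')) then
               line(i:i) = achar(code + 32)   ! ASCII offset from 'A' to 'a'
            end if
         end if
      end do

      write (*, '(a)') trim(line)
   end do
end program to_lower
```

Once compiled, it could be run as a filter, for example `./to_lower < legacy.f > legacy_lower.f` (both file names are placeholders). A comment marker C in column 1 becomes c, which is still a valid comment marker in fixed-form source.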
3. Start with the easy things
The easiest things to change are the ones that will have the least impact. First, copy the original code to a new file with a .f95 extension. Compiling this file will fail on at least one thing, the comments, because compilers such as gfortran treat a .f95 file as free-form source. So in Fortran this means changing the comment delimiter from C (or * in column 1) to !. Any continuation lines, which fixed-form source marks with a character in column 6, also need to be rewritten with a trailing & on the line being continued. The program should now compile. Next, convert the variable specifications. In Fortran this means changing any implicit variables to be explicitly declared. For example, early versions of Fortran allowed variables beginning with I, J, K, L, M, or N to be implicitly declared as integers, with all other undeclared variables treated as reals. Aside from implicit variables, the specification syntax should also be modernized (for example, REAL X becomes real :: x). It is also a good idea to put the statement “implicit none” after the program header.
NB: Each time you make a core change to the program, make a new copy, so there is a progressive sequence of files associated with the re-engineering effort. If something goes eerily wrong (and it might), it is easy to go back to the copy that worked. Also compile the program after every change, and check that it still functions as it should.
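A minimal sketch of what this step might look like on a toy program. The program and its variable names are invented for illustration. The legacy version is kept as comments so the file compiles as free-form source.

```fortran
! Legacy fixed-form version, kept as comments for comparison:
!
! C     SUM THE FIRST N INTEGERS AND AVERAGE THEM
!       PROGRAM TOTAL
!       REAL AVG
!       N = 10
!       ISUM = N*(N+1)/2
!       AVG = FLOAT(ISUM)/FLOAT(N)
!       WRITE (*,*) ISUM, AVG
!       END

! sum the first n integers and average them
program total
   implicit none              ! switches off implicit typing
   integer :: n, isum         ! previously implicit integers (names start with i-n)
   real :: avg                ! previously "REAL AVG"

   n = 10
   isum = n*(n+1)/2
   avg = real(isum)/real(n)   ! real() is the generic form of FLOAT
   write (*,*) isum, avg
end program total
```

With implicit none in place, the compiler reports every variable that has not yet been declared, which is a convenient way to find the implicit ones.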
4. Restructure the simple ifs
The simple if statements should be dealt with next. This means converting them from their legacy format (such as the one-line logical if) to the if-then-end if structure. Don’t deal with any if statements that are complex at this stage, such as arithmetic ifs.
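As a small illustration (the variables are invented, and replacing .LT. and .GT. with < and > is an optional extra, not something this step requires), two one-line logical ifs could be restructured like this:

```fortran
! Legacy form, kept as comments for comparison:
!       IF (X .LT. 0.0) X = -X
!       IF (X .GT. XMAX) XMAX = X

program simple_ifs
   implicit none
   real :: x, xmax

   x = -4.5
   xmax = 1.0

   ! make x non-negative
   if (x < 0.0) then
      x = -x
   end if

   ! track the largest value seen so far
   if (x > xmax) then
      xmax = x
   end if

   write (*,*) x, xmax
end program simple_ifs
```

The block form is longer, but it gives each condition a clear body that later steps can extend without introducing labels.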
5. Restructure the loops
The do statements are easy to convert into loops. This will mean converting them from do-continue statements to do-end do structures. This may seem like a trivial modification, but it means some of the labels in the program will disappear, making it less confusing.
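A minimal sketch of the conversion, using an invented loop that sums squares. Note that the label 10 disappears entirely in the restructured version.

```fortran
! Legacy form, kept as comments for comparison:
!       ISUM = 0
!       DO 10 I = 1, N
!          ISUM = ISUM + I*I
!    10 CONTINUE

program sum_squares
   implicit none
   integer :: i, n, isum

   n = 5
   isum = 0
   do i = 1, n
      isum = isum + i*i
   end do
   write (*,*) isum
end program sum_squares
```

Before deleting a label, it is worth checking that no go to elsewhere in the program still jumps to it.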
6. Deal with the easy go to statements
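The easy go to statements are those that are *somewhat* structured. Re-engineer the arithmetic if and computed go to statements first. An arithmetic if jumps to one of three labels depending on whether an expression is negative, zero, or positive, and a computed go to picks a label from a list using an integer index, so both map naturally onto structured alternatives.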
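The sketch below shows one way these two constructs could be restructured: an arithmetic if becomes an if-else if chain, and a computed go to becomes a select case. The variables and values are invented for illustration.

```fortran
! Legacy forms, kept as comments for comparison.
!
! Arithmetic if: jump to 10, 20 or 30 when K is negative, zero or positive.
!       IF (K) 10, 20, 30
!    10 ISGN = -1
!       GO TO 40
!    20 ISGN = 0
!       GO TO 40
!    30 ISGN = 1
!    40 CONTINUE
!
! Computed go to: jump to the M-th label; if M is out of range,
! execution simply continues with the next statement.
!       GO TO (50, 60, 70), M
!       COST = 0.0
!       GO TO 80
!    50 COST = 1.5
!       GO TO 80
!    60 COST = 2.5
!       GO TO 80
!    70 COST = 4.0
!    80 CONTINUE

program easy_jumps
   implicit none
   integer :: k, m, isgn
   real :: cost

   k = -7
   m = 2

   ! replaces the arithmetic if
   if (k < 0) then
      isgn = -1
   else if (k == 0) then
      isgn = 0
   else
      isgn = 1
   end if

   ! replaces the computed go to; case default covers the fall-through
   select case (m)
   case (1)
      cost = 1.5
   case (2)
      cost = 2.5
   case (3)
      cost = 4.0
   case default
      cost = 0.0
   end select

   write (*,*) isgn, cost
end program easy_jumps
```

The case default branch matters: it preserves the legacy behaviour when the index falls outside the label list.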
7. Deal with the hard go to statements
There will be some go to statements that will be more challenging, and these will require you to think about how to restructure them. Print out the code and annotate it, i.e. draw lines from each jumping point to its label. Then ask yourself where the jumps go. Are there a bunch of jumps that go back to the start of the program? This screams of a global containment loop with some cycle statements in there (cycle is Fortran's equivalent of the continue statement found in other languages). Does the program need a function or two to make it work better, and to eliminate some of the unstructured jumps?
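As a hedged illustration of the containment-loop idea, the invented fragment below jumps back to label 10 from two places and out to label 99 from a third. In the restructured sketch, the backward jumps become the loop itself plus a cycle, and the forward jump becomes an exit. The data array stands in for whatever input the real program would read.

```fortran
! Legacy form, kept as comments for comparison:
!       ITOT = 0
!       I = 0
!    10 I = I + 1
!       N = IVAL(I)
!       IF (N .EQ. 0) GO TO 99
!       IF (N .LT. 0) GO TO 10
!       ITOT = ITOT + N
!       GO TO 10
!    99 WRITE (*,*) ITOT

program containment_loop
   implicit none
   ! illustrative data: a zero marks the end, negatives are skipped
   integer, parameter :: ival(6) = (/ 4, -2, 7, -1, 3, 0 /)
   integer :: i, n, itot

   itot = 0
   i = 0
   do                        ! the containment loop replaces label 10
      i = i + 1
      n = ival(i)
      if (n == 0) exit       ! was GO TO 99
      if (n < 0) cycle       ! was GO TO 10 (skip this value)
      itot = itot + n
   end do                    ! the trailing GO TO 10 is now implicit
   write (*,*) itot
end program containment_loop
```

Real programs rarely untangle this neatly, but annotating the jumps on paper first usually shows which ones are loop-backs (cycle), which are escapes (exit), and which need a different treatment.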
8. Clean up the program
Once any remaining legacy features have been removed, deal with the aesthetics of the code. Do portions of the code require more indenting to make them more readable? There will likely still be labels in the program associated with format statements, and so it may not be possible to left-justify the program without it looking weird. Are there some variables that need to be renamed? Could the code be modularized in some manner? Does it need more appropriate documentation?
Things NOT to do
- Don’t rush in and start re-engineering the program ad-hoc.
- Don’t skip the preliminary analysis of the program.
- Don’t rush to modularize the program until you are sure you know what is going on – and the unstructured jumps have been removed.