People often design good algorithms, but implement them very poorly. They concentrate on accuracy and forget the other characteristics that make an algorithm useful. Although correctness is extremely important, it is surprising how many published algorithms fail to state what they actually do and the circumstances under which they function.
In many respects, algorithms should be translatable into any language, and portable to any platform. Dependence on particular machines, hardware, or programming libraries makes an algorithm less universal. An algorithm should also be intelligible. If only the person who designed the algorithm actually understands it, it is much more challenging to make it universal: a lack of intelligibility makes it harder to port or translate, and often results in inefficient and inelegant code. Robustness is also important, because most users of an algorithm will treat it like a black box – they may understand how it functions, but not how it is implemented. Lastly there is efficiency. Making an algorithm run faster is certainly important, but not at the expense of any of the above criteria.
First there is correctness. In most programs this can be checked against the following basic criteria:
- Test all statements – Has every path in the program been tested using appropriate test data? What happens when something fails? Is there some level of redundancy?
- Initialization – Are all variables and aggregate structures assigned values before they are used?
- Overflow and underflow – Could the evaluation of an expression somewhere lead to overflow or underflow? Is the data type being used appropriate for the calculations being performed?
- Array subscripts – Are the correct array subscripts being used in an expression? Are the array subscripts within the bounds of the array?
- Extreme test cases – Has the algorithm been tested on extreme cases? Does a sorting algorithm still work on a list of 1,000 integers? On 50,000?
Universality is less of an issue nowadays; however, some algorithms, such as those written in Matlab or Python, rely on libraries that are not universally available. Trying to replicate such algorithms in other languages is both challenging and time-consuming. Intelligible algorithms are those with clarity: an implementation in any language should be easy to read and well documented. Compactness of code (here’s looking at you, C) is good, but only insofar as it can still be interpreted by the algorithm’s user. Some things are as simple as naming identifiers. Don’t use identifiers that can be confused with basic built-in functions, or identifiers such as I (capital eye), l (small ell), O (capital oh), or o (small oh), because they are far too easily confused with the digits 1 and 0.
Robustness is key to a good algorithm, and it is good practice to adopt some form of defensive programming, i.e. better to over-check than under-check. For example, the following Fortran implementation of a factorial works well, but contains no check on the validity of n.
integer (kind=int64) function factorial(n)
   implicit none
   integer, intent(in) :: n
   integer :: i
   integer (kind=int64) :: f
   f = 1
   do i = 2, n
      f = f * i
   end do
   factorial = f
end function factorial
The upper value of int64 is 9223372036854775807, so the largest factorial that can be calculated is 20!, which is 2432902008176640000. This means the function silently overflows for n greater than 20, and quietly returns 1 for negative n; it is only valid for 0 ≤ n ≤ 20.
Finally there is efficiency. This is less of an issue than it once was, but there are still things that can be done to make code run a little faster, or make it less susceptible to failure. Here are some things to consider:
- Recursion – Don’t use recursion if a simple loop will work. There are many instances where recursion provides the most efficient algorithm, e.g. Quicksort, but use it sparingly, and only if the resources are available. There is a reason why NASA’s software rules forbid the use of recursion.
- Repetition – Do not calculate the same value more than once.
- Multiplication – Multiplication is usually faster than exponentiation. These things still matter when performing millions of calculations.