Recent outages at Southwest Airlines and Delta foreshadow a future of dealing with legacy systems. Southwest’s issues were blamed on a faulty router, and Delta’s on a (supposed) power outage, so neither was directly related to their legacy systems. Hardware fails; it is not infallible. My PVR fails routinely – we’re lucky to get three years out of one before it starts to malfunction and eventually dies. But it’s hard to blame Delta’s woes on a power outage alone, because one would question why the airline did not have a back-up generator that seamlessly takes over when power fails – or a redundant secondary site.
More likely, some form of human error caused the problem – it has happened before. In 1997, a crew member on the U.S.S. Yorktown entered a 0 into a database field, which caused a divide-by-zero in the ship’s Remote Data Base Manager, which resulted in a buffer overflow, which brought down the ship’s propulsion system: no power. Simple, yet effective. The question might be one of how fragmented the legacy systems are. How complex have they been allowed to grow? Keep adding new features to an existing system and over the years it tends to become somewhat brittle. Complex, ill-tested code is often brittle, meaning the software may appear reliable but will fail badly when presented with unusual data. Brittleness in software can be caused by algorithms that do not work for the full range of input data.
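A Yorktown-style failure is easy to sketch. The function below is a hypothetical illustration (not the ship’s actual code): a routine calculation divides by an operator-entered value, works fine for every “normal” input, and only fails when someone enters 0 – exactly the kind of algorithm that does not cover its full input range.

```python
def fuel_rate(distance_nm: float, speed_kts: float) -> float:
    # Brittle: correct for all "normal" speeds, but an operator
    # entering 0 raises ZeroDivisionError, which can then cascade
    # into whatever depends on this calculation.
    return distance_nm / speed_kts


def fuel_rate_safe(distance_nm: float, speed_kts: float) -> float:
    # Defensive version: reject out-of-range input explicitly,
    # at the boundary, instead of letting the error propagate.
    if speed_kts <= 0:
        raise ValueError("speed must be positive")
    return distance_nm / speed_kts
```

The difference is not the arithmetic but where the failure surfaces: the safe version turns a cascading runtime crash into a contained, explicit validation error.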
How to prevent brittle software? Better testing, including worst-case scenarios. Better design of additions to legacy systems. And if the system becomes too complex, maybe it’s time to start thinking about rebuilding it from the ground up.
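“Testing the worst case” can be sketched concretely. The function and test below are hypothetical examples: the point is that the test exercises the boundaries – a single element, cancellation, and the empty input a happy-path test would never try.

```python
def average(values: list[float]) -> float:
    # Guard the edge case a happy-path version would miss:
    # an empty list would otherwise divide by zero.
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


def test_average_edge_cases() -> None:
    assert average([5.0]) == 5.0        # single element
    assert average([-1.0, 1.0]) == 0.0  # values that cancel out
    try:
        average([])                     # worst case: empty input
        assert False, "expected ValueError"
    except ValueError:
        pass
```

A happy-path suite that only feeds in a few typical lists would pass against the unguarded version too; it is the empty-input case that distinguishes brittle code from robust code.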