Increased intelligence = increased complexity
Consider the design of an automatic fastener recognition system for retailers. The idea is that when a customer wishes to purchase a “stainless-steel masonry bolt”, the fastener is placed on a device which acquires an image and returns a series of search results from which the cashier can select. The image processing problem is simple (sometimes). Obtain an image, subtract from it a “template” image of the background without any objects on it, turn the result into a black-and-white image containing the background (white) and the object (black), and then extract features from the object such as length, width and circularity. At the same time the weight of the bolt is obtained using a built-in scale. When all the relevant information has been gathered, some form of learned algorithm is used to produce a list of the most probable items. Each sub-component is relatively simple in nature; complexity arises only when an attempt is made to integrate them.
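The pipeline described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: images are small grids of grey levels (0 to 255), the object mask is stored as True/False rather than black/white, and a nearest-neighbour ranking stands in for the unspecified "learned algorithm". The function names, the threshold value and the catalogue entries are all hypothetical.

```python
import math

def segment(image, background, threshold=30):
    """Subtract the empty-background template and threshold the difference.
    Returns a mask that is True wherever the object is."""
    return [
        [abs(p - b) > threshold for p, b in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]

def extract_features(mask):
    """Measure bounding-box length and width, area and circularity (in pixels)."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None  # nothing was placed on the background
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = len(coords)
    obj = set(coords)
    # Perimeter: count pixel edges that face a non-object neighbour.
    perimeter = sum(
        (r + dr, c + dc) not in obj
        for r, c in coords
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    return {
        "length": max(height, width),
        "width": min(height, width),
        "area": area,
        "circularity": 4 * math.pi * area / perimeter ** 2,
    }

def rank_candidates(features, weight_g, catalogue, top_n=3):
    """Rank catalogue items by squared relative difference from the measurements.
    A simple stand-in for whatever learned model a real system would use."""
    measured = dict(features, weight_g=weight_g)
    def distance(item):
        return sum(
            ((measured[k] - item[k]) / item[k]) ** 2
            for k in ("length", "width", "weight_g")
        )
    return sorted(catalogue, key=distance)[:top_n]

# Synthetic data: a uniform empty-background template and a dark 2 x 5 pixel "bolt".
background = [[200] * 8 for _ in range(6)]
image = [row[:] for row in background]
for r in (2, 3):
    for c in range(1, 6):
        image[r][c] = 40

# Hypothetical catalogue; lengths and widths are in pixels for simplicity.
catalogue = [
    {"name": "hypothetical bolt A", "length": 5, "width": 2, "weight_g": 12.0},
    {"name": "hypothetical bolt B", "length": 9, "width": 2, "weight_g": 20.0},
    {"name": "hypothetical washer", "length": 3, "width": 3, "weight_g": 2.0},
]

features = extract_features(segment(image, background))
ranked = rank_candidates(features, 11.5, catalogue)
print(features)
print([item["name"] for item in ranked])
```

Each function is only a few lines long, which matches the point made above: the individual steps are easy, and the harder questions (lighting, calibration from pixels to millimetres, how the scale and camera are synchronized) only appear once the parts are put together.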
Difficulties arise when there is a belief that such systems can be adapted to other tasks, such as identifying fruit and vegetables. The idea seems simple enough, but there is no guarantee that fruit and vegetables will always conform to a specific weight or shape category. It may be easy to distinguish an orange from a banana by shape or colour, but how does the system distinguish between different types of oranges, or organic from non-organic bananas? Now the system becomes inherently more complex, in part because colour information, and maybe even 3D shape information, must be incorporated into the algorithm. Even then, it may be impossible to derive a system which can distinguish between hot-house and field-grown tomatoes. That’s why fruit and vegetables come with little stickers that bear product codes or barcodes. Besides, barcodes can provide more information than just the type of fruit: for instance whether a piece of fruit is organic or genetically modified, and even where it was grown.
Decrease complexity by altering the data
Sometimes the failure of an algorithm to perform is an indication of limitations in the data. Automatic number plate recognition (ANPR) systems can scan one plate per second on cars traveling at up to 160 km/h. ANPR systems typically use optical character recognition to identify the characters on a number plate. Characters such as “P” and “R” are similar and may be misclassified. To resolve this, some countries such as the Netherlands have introduced small gaps in some letters to make them easier to identify. This is a good case for making the data easier to process rather than making the algorithm more complex. Handwriting recognition is another good example. Whilst early pen-based personal organizers struggled with handwriting recognition algorithms, the Palm X used a simple alphabet. Training a person is often easier than training the software to recognize literally millions of different styles of writing. Handwriting recognition for address identification has improved over the years, but how much of this is due to the increase in machine-printed addresses? Royal Mail processes about 50 million items per day, with approximately 70% being machine printed.
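To see why a small change to the data can matter more than a cleverer algorithm, consider a toy template matcher. The 5 x 7 glyphs below were drawn for this sketch and are not taken from any real number plate typeface; the "gapped" R is a hypothetical variant that merely mimics the idea of adding gaps. The classifier simply picks the template with the fewest differing pixels.

```python
# Illustrative glyphs only: "#" is ink, "." is background.
P = [
    "####.",
    "#...#",
    "#...#",
    "####.",
    "#....",
    "#....",
    "#....",
]
R = [
    "####.",
    "#...#",
    "#...#",
    "####.",
    "#.#..",
    "#..#.",
    "#...#",
]
# Hypothetical variant of R with two small gaps cut into the bowl.
R_GAPPED = [
    "#.##.",
    "#...#",
    "#...#",
    "##.#.",
    "#.#..",
    "#..#.",
    "#...#",
]

def hamming(a, b):
    """Count the pixels at which two equally sized glyphs differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def classify(glyph, templates):
    """Return the name of the template nearest to the glyph."""
    return min(templates, key=lambda name: hamming(glyph, templates[name]))

def erase(glyph, pixels):
    """Return a copy of the glyph with the given (row, col) pixels blanked."""
    rows = [list(row) for row in glyph]
    for r, c in pixels:
        rows[r][c] = "."
    return ["".join(row) for row in rows]

# Simulate a plate where part of the R's leg is not picked up by the camera.
damage = [(4, 2), (5, 3)]

print(hamming(P, R))         # 3
print(hamming(P, R_GAPPED))  # 5
print(classify(erase(R, damage), {"P": P, "R": R}))                # P (misread)
print(classify(erase(R_GAPPED, damage), {"P": P, "R": R_GAPPED}))  # R
```

The matching algorithm is identical in both cases. Only the glyph design changed, and that was enough to widen the margin between the two letters so that the same damage no longer flips the result.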
Complexity is based on need
Text segmentation is a good example of complexity being based on need. Accurate text segmentation is a precursor to optical character recognition (OCR), whereby written text is recognized. In the case of simple machine-printed text, segmentation is relatively easy if the context is dark text on a fairly uniform white background. If the paper is aged, yellowed, water-stained or contains foxing, the algorithm becomes more complex. Yet combinations of simple algorithms should be able to extract the text. By contrast, extracting characters from handwritten text is more challenging, partially because everyone’s handwriting is unique. In many cases it is easier for a human to try and decipher handwritten postal addresses than to design a more complex algorithm which has to be somehow tailored to individual writing styles.
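The difference between the easy and the harder case can be illustrated with two thresholding rules. The sketch below is one possible reading of "combinations of simple algorithms", not a method taken from any particular OCR system: a single global threshold, and a local-mean threshold that compares each pixel with its neighbourhood. The synthetic "page" gets darker from left to right to imitate a stain, and the parameter values (level, radius, offset) are arbitrary assumptions.

```python
def global_threshold(image, level=128):
    """Mark a pixel as text if it is darker than one fixed grey level."""
    return [[p < level for p in row] for row in image]

def local_mean_threshold(image, radius=2, offset=20):
    """Mark a pixel as text if it is clearly darker than its neighbourhood mean."""
    h, w = len(image), len(image[0])
    mask = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
            ]
            row.append(image[r][c] < sum(window) / len(window) - offset)
        mask.append(row)
    return mask

def count_text(mask):
    """Number of pixels marked as text."""
    return sum(v for row in mask for v in row)

# Synthetic stained page: background fades from 220 (left) to 110 (right).
page = [[220 - 10 * c for c in range(12)] for _ in range(5)]
# Two "ink" pixels, each 80 grey levels darker than the paper beneath them.
for r, c in ((2, 2), (2, 9)):
    page[r][c] -= 80

print(count_text(global_threshold(page)))      # 12: the ink plus the dark stain
print(count_text(local_mean_threshold(page)))  # 2: only the ink
```

On a clean white page the global rule would be sufficient. The extra work of the local rule is only justified once the background stops being uniform, which is exactly the sense in which complexity follows need.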
Simple design nearly always works. The volume of data being produced can limit the usefulness of some complex algorithms. We live in a world mired in complexity. Have you ever wondered why handwriting recognition and voice-based systems haven’t made a huge impact?