Image binarization (3) : What should be binarized?

In the end, not every image can be segmented using thresholding. The best images for thresholding are those where there is a good separation between the objects of interest. There are many images which cannot be properly segmented by any means (the presence of colour in images does help, but is not a panacea for segmentation). Good examples of things that lead to reasonable thresholding results include such things as text images, B&W drawings, and images where objects can be easily differentiated. Below are some examples.

thresholdinggood

Choosing a global or localized thresholding algorithm really depends in the content you wish to segment, and whether or not it is compromised by other things in the image, e.g. non-uniform background, overlapping objects etc. There is no perfect thresholding algorithm, and some images are not optimal for turning into a binary image, or even an image with four sections. It would be like trying to segment the image below.

 

Fortran re-engineering example (4)

Now we come to the crux of the re-engineering problem: goto

This is in reality a small program with uncomplicated unstructured constructs. But everyone has to start somewhere. The first four goto statements in reality form a type of “goto” based if-else statement. These can be converted. This means the code below can become much more readable and actually produce less code.

 5 if (totdis.gt.250) goto 10
   tower = 1
   dist = totdis
   goto 20

10 if (totdis.gt.750) goto 15
   tower = 2
   dist = iabs(totdis - 500)
   goto 20

15 tower = 3
   dist = 1000 - totdis

20 if (dist.le.30) vel = 2.425+0.00175*dist*dist

You can see how this works in the marked-up code shown earlier. This is re-engineered into the following structure:

 5 if (totdis.le.250) then
       tower = 1
       dist = totdis
   else if (totdis.le.750) then
       tower = 2
       dist = iabs(totdis - 500)
   else 
       tower = 3
       dist = 1000 - totdis 
   end if 

This means that the conditional statements have to change, because the original code jumped to another point in the program if the conditional is true, and essentially carried out the code after the if, only if the conditional was false. So the first conditional must change from (totdis.gt.250) to (totdis.le.250), and the second conditional changes from (totdis.gt.750) to (totdis.le.750). The code at label 15 becomes the else portion of the if-else structure.

Not forgetting that the newer Fortran if statement uses the then clause, and is terminated with end if. Now that the goto statements have been removed, the associated labels 10, 15 and 20 can also be deleted. Here’s the code at this point:

cablecar3

 

 

 

 

 

Fortran re-engineering example (3)

The next step involves fixing the small stuff, to get the program to actually compile. Most old Fortran programs can be compiled as is due to the backwards compatibility of Fortran compilers. However, convert the extension from .for to .f90 or .f95, and something will break. The first thing to break will be comments. All the “C” delimiters have to be changed to “!“. Then it might compile. Sometimes there are other finicky things that have to be fixed as well, but unstructured code will compile… usually. The “line continuation” operator will also have to change, from + to &.

Next, simple stuff like compressing the first format statement from 3 lines to 2, adding some whitespaces in places (e.g. if( → if ( ), and form single goto statements from “go to“. It is important to have a sequence of code files when re-engineering. Make some changes, document the changes in the file, and compile the program to make sure nothing breaks. Make a copy, and make new changes in that file. DO NOT make all the changes in the same file. That’s a CRAZY idea.

Finally, the variable declarations are modified to use :: .

Now here is what the program looks like:

cablecar2

Fortran re-engineering example (2)

The next step in the re-engineering process involves reading the code. People often forget to read the code. I can see the eyes rolling now… “that may work for small programs, but what if my code is 3000 lines long?”. Yes, I agree, it’s not easy to read a piece of code that long… but what other way is there to understand what is happening? Documentation? – yes that helps, *if* it exists. Maybe it doesn’t. Anyways, let’s look at the code (on paper).

cablecarmarkup

Notice from this analysis that there are two main pieces of unstructured code. The first relates to four goto statements and essentially constructs an if-else if-else statement. In the first if statement, if totdis is greater than 250, then it jumps to the code at label 10. If it is less-than-or-equal-to 250, then if performs the code directly below the if statement, then jumps to label 20, which represents the end of the conditional. Similarly with the next if statement. So by converting this piece of code, four goto statements disappear.

The remaining goto is associated with a global loop, encompassing more of the working code. It’s easy to convert this to a loop. Of course there are a myriad of smaller things, but this analysis has provided a basis for the next step.

Image binarization (2) : Art or science?

Image binarization is intrinsically linked to the image histogram, that 1D representation of the distribution of greys within an image. There are many forms of histogram, as outlined in a previous post. The trick is turning a grayscale image into a binary image without loosing key information, and to that end binarization is somewhat of an art. Why you ask? Shouldn’t it be a science? If you are processing images that are all very much alike, some experimentation will provide a means of thresholding the images so you get very similar results in all cases. For example, consider a factory that makes building toys that uses an image verification system to check that parts are accurate. All parts are photographed using consistent lighting, and there is sufficient contrast between objects and background to make the task easy and reproducible. Consider the image of the gear below (with suboptimal lighting as it is just an illustration). The task may be to determine that none of the gear’s teeth are defective.

gear

The histogram of this image looks like this:

gearHst

Which represents a bimodal distribution: one mode represents the gear, the other the background region in the image. This is an ideal scenario. Now a threshold can be selected somewhere in the “valley” between the two peaks. Naturally this image is not completely perfect, as a machine vision system would have optimal lighting which would illuminate the object in a uniform manner. If we binaries the image using Otsu’s algorithm, we get a threshold value of 108.

gearOtsu108

The algorithm produces a binary image which shows the teeth quite nicely, however the inner portion of the gear has one side, and two of the holes missing. This is not such a problem is we only care about the gears teeth. Would another algorithm do a better job?

Maybe.

Here are the results from applying 16 different binarization algorithms (courtesy of ImageJ):

gearThmontage

As you can see from the results, none of them is perfect. Many extract the teeth, but not the inner holes, others extract the inner holes at the expense of some of the teeth. It’s not a perfect system.

Now imagine trying to create a generic thresholding algorithm? Difficult? Yes, highly. So binarization becomes an art because you often have to use a creative approach to finding the appropriate algorithm to perform the thresholding – and in some cases it might be too complex an image for any algorithm to produce results.

 

 

Why some algorithms *seem* to work

It is interesting to read about new algorithms in image processing. As mentioned in the previous post, there are a multitude of contrast enhancement algorithms out there. Go on… read one or to of the papers associated with the algorithms. Then *try* to reproduce the algorithm. In some cases it will be easy. In others? Well, let’s just say that with the information given, the algorithm cannot be reproduced. Worse still? The author has taken the liberty of testing the algorithm on images which show dramatic results, or has failed to compare the results against a simple algorithm. Here’s one of the images used to test the algorithm:

fish

Fish image

Pretty blah right? Thats because the histogram is skewed completely to the left.

fishhist

Histogram of fish

Now many of the algorithms the authors use to compare their algorithm against fail on this image.  Not surprising… it has one peak. Histogram equalization? It works quite nicely.

fishhe

The published algorithm also works quite well… but then again, so does stretching the histogram.

fishesihe

This is one of the reasons it’s hard to get excited about the field of image processing sometimes. It has become overwhelmed with lack-luster algorithms.

 

Image processing fun with FFT! (iii)

Finally a scanned image of a persons face containing an artifact of the paper that the photo was printed on. These are not uncommon when scanning old photographs.

face2descreen

Image pre-restoration

Enter the FFT. The images below show the power spectrum of the image above before and after the frequency components responsible for the periodic noise have been filtered.

face2descreenegs

FFT power spectrum and filtered power spectrum

Now taking the inverse of the FFT results in an image with most of the periodic noise suppressed.

facecleaned

Image post-restoration

 

 

 

Image processing fun with FFT! (ii)

So how do we get rid of periodic noise? In the same manner as the filtering examples shown in the first post. The trick is identifying where the frequency is to be removed. Consider this image which shows a very prevalent periodic noise throughout the image. Any form of normal spatial filter would have a hard time removing this artifact.

aerialpompeiiperiodic

Enter the FFT. The images below show the power spectrum of the image above before and after the frequency components responsible for the periodic noise have been filtered.

aerialpompeiifftegs

Now taking the inverse of the FFT results in an image with the periodic noise gone.

aerialpompeiiclean

FFT helps isolate regular patterns in an image and remove them. Like magic!

 

 

 

 

Image processing fun with FFT! (i)

Okay, so I wouldn’t really classify Fast Fourier Transforms (FFT) as fun, but in their day, they were heavily used by people doing image processing. They still are to some extent, and there are some things they really excel at. One of those is removing periodic noise – the type of noise no spatial filter would come close to being able to remove.

The basics? A transform takes one signal and transforms it into another. The Discrete Fourier Transform (DFT) is one such function. Problem is it’s *really slow. In 1969, a 2048 point analysis of a seismic trace took 13 ½ hours. The FFT is a very efficient algorithm for performing a DFT – the same seismic trace took 2.4 seconds with the FFT. In image processing it is most often used for image restoration purposes. The FFT basically converts a signal from space domain to frequency domain.

Consider this image and its power spectrum derived using ImageJ.

tank

Grayscale image

tankfft

FFT Power spectrum (ImageJ)

Now in ImageJ, if we cut a portion of the power spectrum away, and leave it set to black, then we effectively filtering those frequencies.  If we cut a portion, but invert it to white, we are passing those frequencies. Consider the examples below, which illustrate these two scenarios applied to the power spectrum image above.

tankfft2

Filtering out versus passing frequencies

The results are shown below (after applying the inverse FFT). Using the “filter” option filters out the low frequency components of the image, leaving only the high frequency components. Using the passing filter, passes *only* the low frequency components, effectively removing all the edge structures, and leaving a blurry mess.

tankfft1

High frequency filter (top) versus low frequency passing (bottom)

Next post I’ll show you how to remove periodic noise.

 

 

 

Image binarization (1b) : It’s all about the data

Like many things in image processing, the quality of the output from image binarization is directly dependent on the quality of the data in the original image. Sometimes we believe an image will be easy to segment, but that our belief is clouded by our eye’s ability to extract objects, and we forget that algorithms see things in a different light. The human visual system gets a complete picture of whatever it is looking at, and therefore can interpret in in context. A computer gets a bunch of pixels.

Here are three images all containing different content. The first one is an aerial image with a unimodal histogram, the second shows an indistinct histogram biased towards the high intensity values. The third is bimodal, but the image itself does not contain two distinct features. All three images would be challenging to binarize.

thresholds1

Sometimes even though the data seems like it will process well, the subtleties are where the problems lie. Some images cannot be effectively turned into binary images – in fact some images cannot be effectively segmented in any manner.The next three images shows an image which is indistinct multimodal, a uni-modal image with mini-peaks, and a unimodal peak which spans the entire range of intensity values. These three images are also considered challenging to binarize.

thresholds2

There are also images that *seem* like they could be thresholded, however the data within the image complicates matters. Both of the first images have bimodal histograms, but what would be segmented?

thresholds3

With many image binarization algorithms, the threshold is calculated based on some global statistic, so although the histogram shows a bimodal distribution, this is only a representation of how the intensity values are distributed in the image, not where they are distributed. 100 different images could have the same distribution, but radically different thresholding outcomes. They don’t take into account the spatial distribution of the pixels.