Image binarization (7): What about local thresholding algorithms?

Binarization comes in two flavours: global and local. Global techniques, such as those based on the histogram, use information from the whole image to find a single threshold that produces the binary image. Local techniques determine a different threshold for every pixel, based on the characteristics of its surrounding area. Localized techniques are sometimes seen as a panacea for images with small details. But do they work? Sometimes they do, but not always. They usually rely on statistics from a local neighbourhood, and so suffer from a lack of global context, i.e. they don’t know that one pixel may be associated with another in an object or region. Consider the following example, which thresholds a piece of writing on a paper background suffering from some form of deterioration.

Here is its histogram:

Histogram of text image

Firstly, consider the global version of Otsu. Clearly the result is not optimal, because half the text has been obliterated by the background.

Global Otsu
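For reference, here is a minimal sketch of how a global Otsu threshold can be computed from the histogram alone. It is an illustration, not necessarily the implementation used to produce the result above. The function names are made up for this example, and it assumes an 8-bit greyscale image stored as a list of rows, with dark text on light paper.

```python
def otsu_threshold(image):
    """Return the grey level t (0-255) that maximises the between-class variance.

    `image` is a list of rows of integer grey levels in 0..255.
    Pixels <= t form the dark class, pixels > t the light class.
    """
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n_dark = 0      # pixels at or below the candidate threshold
    sum_dark = 0    # sum of their grey levels
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue          # no dark class yet
        if n_light == 0:
            break             # everything is already in the dark class
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = n_dark * n_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize_global(image, t):
    """Mark a pixel as text (1) if it is at or below the single threshold t."""
    return [[1 if p <= t else 0 for p in row] for row in image]


# Toy example (made-up values): dark strokes near 40 on light paper near 210.
page = [
    [210, 212, 40, 208],
    [209, 38, 42, 211],
    [213, 210, 41, 207],
]
t = otsu_threshold(page)
mask = binarize_global(page, t)
```

Because t is a single number for the whole page, any background region that is darker than t is turned into foreground along with the text.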

Now consider three localized thresholding algorithms: local Otsu, Niblack, and Sauvola. The first is a localized version of Otsu. It does segment the text, but it also segments every other artifact in the image, and in the region around the stain it is difficult to tell text from stain.

Localized Otsu

Niblack’s localized algorithm does not fare much better; in fact it produces even more non-text artifacts than localized Otsu.

Niblack
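Niblack's method sets the threshold for each pixel to T = m + k * s, where m and s are the mean and standard deviation of the grey levels in a window around that pixel. Below is a rough, unoptimised sketch of the idea. The function names, the window radius, and k = -0.2 (a commonly quoted value for dark text on a light background) are assumptions for illustration, not necessarily the settings behind the result above.

```python
from statistics import fmean, pstdev


def window_values(image, r, c, radius):
    """Collect the pixels in the square window centred on (r, c), clipped at the borders."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]


def niblack_binarize(image, radius=7, k=-0.2):
    """Mark a pixel as text (1) when it is darker than T = mean + k * std of its window."""
    out = []
    for r, row in enumerate(image):
        out_row = []
        for c, p in enumerate(row):
            win = window_values(image, r, c, radius)
            t = fmean(win) + k * pstdev(win)
            out_row.append(1 if p < t else 0)
        out.append(out_row)
    return out


# Made-up test patch: blank paper with faint noise (values 198 and 202), no text at all.
paper = [[202 if (r + c) % 2 else 198 for c in range(9)] for r in range(9)]
mask = niblack_binarize(paper, radius=1)
flagged = sum(map(sum, mask))  # number of pixels labelled as text
```

On this blank, slightly noisy patch, the pixels that are a little darker than their neighbours end up marked as text. When the local standard deviation is small, the threshold sits just below the local mean, so faint background texture is split into two classes, which is the kind of non-text artifact described above.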

Finally, Sauvola’s algorithm almost perfectly segments the text from the background, practically ignoring the stain and the discontinuities in the background of the paper.
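One way to see why Sauvola behaves differently is to compare its threshold formula with Niblack's. Sauvola uses T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation and R is the dynamic range of the standard deviation (128 for 8-bit images). The sketch below evaluates both formulas on hypothetical window statistics. The numbers are invented for illustration, and k = 0.5 for Sauvola and k = -0.2 for Niblack are commonly quoted values, not necessarily the ones used for the results above.

```python
def niblack_threshold(mean, std, k=-0.2):
    """Niblack: the threshold sits k standard deviations away from the local mean."""
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Sauvola: the threshold drops well below the mean where local contrast is low."""
    return mean * (1.0 + k * (std / R - 1.0))


# Hypothetical (mean, std) pairs for two kinds of window.
windows = {
    "blank paper, faint noise": (200.0, 2.0),
    "text strokes on paper": (140.0, 60.0),
}

for label, (m, s) in windows.items():
    print(f"{label}: Niblack T = {niblack_threshold(m, s):.1f}, "
          f"Sauvola T = {sauvola_threshold(m, s):.1f}")
```

For the low-contrast window, Niblack's threshold lands just under the local mean, so slightly darker background pixels fall below it and are labelled as text. Sauvola's threshold for the same window is roughly half the mean, so those pixels stay background, while in the high-contrast window the threshold rises towards the mean and the dark strokes are still picked up.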

So what are these algorithms used for? They have primarily been developed over the years for the task of text segmentation, and usually they do a magnificent job in that realm, although sometimes the image does need to be cleaned up, e.g. by applying background suppression.

 

 
