Image enhancement (1) : The basics

Image enhancement involves improving the quality of the image, either for the purpose of aesthetic appeal, or for further processing. Therefore you are either enhancing features within an image, or suppressing artifacts. The basic forms of enhancement include:

  • contrast enhancement : enhancing the overall contrast of an image, to improve dynamic range of intensities.
  • noise suppression : reducing the effect of noise contained within an image
  • sharpening : improving the acuity of features within an image.

These relate to both grayscale and colour images, and there are additional mechanisms for colour images to deal with enhancing colour. The trick with any of these enhancement mechanisms is determining when they have achieved the required effect. In image processing this is often a case of the rest being “in the eye of the beholder”. A photo who’s colour has been enriched may seem pleasing to one person, and saturated to another.

To illustrate, consider the following example. This image is an 8-bit image that is 539×699 pixels in size.


Original image

Here its associated histogram:


From both the image and histogram, it is possible to discern that the image lacks contrast, with the majority of gray intensities situated between 25 and 195. So one of the enhancements could be to improve its contrast. Here is the result of a simple histogram stretching:


Contrast enhancement through histogram stretching

It may then be interesting to smooth noise in the image or, sharpen the image to enhance the letters in the advertising. The sub-image extracted from the above shows three different techniques (click on it to get the full effect).


Forms of image enhancement (sub-image extracted from contrast enhanced image): original (top-left), noise suppression using a 3×3 mean filter (top-right), image sharpening using unsharp masking (bottom-left), and unsharp masking applied after mean filtering (bottom-right).



Read your own punch cards

Ever wonder how punch cards worked? Before the advent of programs that could just be edited on a machine, programmers had to deal with punch cards. Doesn’t seem like a big deal to us. But what was involved in using these punch cards? Punch cards meant that every line of code in a program had to have its own card. So if your program had 1000 LOC, then you had 1000 cards. Don’t trip now…


Now most punch cards, had the statement they represented printed at the top of the card. The example above shows the statement Z(1) = Y + W(1). This Fortran card has 80-columns, and shows clearly the 7 columns of the fixed formatting. Now interpreting the card (lets assume we don’t know what’s on it), requires using EBCD, a subset of EBCDIC (Extended Binary Coded Decimal Interchange Code).

Notice that there are nine rows numbers 1 to 9, a row 0, and two unmarked rows above that which are usually marked Y and below that X. Each character of the statement is encoded using 0-3 punches in the card, with 0 representing a space. For example, Z is encoded as a punch in 0 and 9. Using the table below, one can see that the intersection of 0-9 is Z. The value 1 has only one punch in the 1. The coding for +mis Y-6-8.


Of course there were many different types of cards as well. This allows for 64 characters to be used in a program. In todays perspective this isn’t a lot. Notice what’s missing? Lowercase characters. This is one of the reasons early programming were coded in uppercase, and not lowercase. It would of course have been possible to increase the size of the card to accommodate lowercase characters, or an extended character set, but that might have impacted the structural integrity of the card, and may have required making a thicker card.

Check out this video on how to make punch cards:

We don’t know how lucky we are.

Python’s Achilles heel

I have mentioned Python’s lack of efficient processing for image processing before, but there is a bigger issue, and I am reminded of it every time I leave Python code for a few months. If the code has dependencies, then when I come back to it, one of these dependencies always seems to break. And herein lies Python’s biggest Achilles heels – it dependencies on libraries. Look, there is nothing wrong with external libraries, but a language should not have so many issues. It starts with Python 2 vs. 3… maybe Python 3 should be called something else – Boa? Anaconda? Something else. There is a reason C++ wasn’t named C 2 (although a better name than C++ would have been great).

So this morning I tried to run some code I probably wrote a year ago. First issue? OpenCV wasn’t installed on my machine (obviously I ran the code on my old machine).

ImportError: No module named cv2

Okay, so the problems are compounded by the fact that I can choose from CV2 or CV3… but to install OpenCV I had to first update brew. Then install OpenCV (CV2):

brew update
brew install opencv

Okay, fine. Then it told me I had to make more changes if I wanted to actually import cv2:

mkdir -p /Users/username/Library/Python/2.7/lib/python/site-packages
echo 'import site; site.addsitedir("/usr/local/lib/python2.7/site-packages")' >> /Users/username/Library/Python/2.7/lib/python/site-packages/homebrew.pth

Then I tried to run my code, and here’s what I got:

RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9
ImportError: numpy.core.multiarray failed to import

Like, are you kidding me? So okay, I thought maybe I need to update numpy? Nope. Apparently there are two versions of numpy on the system (something to do with using pip vs. brew). So thanks to stackoverflow, I figured out I had to run:

sudo easy_install numpy

Now it runs. But I have to tell you , what a hassle! Ultimately maybe there should be an easier way of dealing with these dependencies. That’s probably why writing code in C, or Julia is nice. You can import libraries, but it is easy to do. That and most things that are critically needed are included in the language. Life would be better if numpy were just integrated into the language. At the end of the day, I don’t want to deal with these issues – and it impacts how portable code is. If I post code in Julia, all anyone has to do is have the compiler for Julia installed, and run the code. With Python,, they have to make sure all the dependencies are also loaded.

That’s one of the reasons Python is challenging for novice programmers to use.


P.S. To add insult to injury, now running some code that was working fine before… still is but now one of the other libraries is complaining…

/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/signal/ VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`.
 if rank(in1) == rank(in2) == 0: # scalar inputs

Re-engineering loops for efficiency (ii)

There are a couple more ways of improving loop efficiencies:

Loop unrolling

Every iteration of a loop requires modification and testing of the loop variable. This over- head can be reduced if the number of iterations of the loop is small, say less than five. This is called unrolling the loop. The previous example in loop fusion, can be unrolled, further reducing overhead:

int i, a[1000], b[1000];
for (i=0; i<1000; i=i+2)
    a[i] = 0;
    b[i] = 0;
    a[i+1] = 0;
    b[i+1] = 0; 

The most extreme case is to eliminate the loop completely and work as sequential line code. For example:

for (i=1; i<=3; i=i+1) {
    x = x + sqrt(i);

would arguably be better expressed as:

x = x + sqrt(1) + sqrt(2) + sqrt(3);

removing loop-independent expressions

Expressions in loops that perform a calculation independent of the loop should be evalu- ated outside the loop. This is sometimes known as loop streamlining:

for (i=1; i<=1000; i=i+1)
    sum = sum + pow(x,4.0);

should be replaced by

powX = pow(x,4.0);
for (i=1; i<=1000; i=i+1)
    sum = sum + powX;

We remove the expression pow(x,4.0) because its calculation is independent of the loop and would otherwise be calculated 1000 times. Consider the following nested loop:

for (i=1; i<=100; i=i+1)
    for (j=1; j<=100; j=j+1)
        for (k=1; k<=100; k=k+1)

The nested structure loops 100 × 100 × 100 = 1,000,000 times, meaning that a statement such as pow(x,4.0) would be executed 1 million times. For a nested loop struc- ture, the deeper a loop is nested, the higher the dividend with respect to efficiency. Always optimize inner loops first. The same principle can be applied to nested loops, by moving independent expres- sions from an inner loop outward. For example, some repeated calculations make use of the subscript as part of the calculation:

for (i=1; i<=100; i=i+1)
    for (j=1; j<=100; j=j+1)
        A[i][j] = B[i][j] + c/i;

The calculation c/i is a repeat calculation and should be removed from the inner loop. A better way to code this is

for (i=1; i<=100; i=i+1){
    ci = c / i;
    for (j=1; j<=100; j=j+1)
        A[i][j] = B[i][j] + ci;

Now the value c/i is calculated 100 times instead of 10,000 times.


What is a grayscale image?

If you are starting to learn about image processing then you will likely be dealing with grayscale or 8-bit images. This effectively means that they contain 2^8 or 256 different shades of gray, from 0 (black), to 255 (white). They are the simplest form of image to create image processing algorithms for. There are some image types that are more than 8-bit, e.g. 10-bit (1024 shades of grey), but in reality these are only used in specialist applications. Why? Doesn’t more shades of grey mean a better image? Not necessarily.

The main reason? Blame your human visual system. It is designed for colour, having three cone photoreceptors for conveying colour information that allows humans to perceive approximately 10 million unique colours. It has been suggested that from the perspective of grays, human eyes cannot perceptually see the difference between 32 and 256 graylevel intensities (there is only one photoreceptor with deals with black and white). So 256 levels of gray are really for the benefit of the machine, and although the machine would be just as happy processing 1024, it is likely not needed.

Here is an example. Consider the following photo of the London Blitz, WW2 (New Times Paris Bureau Collection).


This is a nice grayscale image, because it has a good distribution of intensity values from 0 to 255 (which is not always easy to find). Here is the histogram:


Now consider the image, reduced to 8, 16, 32, 64, and 128 intensity levels. Here is a montage of the results, shown in the form of a region extracted form he original image.


Not that there is very little perceivable difference, except at 8 intensity levels, where the image starts to become somewhat grainy. Now consider a companion of this enlarged region showing only 256 (left) versus 32 (right) intensity levels.


Can you see the difference? There is very little difference, especially when viewed in the over context of the complete image.

Many historic images look like they are grayscale, but in fact they are anything but. They may be slightly yellowish or brown in colour, either due to the photographic process, or due to aging of the photographic medium. There is no benefit to processing these type of photographs as colour images however, they should be converted to 8-bit.

Quote from “The Mind of Mechanical Man”

The early development of “electronic brains” prompted much debate on AI. In June 1949, Sir Geoffrey Jefferson (prof. of neurosurgery at Uni. Manchester), made the following statement in a speech entitled “The Mind of Mechanical Man”:

“Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain – that is, not only write it but know that it had written it. No machine could feel pleasure at its success, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or miserable when it cannot get what it wants.”

Re-engineering loops for efficiency (i)

When it comes to program efficiency, most times people tend to ignore it. I mean computers get faster all the time, so inefficiencies in a program will eventually dissipate, right? One of the more interesting area for efficiency is loops. Why? Because loops more often than not are used to process large pieces of data, like images. Sometimes it is the language being used that is inherently inefficient when it comes to loops – prime example? Python.

Loop Fusion

The best way of cutting down the amount of time spent in loops is to decrease the number of loops, hence decreasing overhead associated with incrementing indexes and testing. This is termed loop fusion, jamming, or merging. For example, consider the following two loops, each of which initializes an array:

int a[1000], b[1000], i, j;
for (i=0; i<1000; i=i+1)    
    a[i] = 0;
for (i=0; i<1000; i=i+1)
    b[i] = 0;

These loops involve 2000 index increases and tests. This can be reduced by half by merging the loops:

int a[1000], b[1000], i;

for (i=0; i<1000; i=i+1){
    a[i] = 0;
    b[i] = 0;


Loop peeling (or loop splitting) attempts to simplify a loop or eliminate dependencies by breaking a loop into multiple loops that have the same bodies but iterate over different contiguous portions of the index range. A useful special case is simplifying a loop with a problematic first (or first few) iteration by performing that iteration separately before entering the loop. Here is an example of loop peeling. Suppose the original code looks like this:

int vector[100], r[100], i;
for (i=0; i<1000; i=i+1){
    r[i] = vector[i-1] * vector[i];

The problem here is the fact that the loop can not safely handle the case of vector[i-1] when i = 0, for the index becomes -1, which is illegal. A better solution would be to peel the first iteration off:

int vector[100], r[100], i;
r[0] = vector[0];
for (i=1; i<1000; i=i+1){
    r[i] = vector[i-1] * vector[i];

An artistic filter (in Julia)

Here is a cool filter which applies a “jitter” to an image. The filter basically selects a new pixel at random from a 5×5 region around a pixel in the original image.


Before the filter is applied


After the filter is applied

Here’s the Julia code:

function jitter(img, amnt=10)

    dx,dy = size(img) 
    imgN = copy(img)
    for x=1:dx, y=1:dy
        nx = round(Int, x + (rand()-0.5) * amnt)
        ny = round(Int, y + (rand()-0.5) * amnt)
        if (nx < 1)
            nx = 1
        elseif (nx > dx)
            nx = dx
        if (ny < 1)
            ny = 1
        elseif (ny > dy)
            ny = dy
        pixel = img[nx,ny,:]
        imgN[x,y,:] = pixel
    return imgN

Simpler Japanese toilet user interfaces…

It appears that due to visitor confusion (and likely some of the populace as well), the Japanese toilet industry has decided to standardize the symbols used on their technologically savvy toilets.

Complexity made more challenging by a lack of standardized symbols… until now.

Here is are the standardized pictograms:

Want to learn more? Apparently Toto, one of Japan’s biggest toilet makers has just opened a $60M toilet museum in Japan.