Why image segmentation is difficult (1)

One of the most non-trivial tasks in image processing is segmentation. Segmentation is the process of partitioning an image in such a manner that different objects can be extracted from it. In its simplest form, segmentation is a thresholding problem: I have an image with an object and a background, and they are distinct enough that I can separate them. But not all images come in this cookie-cutter form. In fact the majority of them don’t.
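To make the cookie-cutter case concrete, here is a minimal sketch of thresholding. It uses plain Python lists so it runs without any libraries; the tiny 4x4 image, the function name `threshold` and the cut-off of 128 are all made up for illustration.

```python
def threshold(image, t):
    """Return a binary mask: 1 where a pixel is brighter than t, else 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


# A hypothetical 4x4 greyscale image: a bright object on a dark background.
image = [
    [10, 12, 11, 9],
    [13, 200, 210, 10],
    [12, 205, 198, 11],
    [9, 10, 12, 13],
]

# 128 is an arbitrary cut-off that happens to sit between the two groups.
mask = threshold(image, 128)
for row in mask:
    print(row)
```

This only works because the object and the background occupy clearly separate intensity ranges. Once they overlap, no single value of `t` separates them.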

Why is segmentation important? Well, it is the first step in trying to automatically determine what is in an image. But it isn’t an easy task, and there is no segmentation algorithm out there that is effective on all images. But why?

The main issue may be that we are hindered by our own vision system – humans can easily extract object information from what we see. We are even able to determine movement, and estimate the distance an object is from us. Yet we try to design algorithms which mimic the human visual system, a system with 100 million years of evolutionary design, which works on dynamic images in real time. Humans can look at a cereal package, determine it is a rectangular box, and even estimate its approximate dimensions. By interpreting the text and images on the sides of the cereal package, humans are able to infer its contents. Deriving an algorithm to do the same using an image from a digital video stream is more of a challenge. The brain has models of the world, and thirty distinct information processing regions to deal with colour, texture, etc.

Here is a picture of a tiger in a zoo (Wikipedia: Eddy1988)

tiger

This is what the human vision system (focusing on the tiger) sees:

tigerWWS

Canny edge detection sees a bunch of lines, based on the specific parameters the algorithm is given (3 in this case). Sometimes it is hard for even a human to decipher anything from this jumble of lines, let alone extract the mere shape of the tiger.

tigerCanny
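The dependence on parameters is easy to see even in something far simpler than Canny. The sketch below is not Canny: it is just a gradient threshold on a single, made-up row of pixels, and the name `edge_mask` and both threshold values are assumptions for illustration.

```python
def edge_mask(row, threshold):
    """Mark positions where the jump to the next pixel exceeds threshold."""
    return [
        1 if abs(row[i + 1] - row[i]) > threshold else 0
        for i in range(len(row) - 1)
    ]


# A hypothetical row of greyscale values with one strong and two weak jumps.
row = [10, 12, 30, 32, 200, 198, 170, 172]

strict = edge_mask(row, 100)  # keeps only the strongest jump
loose = edge_mask(row, 15)    # also keeps the weaker jumps

print(sum(strict), "edge(s) with a strict threshold")
print(sum(loose), "edge(s) with a loose threshold")
```

The same pixels give a different set of "edges" depending on the threshold, and nothing in the data says which setting is the right one.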

K-means clustering-based segmentation (with 4 “objects”) sees this, which is somewhat better, but even here the tiger’s light-coloured coat is marked the same as parts of the foreground. This algorithm is also dependent on some value k, representing the number of objects to find.

tigerKmeans4
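Here is a rough sketch of why clustering on colour alone lumps the coat in with other light regions. It runs k-means on a handful of made-up grey values rather than a real image. The function name `kmeans_1d`, the evenly spaced starting centres and the sample values are all assumptions for illustration, not the settings used for the picture above.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar intensities into k groups; returns (centres, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centres spread evenly across the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: abs(v - centres[c])) for v in values
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return centres, labels


# Hypothetical grey values: dark stripes, foliage, light coat, light ground.
names = ["stripe", "stripe", "foliage", "foliage",
         "coat", "coat", "ground", "ground"]
values = [10, 12, 40, 45, 200, 205, 198, 210]

centres, labels = kmeans_1d(values, k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

The coat and the light ground end up in the same cluster because the algorithm only ever sees intensity values. It has no notion that one of them is part of a tiger.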

There isn’t an algorithm which screams “Look, here is the tiger!”.
