Is there something weird going on here?

Why does the red light on the lead cars of the new Bombardier-Alstom MPM-10 subway trains on the Montreal Metro remind me of the vintage Cylon’s eye from Battlestar Galactica? There is something eerily weird about it.

Just a bit of fun for the 500th post! Coming in January, more image processing/digital photography posts, and a series of basic programming posts in Fortran and Ada. Stay tuned!

Why we should all share code

In the 1980s, when OO began to emerge, it was touted as a great way for people to write code and share what they wrote, so others wouldn’t have to re-write it – the golden orb of reusability. It never really worked out that way, although sharing code in repositories like GitHub has taken it a long way. Sometimes, though, these pieces of software are cryptically written, so that the non-computer scientist has no real clue what the code is doing – maybe it lacks proper commenting, or maybe it uses odd structures that the reader cannot decipher. Over the years I have seen an abundance of this type of code. Not poorly written per se, but not written from the perspective that a non-expert (or anyone) could read it. At other times, the code just doesn’t exist. You read a paper and wonder how the algorithm works, but the paper is missing some crucial component needed to re-implement the code, or (heaven forbid) the results were fudged. And then of course we all get old, and sometimes we can’t quite remember how a piece of code we wrote ten years ago *actually* works.

The idea of sharing code stems from the belief that scholarly work should be shared with the broader community – especially research funded by tax-payer dollars (but that’s a whole blog post on its own). How are we to evolve if we don’t share our ideas (instead of patenting every thing ever created)? The reasons people don’t share code are obvious – they consider it proprietary, they want to write lots of papers on it, they just don’t like to share, or the code simply does not work well. Some code is easy to break, meaning that it might work on the specific simple examples shown in a publication but fail on harder ones – hence the reluctance to share brittle code.

Why should we share code?

Firstly, sharing code means that other people will be able to use it. By providing code, people do not have to try to reproduce it from a vaguely written algorithm in an article. Sharing code also provides a means of verifying that an algorithm works correctly, and allows easy comparison of new algorithms with existing ones. People in diverse fields such as geography, archeology, and biology might be interested in using an algorithm, but because they may not necessarily be expert programmers, they would benefit from being able to use the code without having to write it. Secondly, sharing code might force people to write better code. One of the reasons I suspect a lot of code is not shared is that it is sloppy: poorly written, using ad-hoc structures, and vaguely documented, with small changes in input causing it to crash.

Will this ever happen? There are glimpses of hope – sites like RunMyCode allow scientists to openly share code. But it is unlikely to happen to any great extent until people start writing better code, and start willingly sharing it.


Pupil segmentation with blobs

Extracting the pupil from an image of the eye didn’t exactly go so well using the Hough transform. How about blobs? Or rather BLOBs – Binary Large OBjects – which are groups of connected pixels in a binary image. OpenCV (cv2) actually incorporates a means of finding blobs (shapes of indeterminate form), in the guise of SimpleBlobDetector_create(). How does it work?

Now SimpleBlobDetector_create() has a bunch of parameters that can be set – which of course is a double-edged sword, and requires some tinkering. First it creates a series of binary images from the original image using the parameters minThreshold, maxThreshold, and thresholdStep: one binary image for each threshold between minThreshold and maxThreshold, in increments of thresholdStep. Then, in each binary image, connected pixels are grouped together into blobs. Next the centres of the blobs are calculated, and any blobs whose centres are closer together than minDistBetweenBlobs are merged. Finally the centres and radii of the merged blobs are computed and returned.
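To make that thresholding stage concrete, here is a minimal sketch of just those parameters (the values are illustrative, not recommendations):

import cv2

params = cv2.SimpleBlobDetector_Params()
params.minThreshold = 0          # first threshold applied to the image
params.maxThreshold = 255        # last threshold
params.thresholdStep = 10        # thresholds at 0, 10, 20, ... 250
params.minDistBetweenBlobs = 10  # blob centres closer than this get merged

detector = cv2.SimpleBlobDetector_create(params)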

It is also possible to filter blobs by colour, size and shape:

  • Colour:
    • filterByColor=1, and  blobColor=0 (for dark blobs)
  • Area:
    • filterByArea=1, minArea=100, maxArea=500 (blobs between 100 and 500 pixels)
  • Shape:
    • filterByCircularity = 1, then set minCircularity, and maxCircularity (circle = 1)
    • filterByConvexity = 1, then set minConvexity, and maxConvexity
    • filterByInertia=1, then set minInertiaRatio, and maxInertiaRatio (circle = 1, ellipse between 0 and 1)

Lots to think about, right? But there is nothing too complicated about these parameters. Once they are assigned, a detector can be created, and blobs detected:

detector = cv2.SimpleBlobDetector_create(params)
keyPoints = detector.detect(im)

Easy, right? Hold on, not so fast. You have to make sure you pick the right parameters. For the pupil, I set up filterByArea and filterByCircularity. Since the pupil is a circular object, circularity makes sense. In a typical application the images of the eye will be taken at a relatively uniform scale, and hence setting upper and lower bounds on the area of the pupil also makes sense. In this case the image is about 600×600 pixels. Here is the Python code:

import cv2
import math
import numpy as np

filename = raw_input('Enter a file name: ')

# Read image (im=gray, img=colour)
im = cv2.imread(filename, 0)
img = cv2.imread(filename)

# Setup SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()

params.filterByArea = True
params.minArea = 3000
params.maxArea = 6000
params.filterByCircularity = True
params.minCircularity = 0.5

# Set up the detector with default parameters.
detector = cv2.SimpleBlobDetector_create(params)

# Detect blobs.
keyPoints = detector.detect(im)

for keypoint in keyPoints:
   x = int(keypoint.pt[0])
   y = int(keypoint.pt[1])
   s = keypoint.size
   r = int(math.floor(s/2))
   print x, y
   cv2.circle(img, (x, y), r, (255, 255, 0), 2)

cv2.imwrite("blobOutput.png", img)

Most of the code is taken up with setting up parameters. The area is constrained to 3000–6000 pixels, and the minimum circularity is set to 0.5. The detector is then applied, and any blobs found are marked on the image.

Does it work? Here is the output:

So the algorithm effectively finds the pupil. But does it work as well on other eye images? The reality is no. Consider the image below (which failed):

Eyes where the contrast between the pupil and the surrounding iris is reduced, due to iris colour, may cause the algorithm to fail. There are inherently many issues: (i) pigmentation of the iris – green, gray, brown, hazel; (ii) position of the eyelids, i.e. how much they obstruct the eye; (iii) how dilated the pupil is; (iv) interference from things such as make-up; and (v) reflections from light.

There is no easy algorithm in image processing.


What life means – and it’s not the number 42

In Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, the supercomputer Deep Thought spent 7½ million years computing the Answer to the Ultimate Question of Life, the Universe, and Everything. In the end the answer was 42. If only life were that simple – 42 would be a nice way of explaining a lot of things. “My code didn’t work” – “well, it’s likely the number 42 that’s to blame”. But the reality is that life is much simpler than a number.

Life is about experience.

Humans learn best through experience – be it good or bad. Sometimes we make mistakes, and we learn from them. Through experience we learn to adapt things, usually through some process of trial and error. Yet modern society seems to have a fixation on a form of education that is far removed from experiential learning. When a student tells you they learned more in a week at a developers’ conference than in four years at university, you know the system is broken. Is there anything we truly could not have learned by trying it for ourselves? Listening to people lecture is not experiential learning. I taught myself to program by writing programs. I taught myself to bake by baking. I taught myself to build a shed by building a shed. Learning how to do something of course involves *some* knowledge, but that can usually be obtained by reading a book or blog, watching a video, or observing someone. That’s why hands-on workshops in the artisan world work so well. Want to learn how to forge an axe? Take an axe-making class at Toronto Blacksmith. You don’t really need to know everything there is to know about axes if you experience making one.

Experience is everywhere. If you want to build better-insulated buildings for a colder climate, the best way to learn about that is to visit places where the climate is cold. There are often building techniques we would never consider because, well, we’ve never experienced them. So teaching programming the same way we taught it 30 years ago is somewhat archaic. Anyone can really teach themselves the basic tenets of programming in a few weeks using a language like Julia. The reality is that many software developers in the world are self-taught. A 2016 Stack Overflow survey of 56,033 developers world-wide showed that 56% did not have a degree in computer science or a related field. For some companies, hiring a developer is more about a person’s portfolio of products than how many A’s they got in university. A person who has developed 2-3 successful apps is likely more beneficial to a company than someone who has never developed anything more than a small school project. That’s the reality in a hands-on profession. Want to read more? Check out this 2016 article by Eric Elliott, Want to Code? A University Degree Might be a Huge Waste of Time.

Universities need to change the way they do things and become institutions of experiential learning. Times are a-changing, and education has to evolve back into building things, solving problems, and thinking critically about the world around us.


An example of the Hough transform – pupil segmentation

In a previous post I talked about using the Hough transform to count logs in a pile. Now let’s turn our sights to a simpler problem – extracting the pupil from an image of an eye, the first step in many eye-tracking algorithms. Extracting the pupil from an image of the eye is usually quite simple, even though taking an image of the eye is not without problems: it can pick up reflections of whatever the eye is looking at, as well as lighting artifacts. Here is a sample image (from Wikipedia).

The pupil is normally uniformly dark in colour, and the iris which surrounds it is more variegated, but also well differentiated from the remainder of the eye. This image of course is near perfect, because both the pupil and the iris are clearly visible. In many images of the eye, portions of the iris may be obstructed by the eyelids.

So extracting the pupil must be an easy task using the Hough transform, right? Think again. Even beautifully round objects are hard to extract automatically, partially because of the number of parameters required by a function like OpenCV’s cv2.HoughCircles(). Some parameters control the edge-based segmentation performed on the grayscale version of the above image, and getting them right often requires some intensive tweaking. Once you have found the pupil in one image, try replicating that in another. It’s not easy. Oh, it works fine in many of the examples showing how to use the function, because they are often easy examples. It is a perfect illustration of an algorithm that seems useful, until you actually use it. There is no free lunch.
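For reference, here is a sketch of the call itself with the main parameters annotated (the values and the filename are purely illustrative):

import cv2

gray = cv2.imread("eye.png", 0)   # hypothetical grayscale input

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT,
              1,    # dp: inverse ratio of accumulator resolution to image resolution
              50,   # minDist: minimum distance between detected circle centres
              param1=30,   # upper threshold of the internal Canny edge detector
              param2=15,   # accumulator threshold; lower values find more (weaker) circles
              minRadius=0, maxRadius=0)   # 0 = no explicit radius limits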

Here is an example of using the Hough transform. First we take an image with what are essentially some randomly placed black circles on a white background. Should be an easy task to find the circles using a Hough transform.

Below is the code we used:

import cv2
import numpy as np

filename = raw_input('Enter a file name: ')

# Read in grayscale version of image
imgG = cv2.imread(filename,0)
# Read in colour version of image
imgO = cv2.imread(filename)

# Process the image for circles using the Hough transform
circles = cv2.HoughCircles(imgG, cv2.HOUGH_GRADIENT, 1, 50, 
              param1=30, param2=15, minRadius=0, maxRadius=0)

# Determine if any circles were found
if circles is None:
    print "No circles found"
else:
    # convert the (x, y) coordinates and radius 
    # of the circles to integers
    circles = np.round(circles[0, :]).astype("int")

    # draw the circles
    for (x, y, r) in circles:
        cv2.circle(imgO, (x, y), r, (255, 255, 0), 1)

    # display the image
    cv2.imshow("output", imgO)
    cv2.waitKey(0)

Below is the result, with the cyan circles denoting the circles found by the Hough transform.

It’s by no means perfect, *but* it has found all the circles. So now let’s test the same code on the eye image. First we convert the image to grayscale (done in the program), and then apply HoughCircles() with the same parameters. Here is the result:

Did it find the pupil? … Not really. It found plenty of things that aren’t circular objects, and while it got close, it certainly didn’t outline the pupil in any sort of effective manner (never mind all the false positives). What about if we preprocess the image? Say, convert it to grayscale, then apply a local threshold (say the algorithm of Phansalkar).
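As far as I know, Phansalkar’s method isn’t available in core OpenCV (it lives in tools like ImageJ’s Auto Local Threshold plugin), but a rough stand-in using cv2.adaptiveThreshold sketches the general idea of a local threshold (filename and values are hypothetical):

import cv2

gray = cv2.imread("eye.png", 0)   # hypothetical grayscale input

# Each pixel is thresholded against a Gaussian-weighted mean of its
# 51x51 neighbourhood, offset by a constant of 5 (both values are guesses).
local = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                              cv2.THRESH_BINARY, 51, 5)
cv2.imwrite("eyeLocalThresh.png", local)

This is the pre-processed image: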

Now here is the Hough transformed image:

Worse? Oh, SO much worse. It has turned every eyelash into a potential circle (because they are arcs, I would guess). Whether or not it actually found the pupil is hard to tell. We could likely get better results by tweaking the parameters – but how long would that take?
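If you do go down the tweaking road, the accumulator threshold param2 is usually the first knob to turn. A quick, entirely hypothetical sweep might look like this:

import cv2

gray = cv2.imread("eyeThresh.png", 0)   # hypothetical pre-processed image

# Try a range of accumulator thresholds and see how many circles each one yields.
for p2 in (10, 20, 40, 80, 160):
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 50,
                               param1=30, param2=p2, minRadius=0, maxRadius=0)
    n = 0 if circles is None else len(circles[0])
    print("param2 = %3d -> %d circles" % (p2, n))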

Another approach might be to apply Canny thresholding before submitting the image to cv2.HoughCircles(). Will this work? The problem with Canny is inherently that it too has parameters, in the guise of high and low thresholds. This problem can be alleviated by using Otsu thresholding to determine the thresholds for Canny [1]. Here is the modified code:

import cv2
import numpy as np

filename = raw_input('Enter a file name: ')

imgG = cv2.imread(filename,0)
imgO = cv2.imread(filename)

thV,thI = cv2.threshold(imgG,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
highTH = thV
lowTH = thV / 2

# Find the binary image with edges from the thresholded image
imgE = cv2.Canny(imgG, threshold1=lowTH, threshold2=highTH)
cv2.imwrite('eyeCanny.png', imgE)

# Process the image for circles using the Hough transform
circles = cv2.HoughCircles(imgE,cv2.HOUGH_GRADIENT, 2, 30, 
          param1=30, param2=150, minRadius=0, maxRadius=150)

# Determine if any circles were found
if circles is None:
    print "No circles found"
else:
    # convert the (x, y) coordinates and radius of the 
    # circles to integers
    circles = np.round(circles[0, :]).astype("int")

    # draw the circles
    for (x, y, r) in circles:
        cv2.circle(imgO, (x, y), r, (255, 255, 0), 1)

    cv2.imwrite("houghOutput.png", imgO)

We have modified some of the HoughCircles() parameters, and set maxRadius to 150. Here is the output from the Canny edge detector:

Here is the subsequent output from the Hough transform:

Did it find the pupil? No. So it seems that extracting the pupil within an image of the eye is not exactly a trivial task using a circular Hough transform.

[1] The Study of an Application of Otsu Method in Canny Operator


How many logs can a woodchuck count?

In the forestry industry there is some value in pre-processing logs, with respect to size and shape, before they arrive at the mill. Ideally, each log would somehow be run through a 3D scanner, and optimal milling decisions made. The reality is that the volume of logs is so great that 3D scanning them is likely never going to happen. Sometimes even a seemingly trivial task, such as determining the individual diameters of a series of logs in a pile, isn’t so trivial when we try to create an algorithm to do it. In reality, this is very similar to the problem of mushroom counting using computer vision. What is easy for the human eye to do may not be as easy to translate into a generic vision-based solution. Mushrooms are almost easier because they are often white… except for when they are brown. You get the picture. Wood, on the other hand, can have varying colours even within a particular species, making automated vision challenging. Consider the following image of a pile of logs:

The first thing that comes to mind is of course to find circular objects in the image. Hough transforms maybe? Well, that would work, but it requires pre-processing to extract the edges of each of the logs. How do we deal with the variable nature of the wood colour? Could we just use colour? Unlikely, due to the small difference in hues – and what if I wanted to use the algorithm on walnut? Let’s first try to homogenize the colours in the image. We can do this using a mean-shift filter with a spatial radius of 9 and a colour distance of 25. Here is the result:

Mean-shift filter
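This filtering step was done in ImageJ (see the note at the end of the post), but OpenCV has a comparable mean-shift filter. A rough equivalent, with the radius and colour distance mapped onto its sp and sr parameters, might look like this (the filename is hypothetical, and the two implementations won’t produce identical results):

import cv2

img = cv2.imread("logs.png")   # hypothetical 8-bit colour image of the log pile

# Mean-shift filtering: sp (spatial window radius) = 9, sr (colour window radius) = 25
smoothed = cv2.pyrMeanShiftFiltering(img, 9, 25)
cv2.imwrite("logsMeanShift.png", smoothed)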

Now ideally we could convert this image to grayscale and apply some form of edge detection. This, however, will only be as successful as the edges present in the image allow. You can clearly see the edges, and the human eye can distinguish individual logs, but algorithms see things in a different light. First we convert it to grayscale:

Converted to grayscale.

Then we could use the Canny edge detection algorithm to extract the edges of each of the logs.

Canny edge detection: Parameters – Gaussian kernel radius = 1.7, low threshold=2.5, high threshold =7.5

How successful was this? Canny is one of those algorithms which is highly dependent on a series of input parameters. Too small a Gaussian will find more edges, but many of them may not be useful, and are more likely to confuse post-processing algorithms. Too large a Gaussian, and the outlines of the logs will become more abstract – they will lose their edge profile – but there will be fewer spurious edges. Ideally we would like an edge detection algorithm which is automatic, yet few (if any) of those exist that are generic enough to be applied to all images. Adding to the problem are overlapping logs, portions of the sides of logs visible in the image, and differential colours.
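To see the trade-off in practice, a small OpenCV sketch can blur with a few different sigmas before running Canny. Note that the ImageJ thresholds above (2.5/7.5) don’t map directly onto OpenCV’s 8-bit gradient scale, so the threshold values below are just placeholders to experiment with, and the filename is hypothetical:

import cv2

gray = cv2.imread("logsGray.png", 0)   # hypothetical grayscale image of the log pile

# Larger sigma = fewer spurious edges, but softer log outlines.
for sigma in (1.0, 1.7, 3.0):
    blurred = cv2.GaussianBlur(gray, (0, 0), sigma)   # kernel size derived from sigma
    edges = cv2.Canny(blurred, 50, 150)               # placeholder low/high thresholds
    cv2.imwrite("logsCanny_%.1f.png" % sigma, edges)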

The result could now be fed to a Hough transform, which may or may not work well. Here is the result of using a small piece of Python code (using the function HoughCircles from OpenCV):

Here is the code that did it:

circles = cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,1,20,
 param1=40,param2=25,minRadius=40,maxRadius=80)

As you can see, HoughCircles has *7* parameters apart from the image to be processed, so finding the sweet spot is challenging – made harder by the fact that the logs are by no means homogeneous. Many of the examples showing how to use the Hough transform to find circular objects use ideal circular objects, and maybe only a couple of them. Finding logs using the Hough transform isn’t perfect, but it has managed to find some. Modifying the parameters may find more – but then one set of parameters may not work on another image. There is also the problem of Hough itself, which works on the principle of using a “model” to find features in what is typically an edge image. For HoughCircles, that model is a perfect circle; not all log cross-sections conform to a perfect circle, nor are they all perfectly aligned to eliminate shadows.

Now as a problem, counting logs is a good one, but there may be a reason why there is so little literature out there in the ether – it may be inherently challenging to produce a system that will work across the diversity of environments and tree species/sizes. There was a paper written in 1994 (pre-digital imaging) [1] on using the Hough transform to perform this very task, but the testing was lacklustre at best, and no real images were presented. A more recent paper [2] takes a different approach to the problem, without using Hough, which shows some promise.

If someone is interested in doing this as an M.Sc. project, we could look into industrial funding.

REFS:

[1] Brodie, J.R., Hansen, A.C., Reid, J.F., “Size assessment of stacked logs via the Hough transform”, Trans. of the ASAE, 37(1), pp.303-310 (1994).
[2] Rahman Shaik, A., “Image processing technique to count the number of logs in a timber truck”,

P.S. Here is a good tutorial on using cv2.HoughCircles.
Note: The pre-processing was done in ImageJ, but it wouldn’t be any different in a specifically written program.


Image mean filtering (iii) – in C

The last piece of code related to the mean filter is written in C. The first thing to notice is that the images are stored in static arrays (put succinctly, zero-initialized static data goes in .bss (Block Started by Symbol), and non-zero-initialized data goes in .data). To be able to properly store large images, it’s best to use dynamic arrays allocated on the heap, but that’s another story altogether. Here is the C code, with the relevant sections labelled by comments.

#include <stdio.h>
#define dx 2144
#define dy 6640

int main(void)
{
   FILE *ifp, *ofp;
   int i, j, x, y, w=2, sum;
   static int image[dx][dy];
   static int imageN[dx][dy];

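   /* Image input: read the pixel values from a text file */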
   ifp = fopen("pano.txt","r");
   for (i=0; i<dx; i=i+1)
      for (j=0; j<dy; j=j+1)
         fscanf(ifp,"%d", &image[i][j]);
   fclose(ifp);

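   /* Array copy: border pixels (within w of the edge) keep their original values */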
   for (i=0; i<dx; i=i+1)
      for (j=0; j<dy; j=j+1)
         imageN[i][j] = image[i][j];

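   /* Mean filtering: replace each interior pixel with the mean of its 5x5 (w=2) neighbourhood */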
   for (i=w; i<dx-w; i=i+1)
      for (j=w; j<dy-w; j=j+1){
         sum = 0;
         for (x=-w; x<=w; x=x+1)
            for (y=-w; y<=w; y=y+1)
               sum = sum + image[i+x][j+y];
         imageN[i][j] = sum / 25;   /* 25 = (2*w+1)*(2*w+1) pixels in the window */
       }

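   /* Filtered image output: write the result to a text file */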
   ofp = fopen("meanPano.txt","w");
   for (i=0; i<dx; i=i+1){
      for (j=0; j<dy; j=j+1)
         fprintf(ofp,"%d ", imageN[i][j]);
      fprintf(ofp,"\n");
   }
   fclose(ofp);

   return 0;
}


The C version of the mean filtering program is clearly the fastest, at 0.875 seconds. But what it has in speed it lacks in usability, at least from the novice programmer’s perspective. The I/O sections of code require two for loops each, to iterate through the file and read/write each element. Copying the image also requires a nested loop, and calculating the mean at each pixel requires four nested for loops. It isn’t that loops are bad, they are just tedious – ten for loops versus the one (albeit nested) loop in Julia. For the novice programmer wishing to do a little image processing, C just provides too many hurdles to do it in an expedient manner.

No rapid prototyping, and *no* free lunch in C.

P.S. You can find more information on C memory on this blog or a good overview here.


Where is static memory stored in C?

C is probably the best language to program in if you care about the intricacies of memory. Heap, stack, and static memory – they are all slightly different. So where is static memory stored?

Firstly, what does static mean? The term static denotes one of two storage classes in C, the other being automatic. A local variable that is declared static does not lose its value between function calls; in other words, it is essentially a global variable, but with the scope of the function it is defined in. Static global variables and functions are not visible outside the C file they are defined in. Here is an example:

#include <stdio.h>

int staticfun()
{
   static int index = 0;
   index = index + 1;
   return index;
}

int main()
{
   printf("%d ", staticfun());
   printf("%d ", staticfun());
   return 0;
}

The output when this program is run will be: 1 2. This is because the variable index inside the function staticfun() retains its value, even after the function has ended. If the static keyword were removed, then the output would be “1 1“, as the variable index would be re-initialized every time the function is called. Variables that are created as static (as well as global variables) are stored in the data segment. There are usually two parts to this:

(i) the uninitialized data segment, which stores uninitialized variables. This is sometimes referred to as BSS (which stands for “Block Started by Symbol”, after an old assembler pseudo-op used to allocate storage for uninitialized data). For example:

int aGlobal;
static int someStaticVar;

(ii) the initialized data segment, which stores initialized variables. For example:

int aGlobal = 1;
static int someStaticVar = 12;

The best way to understand this is through an example. Let’s consider the most basic program, with basically nothing but skeleton code (simple.c).

#include <stdio.h>
int main(void)
{
   return 0;
}

Now if we compile it using “gcc -ggdb -c simple.c“, we get an object file that we can process further. We can use the size command (which prints the size of each segment in bytes) to find out some basic information (on OSX):

> size simple.o
__TEXT  __DATA  __OBJC  others  dec     hex
92      0       0       1021    1113    459

If we add a simple static int:

#include <stdio.h>
int main(void)
{
   static int i;
   return 0;
}

When we run size, we get:

> size simple.o
__TEXT  __DATA  __OBJC  others  dec     hex
92      4       0       1056    1152    480

Now the static variable has been created in the __DATA segment, and is 4 bytes in size. Now let’s create a new program, static.c, which has a 10×10 static array of ints.

#include <stdio.h>

int main(void)
{
   int i, j;
   static int image[10][10];

   for (i=0; i<10; i=i+1)
      for (j=0; j<10; j=j+1)
         image[i][j] = 0;

   return 0;
}

Running size on the object file gives us:

> size simple.o
__TEXT  __DATA  __OBJC  others  dec     hex
180     400     0       1194    1774    6ee

Now all 100 elements of the array, at 4 bytes apiece, are stored in __DATA.