Some people are paranoid about visiting a place like London because of all the cameras. There are reputed to be 1.85 million CCTV cameras in the UK, or roughly one camera for every 32 people. Big Brother is watching. People are being tracked wherever they go; it’s almost like an episode of “Person of Interest”. But let’s think about it for a minute. In London alone there are 10,524 CCTV cameras. The London Tube station network has over 13,000 CCTV cameras. That’s a whole lot of data which, even if it isn’t stored for any great length of time, has to be processed.
The real problem? Too many low-resolution cameras. It doesn’t matter how many low-resolution cameras are installed in an area if facial details or the characters on a license plate are blurred. Minimum specifications for traffic cameras are about 720×576 pixels. Is higher resolution any better? Consider the following photograph of Canary Wharf Station in London (image courtesy of James Peacock, Creative Commons, flickr ID# 4056621356). This 4272×2848 (12MP) image has a file size of 14.4MB.
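To put those two resolutions side by side, here is a quick back-of-the-envelope sketch in Python. The function name `megapixels` is just an illustrative helper, and the only inputs are the two pixel dimensions quoted above.

```python
def megapixels(width, height):
    """Return the pixel count of a width x height image, in millions."""
    return width * height / 1_000_000

# Minimum traffic-camera specification quoted above
traffic_cam_mp = megapixels(720, 576)      # about 0.41 MP

# The Canary Wharf photograph
photo_mp = megapixels(4272, 2848)          # about 12.17 MP

# How many times more pixels the photograph has
ratio = photo_mp / traffic_cam_mp          # about 29x

print(f"Traffic camera: {traffic_cam_mp:.2f} MP")
print(f"Photograph:     {photo_mp:.2f} MP")
print(f"Ratio:          {ratio:.1f}x")
```

In other words, the photograph carries roughly 29 times as many pixels as a minimum-spec traffic camera frame.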
Even high-resolution cameras only give an overview of what is happening (and most high-resolution CCTV cameras are probably only 3-5MP). Three extracted 100×100 sub-images are shown below. They show the inherent difficulty associated with CCTV images, even high-resolution ones. The first sub-image shows a person’s face, but not clearly, and certainly not well enough to be enlarged or used for facial recognition. The second sub-image shows some text large enough to be recognizable. The third sub-image vaguely shows a person’s face, but is extremely dark due to the lack of uniform lighting in the photograph. It may be possible to extract generic information about a person’s face, but a positive identification is extremely unlikely.
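A rough calculation shows how little of the frame a 100×100 sub-image represents, and how small the same region would be on a 720×576 camera. This is only a sketch: it assumes the lower-resolution camera covers exactly the same field of view as the photograph, which is a simplification, and `scaled_region` is a made-up helper name.

```python
def scaled_region(region_w, region_h, src_size, dst_size):
    """Scale a region measured in the source image to the destination image.

    Assumes both images cover the same field of view (a simplification).
    """
    scale_x = dst_size[0] / src_size[0]
    scale_y = dst_size[1] / src_size[1]
    return region_w * scale_x, region_h * scale_y

photo_size = (4272, 2848)     # the Canary Wharf photograph
traffic_size = (720, 576)     # minimum traffic-camera specification

# Share of the full photograph covered by one 100x100 sub-image
fraction = (100 * 100) / (photo_size[0] * photo_size[1])   # under 0.1%

# The same region as seen by the low-resolution camera
w, h = scaled_region(100, 100, photo_size, traffic_size)   # roughly 17 x 20 pixels

print(f"Sub-image covers {fraction:.4%} of the photograph")
print(f"Same region at 720x576: about {w:.0f} x {h:.0f} pixels")
```

So a face that is already marginal at 100×100 pixels in a 12MP image would shrink to roughly 17×20 pixels on a minimum-spec camera, under the same-field-of-view assumption.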
The biggest caveat? Facial recognition itself. Facial recognition algorithms are hindered by the constraints of the imaging device, e.g. poor or uneven lighting, low-resolution images, or faces too far from the camera. It works quite well for full frontal faces, or those up to about 20 degrees off; anything beyond that and it struggles.
What is really needed are high-definition CCTV systems with hemispheric (360 degree) fields of view, optical zoom that can be triggered remotely, and some form of intelligent screening, not unlike “The Machine”, the mass surveillance system from Person of Interest.