Unless you live in a remote part of India, I am sure you have witnessed first-hand the proliferation of CCTV cameras around you. CCTVs have become a security workhorse for tackling crimes ranging from terrorist attacks to burglary.
But is the installation of CCTV cameras alone sufficient for a robust security system? Let's look at some of the challenges in this field.
Most CCTV systems today come with a device called a Digital Video Recorder (DVR) or Network Video Recorder (NVR).
It is a single-board computer whose primary function is to store the camera feed on a hard disk. It also displays the camera feed on a monitor through a VGA/HDMI port. This enables live monitoring by security personnel and postmortem analysis of a crime scene. Wouldn't it be nice if computer vision techniques could assist security personnel and make their lives easier? Face detection is ubiquitous on cameras, and face recognition has become commonplace since the deep learning revolution took off. But as students and practitioners of AI, we know that computer vision has a lot more to offer: action recognition, people counting, people and vehicle tracking, affective computing, and localisation and mapping, to name a few techniques.
After the initial euphoria over face recognition working in practice, the technology has received considerable backlash due to privacy concerns.
The intentions of researchers and developers might be noble, but there is a great danger of the technology being misused by totalitarian regimes to create a surveillance state, or even by tech-savvy cyber criminals.
Indian law enforcement agencies, too, have started using tools like AFRS, developed by startups like Innefu Labs. Some Indian activists have started raising concerns about face recognition, and we might see some public interest litigations (PILs) filed in the courts in the near future.
Privacy concerns are legitimate. But machine learning engineers need labelled databases of faces to train their models. So the challenge here is: how can we build machine learning models on data that we cannot see? Can we build a privacy-preserving AI system that allows for data exchange between data owners and data scientists?
Despite all the talk about a trillion connected devices, what architecture do most applications use when they build machine learning models on sensitive data like faces? It is a centralised architecture. In most applications, images of faces are uploaded to the cloud, where GPU-powered servers run machine learning inference code and push the result back to the mobile device.
Naturally, any breach of security on the server side can have catastrophic consequences.
The mobile devices are tiny single-board computer systems running ARM processors or, at times, even ESP32 microcontrollers. Deep learning algorithms, on the other hand, typically rely on Graphics Processing Units (GPUs) or Tensor Processing Units (TPUs) to accelerate the tensor algebra (multidimensional matrix operations) involved.
A major challenge, therefore, is to move the computation onto the device, so that we can build a system which transmits minimal sensitive information over the Internet.
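To make the contrast concrete, here is a minimal Python sketch of what leaves the device under each design. Everything in it is an assumption made for illustration: the `DetectionEvent` fields, the function names, the camera ID, and the dummy model that always reports two people are hypothetical and do not describe any real product. The point is only that an on-device design can transmit a small derived record instead of the raw frame.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable


@dataclass
class DetectionEvent:
    """Hypothetical summary record produced on the device."""
    camera_id: str
    timestamp: float
    person_count: int


def centralised_payload(frame: bytes) -> bytes:
    """Centralised design: the raw frame itself is sent to the server."""
    return frame


def on_device_payload(
    frame: bytes,
    model: Callable[[bytes], int],
    camera_id: str,
    timestamp: float,
) -> bytes:
    """On-device design: run the model locally and send only the result."""
    event = DetectionEvent(camera_id, timestamp, model(frame))
    return json.dumps(asdict(event)).encode("utf-8")


def dummy_model(frame: bytes) -> int:
    """Stand-in for a real people-counting model (assumption: returns 2)."""
    return 2


# A blank 640x480 RGB frame, used only to compare payload sizes.
frame = bytes(640 * 480 * 3)

raw = centralised_payload(frame)
summary = on_device_payload(frame, dummy_model, "cam-01", 0.0)

print("bytes sent, centralised:", len(raw))
print("bytes sent, on-device:  ", len(summary))
```

In this sketch no pixel data appears in the on-device payload at all, which is the property we are after. The hard part, of course, is fitting a real model in place of `dummy_model` on hardware this small.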
Let me take you back to the first example: the images from a report on the 26/11 terror attacks by india.com, published on November 27, 2015.
The image on the left is iconic. You might have seen it a number of times. The important point to note is that it is not an image from a security CCTV camera: it was captured by a photojournalist at ground zero. The image captured by the security camera is the one on the right.
This illustrates several important challenges.
Face recognition has been quite successful in a number of applications. However, security camera deployments pose challenges that necessitate looking beyond face recognition.
I hope I have convinced you that there is a need to rethink the CCTV security camera setup to address some challenging problems. Our team at “dataeaze systems” is working on building solutions to address these challenges. In this blog series, we would like to introduce you to the topics involved in building effective visual surveillance systems based on deep learning. I hope you have found this introductory part interesting. Keep watching this space for the subsequent parts.