Painting a portrait of facial recognition: facts and figures

As with any fast-growing technology, face recognition, despite successful deployments, still raises questions about its performance, its operational potential and its implementation.

 

Since the very advent of photography, both government agencies and private organizations have kept collections of portraits, and ID photos have gradually made their way onto all personal identification documents, from the most official passports to informal membership cards issued by sports clubs. Before the use of computers to recognize faces was even considered a possibility, face recognition was already the subject of a great deal of research.

Examples include:

- Witness interview: development of identification parade or “line-up” techniques in the United Kingdom, in which a witness is confronted with a group of physically similar people, one of whom is a suspect. The witness must decide whether one of the persons in the group was present at the scene of the crime.
- Face classification: in order to recognize delinquents who are repeatedly arrested, without having to resort to large collections of portraits, Bertillon suggested that the portraits be sorted by common morphological characteristics, i.e. the specific shapes of the different parts of the face. This classification is known as the ‘spoken portrait’.

Digital capture

Recognizing faces is one of the most natural things the human brain does. Within a few days, even newborns can recognize their mother’s face. Capturing portraits digitally is also easy: it is contactless and needs no specific equipment now that cameras are everywhere, in the streets, in computers, even in the smartphone in your pocket. But identifying someone from his or her face is not simple just because it is natural. The face is a three-dimensional, ever-changing object that is always in motion. For an efficient identification system, face recognition technology starts well before image comparison.

Detection

The first step is to detect the face in the images collected from the source. This can be easy in static images such as identity documents, with standardized pose, lighting and a plain background. It can be a huge challenge in video streams with multiple people in motion and a busy background. The goal is then to detect faces successfully, which also means maintaining a low rate of false face detections. This stage is performed with a classifier that indicates whether an image of fixed size represents a face. The classifier learns from a database of faces and non-faces.
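One common way to apply a fixed-size face/non-face classifier is to slide a window across the image and test each patch. The Python sketch below illustrates that idea only. It rests on loud assumptions: `looks_like_face` is a hypothetical stand-in (a simple brightness test), not a trained classifier, and the names `detect_faces`, `window` and `step` are illustrative rather than taken from any real system.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major, values 0-255


def looks_like_face(patch: Image) -> bool:
    """Hypothetical stand-in for a trained face/non-face classifier.

    A real classifier is learned from a database of faces and non-faces;
    this toy version only checks that the patch is bright on average.
    """
    total = sum(sum(row) for row in patch)
    count = len(patch) * len(patch[0])
    return total / count > 128


def detect_faces(
    image: Image,
    window: int,
    step: int,
    classifier: Callable[[Image], bool] = looks_like_face,
) -> List[Tuple[int, int]]:
    """Slide a fixed-size window over the image and return the (row, col)
    of the top-left corner of every window the classifier accepts."""
    hits = []
    height, width = len(image), len(image[0])
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            patch = [row[left:left + window] for row in image[top:top + window]]
            if classifier(patch):
                hits.append((top, left))
    return hits


# Toy 4x4 image: a bright 2x2 block in the lower-right corner.
toy_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 200, 200],
    [0, 0, 200, 200],
]
detections = detect_faces(toy_image, window=2, step=2)
```

Because the classifier only accepts one input size, a practical detector would typically repeat this scan on rescaled copies of the image to find faces of different sizes.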

Image enhancement

Once the faces have been found and adjusted to the same scale and position, they need to be enhanced. This involves minimizing the effects of compression, correcting inconsistent lighting, or detecting and excluding unusable zones (masked by clothes, for example). Morpho developed a 3D morphable model to generate frontal images, correcting the orientation of the face and the effects of expressions. While many enhancements can be made automatically, the assistance of an operator may prove very useful when working on difficult images.
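As an illustration of what correcting inconsistent lighting can look like, the Python sketch below applies histogram equalization, a standard image-processing technique. This is an assumption chosen for the example: the text does not say which lighting correction is actually used, and the function name `equalize_histogram` is illustrative.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, values 0-255


def equalize_histogram(image: Image, levels: int = 256) -> Image:
    """Spread pixel intensities over the full range using the cumulative
    histogram, which reduces the effect of under- or over-exposure."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution: cdf[v] = number of pixels with value <= v.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Flat image: every pixel has the same value, nothing to stretch.
        return [row[:] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


# A dark, low-contrast 2x2 patch is stretched over the full 0-255 range.
dark_patch = [[10, 20], [30, 40]]
enhanced = equalize_histogram(dark_patch)
```

Here the four dark values end up evenly spread between 0 and 255, so two portraits taken under different exposure become more comparable before feature extraction.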

Feature extraction

The image is processed to extract information and convert it into a digital description for computer-based comparison. Every face has numerous distinguishable traits, such as the distance between the eyes, the width of the nose or the shape of the cheekbones. Face recognition algorithms rely on those feature points but also on mathematical information not identifiable by the human eye. To increase accuracy, multiple approaches are used, including the following:

- Hierarchical graph matching: creates a dynamic link architecture that projects the face onto an elastic grid.
- Skin texture analysis: compares minute skin wrinkles, pores, scars and other artifacts.
- Principal component analysis: removes superfluous information (reduces data dimension) by breaking down the face structure into uncorrelated components known as eigenfaces.
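To make the idea behind principal component analysis more concrete, here is a minimal Python sketch that finds the first principal component of a handful of samples by power iteration. It is a toy under stated assumptions: the samples are 2-value vectors rather than flattened face images with thousands of pixels, the function names are illustrative, and a production system would normally rely on a linear algebra library and keep many components, not one.

```python
import math
from typing import List

Vector = List[float]


def mean_vector(samples: List[Vector]) -> Vector:
    """Component-wise average of all samples (the 'mean face')."""
    return [sum(column) / len(samples) for column in zip(*samples)]


def first_component(samples: List[Vector], iterations: int = 50) -> Vector:
    """Unit vector along the direction of greatest variance."""
    mean = mean_vector(samples)
    centered = [[x - m for x, m in zip(sample, mean)] for sample in samples]
    dim = len(mean)

    # Covariance matrix of the centered samples.
    cov = [
        [sum(row[i] * row[j] for row in centered) / len(samples) for j in range(dim)]
        for i in range(dim)
    ]

    # Power iteration: multiply by the covariance matrix and renormalize.
    # The fixed start vector is an assumption; it fails if it happens to be
    # exactly orthogonal to the dominant direction (a random start avoids this).
    start = [float(i + 1) for i in range(dim)]
    start_norm = math.sqrt(sum(v * v for v in start))
    vec = [v / start_norm for v in start]
    for _ in range(iterations):
        product = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in product))
        if norm == 0.0:
            break  # no variance left to follow
        vec = [v / norm for v in product]
    return vec


def project(sample: Vector, mean: Vector, component: Vector) -> float:
    """Describe a sample by a single coefficient along the component."""
    return sum((x - m) * c for x, m, c in zip(sample, mean, component))


# Toy data lying on a line: two correlated values carry one value of information.
samples = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0], [8.0, 4.0]]
component = first_component(samples)
coefficient = project(samples[3], mean_vector(samples), component)
```

In this toy case each 2-value sample is summarized by one coefficient without losing information, which is the dimension reduction the text refers to.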

Comparison

Once you have extracted the face features, you can compare them to the ones in the database or a watchlist and determine whether the person you are looking for is already in it, or at least get a list of potential candidates that look like him or her. The comparison, or matching, is achieved through a succession of algorithms in a multi-stage architecture: the first ones are conservative and fast, the last ones are slower but very selective. Each matching step provides a candidate list, of which only the top entries are passed to the next step. This approach narrows down the list so that more demanding algorithms can be used efficiently on a smaller amount of data. This process ensures both high accuracy and fast matching. The outcome of the comparison is a matching score per candidate, measuring the similarity between two sets of face features and reflecting the confidence level that they come from the same person. The final step involves score normalization, in order to guarantee that the matching score remains stable.
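The multi-stage idea can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the two scoring functions (`coarse_score`, `fine_score`), the feature vectors and the identity labels are hypothetical, and the real algorithms are not described in the text. Score normalization is deliberately left out because the text does not say which method is used.

```python
import math
from typing import Dict, List, Tuple

Features = List[float]


def coarse_score(a: Features, b: Features) -> float:
    """Fast first stage (assumed): compare only the first two values."""
    return -sum(abs(x - y) for x, y in zip(a[:2], b[:2]))


def fine_score(a: Features, b: Features) -> float:
    """Slower, more selective stage (assumed): cosine similarity on all values."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(
    probe: Features, gallery: Dict[str, Features], shortlist_size: int
) -> List[Tuple[str, float]]:
    """Two-stage search: cheap score on everyone, costly score on the top few."""
    # Stage 1: rank the whole gallery with the fast score, keep only the top.
    shortlist = sorted(
        gallery, key=lambda name: coarse_score(probe, gallery[name]), reverse=True
    )[:shortlist_size]
    # Stage 2: apply the demanding score to the shortlist only.
    scored = [(name, fine_score(probe, gallery[name])) for name in shortlist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical enrolled identities and probe features.
gallery = {
    "alice": [1.0, 0.0, 0.0, 1.0],
    "bob": [0.9, 0.1, 1.0, 0.0],
    "carol": [0.0, 1.0, 0.0, 1.0],
}
probe = [1.0, 0.0, 0.1, 0.9]
candidates = identify(probe, gallery, shortlist_size=2)
```

The first stage discards "carol" cheaply, so the more expensive score is computed for two identities instead of three. On a large database the same pattern keeps the costly algorithm away from most of the records.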

Decision

In an identity management scenario, the goal is to prevent identity fraud by checking the uniqueness of applicants’ biometric data. With huge biometric databases, comparisons are much easier when they use reliable thresholds that do not require adjustment as the database grows. In case of an automatic “hit”, an operator can verify and confirm the result and proceed with the investigation of the fraud attempt. In law enforcement scenarios, a human reviewer is usually employed to systematically review the candidates returned from an identification search. Usually, the reviewer inspects the suggested candidates in descending order of matching score, stopping when he is able to positively confirm a mate. The length of the candidate list may be fixed, or variable when a threshold is applied. In the latter case, the reviewer only reviews the candidates with a matching score above the threshold. Thanks to accurate face recognition algorithms, with images of reasonably good quality (such as mugshots), the use of such a threshold allows for a dramatic reduction of the reviewer workload. Indeed, the reviewer will only receive a small number of images that stand a high chance of matching the wanted person, thus preserving his time and attention to process more cases or spend more time on critical cases.
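The effect of a threshold on the reviewer's candidate list can be shown with a short Python sketch. The scores, labels and the threshold value are made up for the example, and the function name `review_queue` is illustrative. It covers both variants mentioned above: a variable-length list cut by a threshold, and an optional fixed maximum length.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[str, float]  # (identity label, matching score)


def review_queue(
    candidates: List[Candidate],
    threshold: float,
    max_length: Optional[int] = None,
) -> List[Candidate]:
    """Keep candidates scoring above the threshold, best score first.

    If max_length is given, the list is also capped at that many entries.
    """
    kept = [c for c in candidates if c[1] > threshold]
    kept.sort(key=lambda c: c[1], reverse=True)
    return kept if max_length is None else kept[:max_length]


# Hypothetical output of an identification search (label, score).
search_results = [("id-17", 0.42), ("id-03", 0.91), ("id-88", 0.77), ("id-54", 0.12)]
to_review = review_queue(search_results, threshold=0.7)
```

With these made-up scores the reviewer sees two candidates instead of four, ordered from most to least likely.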

Authentication and ID

Face recognition is used for both authentication (checking a person’s identity) and identification (finding a person among a group of known people). Because face images can be collected either overtly or covertly, and since people are more willing than before to share their portrait, this technology offers a wide range of applications, from public security and civil identity to border control and more. It is used in day-to-day life by everybody, sometimes in unnoticeable ways. For example, it is used to index media archives, recognize VIP customers, restrict access to casinos, offer personalized pictures to leisure park visitors, help security guards limit access to a shop or bank branch to ‘safe’ visitors, display the most attractive advertisement depending on age and gender, and secure access to electronic devices.

Face recognition is a very challenging technology. Images vary a lot in scale and resolution, in sharpness and in lighting, not to mention artifacts such as compression, interlacing, overlays on identity documents or red eyes when a flash is used. Another element to take into account is the constant variation of the human face: it is a very mobile and deformable 3D object, which can change in seconds, and varies over time with age and physical condition. It even varies in color. So intrinsically, face images are extremely variable. This is why the deployment of a face recognition system must take both technical and human factors into consideration to be used in the most efficient way. The environment must be controlled and its quality checked to get the best results.

Quality check

Having a good-quality database is one of the keys to getting the best results. When the portrait acquisition is done in a controlled environment, the enrolment should follow best practices:

- Frontal pose
- Neutral expression
- Clear face to limit occlusion (glasses, headwear, etc.)

As assessed by NIST, improvement of image quality is the largest contributing factor to recognition accuracy. When possible, the system shall conform to the ISO/IEC 19794-5 standard, with two complementary approaches: by design, using proper devices and illumination, and by detection of non-conformant images at the collection stage.

Camera set-up

The results of identification in video streams depend heavily on how the cameras are set up in the environment. The stakeholders should define in advance how to install the cameras. They must be set up at choke points, where people move at a controlled pace, flow and direction. Cameras should also meet the following requirements:

- Control lighting,
- Be placed at a suitable height to correctly acquire the faces of people of different heights,
- Capture people’s attention to make them look at the camera (mirror, advertisement, weather forecast, news, etc.).

Checking searches

In investigation use cases, visually recognizing people can require some training. A number of methods have been developed to improve the visual recognition of persons. As an example, the Federal Bureau of Investigation (FBI) has proposed a facial comparison and identification training program to provide students with awareness and understanding of the facial comparison discipline. This training aims at facilitating the expanded use of face technologies, interpretation of the output of face recognition systems, and better integration of face biometrics into law enforcement and intelligence work.

Face recognition seems so simple and intuitive that expectations of this technology are sometimes unrealistic. For example, for the purpose of video screening, cameras must provide high-resolution pictures to be usable by a face recognition system. Every potential customer should conduct tests to assess the suitability of the technology for operational applications before proceeding with deployment. It is possible to test usage scenarios, check the results that may be obtained and measure the workload required.

Developments

Empowered by successful deployments, face recognition still has considerable headroom for improvement. For instance, these systems could add 3D sensors, recognition of moving faces, processing of images captured from above or from the side, models that account for ageing, and much more. In the near future, we expect to see strong growth in the use of this technology, since solutions will be better adapted and more efficient. But in any case, the performance of face recognition will depend on the way the system is used and how it has been implemented.

by Safran Morpho