2D face recognition is an extensively studied but still evolving research subject. Proposed strategies include statistical approaches, hidden Markov models, neural networks, and template-based and feature-based matching. Here we briefly present our implementation, which builds on past research and achieves state-of-the-art recognition performance on low-resolution input facial images (working face size 45×45 px).
Our approach can be divided into three independent phases: facial landmarks library construction (offline), facial descriptor building (once per novel image), and facial descriptor matching.
Facial landmarks library construction
A set of training images is annotated by hand. The coordinates of important facial landmarks (such as the corners of the lips, the nose tip, etc.) are stored in a database for further processing.
Using the landmark positions, the training images are geometrically transformed and aligned to a common frame.
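As an illustration of the alignment step, the following is a minimal sketch that fits a similarity transform (scale, rotation, shift) mapping hand-marked landmarks onto reference positions in the least-squares sense. The text above does not state which transform family is actually used, so the similarity transform and the names `fit_similarity` and `apply_similarity` are assumptions made for this example.

```python
def fit_similarity(src, dst):
    """Least-squares similarity transform z -> a*z + b taking src points to dst.

    Points are (x, y) tuples treated as complex numbers x + iy, so the
    complex factor `a` encodes scale and rotation and `b` the shift.
    """
    s = [complex(x, y) for x, y in src]
    d = [complex(x, y) for x, y in dst]
    s_mean = sum(s) / len(s)
    d_mean = sum(d) / len(d)
    # Closed-form least-squares solution on mean-centred points.
    num = sum((p - s_mean).conjugate() * (q - d_mean) for p, q in zip(s, d))
    den = sum(abs(p - s_mean) ** 2 for p in s)
    a = num / den
    b = d_mean - a * s_mean
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point through the fitted transform."""
    z = a * complex(*point) + b
    return z.real, z.imag


# Hypothetical landmark coordinates: image positions and reference positions.
marked = [(10, 20), (30, 20), (20, 35)]
reference = [(-39, 21), (-39, 61), (-69, 41)]
a, b = fit_similarity(marked, reference)
```

With two landmarks (for example the two eyes) the fit is exact; with more landmarks it averages out annotation noise.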
Illumination correction is applied to the transformed images to remove shadows and glare and to normalize overall exposure.
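The illumination correction method is not named here. As a stand-in, the sketch below uses plain histogram equalization, which normalizes exposure and contrast but does not by itself remove shadows or glare. The function name `equalize` and the 8-bit greyscale input format (rows of integers) are assumptions for the example.

```python
def equalize(image, levels=256):
    """Histogram-equalize a greyscale image given as rows of ints in [0, levels)."""
    # Count how often each grey level occurs.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Single grey level: nothing to stretch, return a copy.
        return [row[:] for row in image]

    # Stretch the cumulative distribution over the full output range.
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in image]


# Tiny low-contrast example patch (hypothetical pixel values).
patch = [[100, 100], [110, 120]]
corrected = equalize(patch)
```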
Gabor jets are then extracted from the normalized images at every landmark location. The extracted jets are stored in the facial landmarks library for later use when processing novel images.
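To make the notion of a Gabor jet concrete, here is a minimal sketch that computes the magnitudes of a small bank of Gabor wavelet responses at a single pixel. The kernel form, the envelope width `SIGMA`, the number of scales and orientations, and the 3-sigma window are common textbook choices assumed for illustration; the parameters used in our implementation are not given in this overview.

```python
import cmath
import math

SIGMA = 2 * math.pi  # assumed width of the Gaussian envelope


def gabor_response(image, cx, cy, k, phi, sigma=SIGMA):
    """Complex response of one Gabor kernel centred on pixel (cx, cy).

    `k` is the spatial frequency (radians per pixel), `phi` the orientation.
    Pixels falling outside the image are skipped.
    """
    kx, ky = k * math.cos(phi), k * math.sin(phi)
    radius = int(round(3 * sigma / k))   # window of about 3 standard deviations
    dc = math.exp(-sigma ** 2 / 2)       # term that removes the kernel's mean
    height, width = len(image), len(image[0])
    total = 0j
    for dy in range(-radius, radius + 1):
        y = cy + dy
        if not 0 <= y < height:
            continue
        for dx in range(-radius, radius + 1):
            x = cx + dx
            if not 0 <= x < width:
                continue
            envelope = (k * k / sigma ** 2) * math.exp(
                -(k * k) * (dx * dx + dy * dy) / (2 * sigma ** 2))
            wave = cmath.exp(1j * (kx * dx + ky * dy)) - dc
            total += image[y][x] * envelope * wave
    return total


def gabor_jet(image, cx, cy, n_scales=3, n_orient=8):
    """Jet = response magnitudes for every scale/orientation pair at one pixel."""
    jet = []
    for s in range(n_scales):
        k = math.pi * 2 ** (-(s + 2) / 2)   # frequency halves every two scales
        for o in range(n_orient):
            phi = o * math.pi / n_orient
            jet.append(abs(gabor_response(image, cx, cy, k, phi)))
    return jet
```

A jet computed this way is a plain list of non-negative numbers, ordered scale by scale, which is convenient to store in the library and to compare later.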
Building of facial descriptor
On receiving a novel facial image, we first locate the approximate eye positions using our hybrid method, which combines Viola-Jones detection with a Bayesian classifier. The eye coordinates serve two purposes: they are used to geometrically warp and align the input image, and they provide an initial estimate of the coordinates of the other facial landmarks.
Illumination correction and background clipping are then performed.
Starting from the approximate eye positions and using samples from the facial landmarks library, we iteratively find the precise locations of all facial landmarks on the novel face.
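The iterative search is not specified in detail here. One simple way to picture it is a greedy local search: starting from the initial estimate, move to the neighbouring pixel that scores best until no neighbour improves. The sketch below assumes a caller-supplied `score(x, y)` function (hypothetical; in practice it could measure how well the jet at that pixel matches the library samples for the landmark) and is not necessarily the procedure we use.

```python
def refine_landmark(score, start, max_steps=50):
    """Greedy hill climbing over the 8-neighbourhood of the current pixel.

    `score(x, y)` returns a number that is higher for better positions.
    Returns the final (x, y) position and its score.
    """
    x, y = start
    best = score(x, y)
    for _ in range(max_steps):
        candidates = [
            (score(x + dx, y + dy), x + dx, y + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        top, nx, ny = max(candidates)
        if top <= best:
            break  # no neighbour is better: local optimum reached
        best, x, y = top, nx, ny
    return (x, y), best


# Toy score with a single peak at (5, 7), standing in for a jet-match score.
def toy_score(x, y):
    return -((x - 5) ** 2 + (y - 7) ** 2)


position, value = refine_landmark(toy_score, start=(0, 0))
```

A search of this kind only finds a local optimum, which is why a reasonable initial estimate from the eye positions matters.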
The landmark positions found are then used to extract Gabor jets and construct an informative facial descriptor. The original input image is no longer required after this step.
Facial descriptors matching
The last and simplest step is matching two facial descriptors, which yields a similarity measure between two faces: a real number between 0 (nothing in common) and 1 (complete match). An experimentally determined threshold is used to decide whether the faces belong to the same person or to different persons.
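A minimal sketch of such a measure, assuming a descriptor is a list of jets and each jet is a list of non-negative magnitudes: the normalized dot product of two jets lies in [0, 1], and the descriptor similarity is the mean over corresponding landmarks. The formula, the function names, and the threshold value 0.9 are illustrative assumptions, not our tuned settings.

```python
import math


def jet_similarity(a, b):
    """Normalized dot product of two jets; in [0, 1] for non-negative entries."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0


def descriptor_similarity(d1, d2):
    """Mean jet similarity over corresponding landmarks of two descriptors."""
    scores = [jet_similarity(a, b) for a, b in zip(d1, d2)]
    return sum(scores) / len(scores)


def same_person(d1, d2, threshold=0.9):
    """Decision rule: accept the pair when similarity reaches the threshold."""
    return descriptor_similarity(d1, d2) >= threshold
```

Raising the threshold trades false positives for false negatives, which is why the operating point is chosen by experiment.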
Descriptor matching is roughly two orders of magnitude faster than descriptor building (see the timings below), which makes it possible to match a new face against a database of known persons in moderate time.
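Matching against a database can be sketched as a linear scan that keeps the best-scoring entry and rejects it if it falls below the decision threshold. The function `identify`, the dictionary-based gallery, and the caller-supplied `similarity` function are assumptions made for this example.

```python
def identify(probe, gallery, similarity, threshold):
    """Return (name, score) of the best match, or (None, score) if below threshold.

    `gallery` maps a person's name to a stored descriptor;
    `similarity(a, b)` returns a value in [0, 1].
    """
    best_name, best_score = None, 0.0
    for name, descriptor in gallery.items():
        score = similarity(probe, descriptor)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy data: numbers stand in for descriptors, closeness for similarity.
toy_gallery = {"alice": 0.2, "bob": 0.9}


def toy_similarity(a, b):
    return 1 - abs(a - b)


match, score = identify(0.85, toy_gallery, toy_similarity, threshold=0.8)
```

At about 2 ms per comparison, a scan of this kind over 1,000 known persons takes about 2 seconds on the hardware listed in the technical details.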
Some technical details
Working face size: 45×45 px
Descriptor building time: 300 ms on a P4 1300 MHz (single core)
Descriptor matching time: 2 ms on a P4 1300 MHz (single core)
Recognition rate: FERET fa/fb: 88%, Yale Faces: 86%, Faces in wild: 73%. Both false positives and false negatives are counted as errors.