Face recognition is an important area of research in cognitive science and machine learning. This is the first paper to use deep learning techniques to model human attention for face recognition. In our attention model, based on a bilinear deep belief network (BDBN), the discriminant information is maximized within a framework that simulates the human visual cortex and human perception.
Comparative experiments demonstrate that, in terms of recognition accuracy, our deep learning model outperforms both representative benchmark models and existing bio-inspired models. Furthermore, our model automatically abstracts and emphasizes the important facial features and patterns, which are consistent with the human attention map.
In the global fine-tuning stage, we refine the parameter space for better face recognition performance. This stage is consistent with the late peak related to the activation of “post-recognition”. After the deep learning model is constructed, the attention map is built from the parameter space of the first restricted Boltzmann machine (RBM). Figure 1 shows the architecture of our bilinear deep belief network.
The weights of the first layer of the BDBN are oriented and Gabor-like, and they resemble the receptive fields of V1 simple cells (Zhong et al., 2011). Therefore, the first RBM is used to construct the attention model, which is shown in Figure 2.
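The text does not spell out how the attention map is derived from the first RBM's parameters. As a minimal sketch of one plausible reading, and purely as an assumption, the map could score each input pixel by the total magnitude of the first-layer weights attached to it. The function name `attention_map_from_weights`, the weight layout (one row per pixel, one column per hidden unit), and the min-max normalization are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np

def attention_map_from_weights(W, side=32):
    """Hypothetical saliency proxy: per-pixel sum of absolute first-layer weights.

    W is assumed to have shape (side * side, n_hidden), i.e. one row per
    input pixel and one column per hidden unit of the first RBM.
    """
    W = np.asarray(W, dtype=np.float64)
    if W.ndim != 2 or W.shape[0] != side * side:
        raise ValueError("W must have shape (side * side, n_hidden)")
    # Total connection strength of each pixel across all hidden units.
    strength = np.abs(W).sum(axis=1)
    span = strength.max() - strength.min()
    if span == 0:
        # All pixels equally weighted: return a flat map.
        return np.zeros((side, side))
    # Min-max normalize to [0, 1] and reshape to the image grid.
    return ((strength - strength.min()) / span).reshape(side, side)
```

Under this assumption, pixels with strong connections to many hidden units (for example around the eyes or mouth, if the learned filters concentrate there) would appear bright in the resulting 32×32 map.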
FACE FEATURE POINTS EMPHASIS
Then, following the same procedure used for the face datasets, the original images are normalized (in scale and orientation) so that the two eyes are aligned at the same position. Finally, the facial areas are cropped and downsampled into the final images. The size of each final image in all of the experiments is 32×32 pixels, with 256 gray levels per pixel. Some sample images after preprocessing are shown in Figure 4.
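The final downsampling step can be sketched as follows. This is an illustrative sketch only: it assumes the face has already been eye-aligned and cropped to a grayscale array with values in 0 to 255, and it uses simple block averaging, which the text does not specify. The function name `downsample_to_32` and the trimming of leftover border pixels are hypothetical simplifications.

```python
import numpy as np

def downsample_to_32(face, out_size=32):
    """Block-average a cropped grayscale face to out_size x out_size, 8-bit."""
    face = np.asarray(face, dtype=np.float64)
    h, w = face.shape
    # Block height and width; leftover border pixels are trimmed (assumption).
    bh, bw = h // out_size, w // out_size
    if bh == 0 or bw == 0:
        raise ValueError("input must be at least out_size pixels on each side")
    face = face[: bh * out_size, : bw * out_size]
    # Average each bh x bw block into a single output pixel.
    blocks = face.reshape(out_size, bh, out_size, bw)
    small = blocks.mean(axis=(1, 3))
    # Quantize to 256 gray levels (0..255).
    return np.clip(np.rint(small), 0, 255).astype(np.uint8)
```

Flattening the result gives a 1024-dimensional vector per face, which matches the 32×32 input size stated above.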
CONCLUSION AND FUTURE WORK
In this paper, we attempt to construct an attention model for face recognition within a framework that simulates the human visual cortex and human perception. To evaluate the proposed face recognition models, we conduct experiments on two face image datasets, CMU PIE and BioID. The experimental results not only show the strong recognition ability of our deep model but also clearly demonstrate our aim of providing human-like face image analysis by referencing the human visual cortex and perception procedure. It is generally held that advances in cognitive science, especially neuroscience, will give computer scientists useful insights into how to construct computational models, and vice versa.
To a certain extent, our attempt shows that computational models are not limited to serving as optimal classifiers in classification and recognition tasks; they can also provide human-like responses by referencing the human visual system. In the future, we will continue in this direction and propose novel computational models that draw on more characteristics of the human visual system. Conversely, in cognitive science, we will explore whether the human visual system possesses a mechanism that is consistent with the computational model from a mathematical viewpoint.
Source: University of California
Authors: Sheng-hua Zhong | Yan Liu | Yao Zhang | Fu-lai Chung