
A critical examination of deficiencies in the convolutional neural network model for facial emotion recognition
1 East China University of Science and Technology
* Author to whom correspondence should be addressed.
Abstract
Facial emotion recognition systems often suffer from low accuracy in specific scenarios because they must operate across a wide range of environments and conditions. In this study, the Facial Expression Recognition 2013 (FER-2013) dataset from Kaggle serves as the basis for training the models, and the experimental outcomes are subsequently analyzed. The dataset comprises a training set and a testing set, each annotated with seven emotion labels ranging from "angry" to "surprise". The facial emotion classification models, which automatically recognize emotions from input images, consist of a MobileNet-based model and a self-built convolutional neural network. Both models achieve an accuracy of approximately 60% but show deficiencies in predicting the "neutral" label. In addition, techniques such as the confusion matrix and the saliency map enable a comparative evaluation of model performance across the emotion labels and an analysis of the corresponding dominant facial regions. Based on a comparison of representative cases, two potential factors contributing to these limitations are identified: a paucity of training data and the presence of ambiguous features. The findings are expected to inform future improvements to facial emotion recognition models and to enhance their applicability in diverse scenarios.
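The abstract only summarizes the pipeline; as a rough illustration of the setup it describes, the sketch below is a minimal, hypothetical implementation in TensorFlow/Keras and scikit-learn. It assumes the Kaggle FER-2013 folder layout (fer2013/train/<emotion> and fer2013/test/<emotion>), a 128x128 input size, and ten training epochs, none of which are specified here. It trains a small head on an ImageNet-pretrained MobileNet, prints a per-label confusion matrix, and computes a gradient-based saliency map of the kind used in the analysis.

```python
# Hypothetical sketch: MobileNet-based FER-2013 classifier with a confusion matrix
# and a simple saliency map. Paths, image size, and epochs are assumptions.
import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix

IMG_SIZE = (128, 128)  # FER-2013 images are 48x48 grayscale; upscaled for MobileNet
BATCH = 64
CLASSES = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

# Load the images; color_mode="rgb" replicates the grayscale channel three times.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "fer2013/train", image_size=IMG_SIZE, batch_size=BATCH,
    color_mode="rgb", label_mode="int", class_names=CLASSES)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "fer2013/test", image_size=IMG_SIZE, batch_size=BATCH,
    color_mode="rgb", label_mode="int", class_names=CLASSES, shuffle=False)

# MobileNet backbone pretrained on ImageNet, with a small classification head.
base = tf.keras.applications.MobileNet(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet", pooling="avg")
base.trainable = False
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNet expects [-1, 1]
    base,
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(len(CLASSES), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=test_ds, epochs=10)

# Confusion matrix: rows are true labels, columns are predictions, so per-label
# weaknesses (e.g. "neutral" being mistaken for other emotions) become visible.
y_true = np.concatenate([y.numpy() for _, y in test_ds])
y_pred = np.argmax(model.predict(test_ds), axis=1)
print(confusion_matrix(y_true, y_pred))

# Saliency map for one test image: the gradient magnitude of the top class score
# with respect to the input pixels highlights the dominant facial regions.
images, _ = next(iter(test_ds))
x = tf.Variable(images[:1])
with tf.GradientTape() as tape:
    probs = model(x, training=False)
    top_score = tf.reduce_max(probs, axis=1)
saliency = tf.reduce_max(tf.abs(tape.gradient(top_score, x)), axis=-1)[0]  # 128x128 map
```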
Keywords
facial emotion recognition, convolutional neural network, confusion matrix, saliency map
Cite this article
Tao, Y. (2023). A critical examination of deficiencies in the convolutional neural network model for facial emotion recognition. Applied and Computational Engineering, 22, 19-27.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
Disclaimer/Publisher's Note
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
About volume
Volume title: Proceedings of the 5th International Conference on Computing and Data Science
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish with this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see Open access policy for details).