Research Advances in Facial Expression Recognition Technology

Rui Jian 1*
  • 1 Cretin-Derham Hall High School, Saint Paul, United States of America    
  • *corresponding author ruijian34@gmail.com
Published on 26 November 2024 | https://doi.org/10.54254/2755-2721/80/2024CH0070
ACE Vol.80
ISSN (Print): 2755-2721
ISSN (Online): 2755-273X
ISBN (Print): 978-1-83558-561-0
ISBN (Online): 978-1-83558-562-7

Abstract

As a key area of artificial intelligence and computer vision, facial expression recognition technology has advanced rapidly in recent years. By analyzing even subtle facial movements, it can identify emotional states such as sadness, happiness, or anger, and it has broad application value in fields such as mental health assessment, security monitoring, and intelligent human-computer interaction. This paper provides a brief overview of the core technical approaches to facial expression recognition: geometric features, appearance features, and deep learning. It then compares experimental results reported for these methods across multiple datasets, highlighting the primary challenges in the task, including dataset bias, real-time requirements, varying illumination, and partial occlusion. Finally, the paper outlines expected development trends in data fusion and improved deep learning models, and notes that these emerging techniques are likely to dominate future applications.

Keywords:

Facial expression recognition, Deep learning, Geometric features, Emotion computing.


1. Introduction

In the past few years, facial expression recognition technology has attracted significant attention and application. It can identify human emotional states by analyzing subtle changes in facial expressions, drawing on multiple disciplines including psychology, computer vision, and artificial intelligence. Facial expressions have long been considered the most natural and universal way of expressing feelings. By correctly interpreting human expressions, machines can understand and respond appropriately, which in turn opens up various application domains, including intelligent human-computer interaction, health monitoring, and security [1].

In the field of autonomous driving, facial expression recognition technology helps systems detect driver fatigue and thus improves driving safety [2]. In the area of mental health, it supports psychological therapy by helping to identify trauma and other mental health problems [3]. Moreover, facial expression recognition in Internet of Things devices is being increasingly applied in education, virtual reality, and entertainment [4]. With the growing demand for intelligent systems, facial expression recognition has become a leading technology in affective computing.

Deep learning has progressed substantially in recent years, and facial expression recognition technology has benefited greatly from these advances.

Conventional methods based on geometric and appearance features achieve good results only under constrained conditions. Convolutional neural networks (CNNs), which automatically learn high-level features from raw images, have therefore been adopted to improve model accuracy and robustness [5].

2. Overview of Facial Expression Recognition Methods

Facial expression recognition technology has developed through several stages, based in turn on geometric features, appearance features, and deep learning. Each method has its own strengths and weaknesses, making it suitable for different scenarios [6].

2.1. Geometric Feature Methods

Geometric feature methods were the earliest techniques applied to facial expression recognition. They compute the distances and angles between key facial landmarks.

By detecting the positions of the eyes, mouth, and eyebrows, it is possible to analyze a person's facial expression state. These methods often rely on classical machine learning algorithms such as support vector machines (SVMs) or random forests [7].
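As a concrete illustration, the following is a minimal sketch of such a landmark-distance pipeline, assuming landmark coordinates are obtained from some face landmark detector (e.g. dlib or MediaPipe); the toy data and variable names are hypothetical, not the setup used in the cited work.

```python
# Minimal sketch: pairwise landmark distances as geometric features, classified with an SVM.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def geometric_features(landmarks: np.ndarray) -> np.ndarray:
    """Turn an (N, 2) array of facial landmark coordinates into pairwise-distance features."""
    diffs = landmarks[:, None, :] - landmarks[None, :, :]   # (N, N, 2)
    dists = np.linalg.norm(diffs, axis=-1)                  # (N, N) distance matrix
    iu = np.triu_indices(len(landmarks), k=1)
    return dists[iu]                                         # upper-triangle distances only

# Toy data: 20 hypothetical faces with 68 landmarks each, and random emotion labels.
rng = np.random.default_rng(0)
landmark_sets = rng.random((20, 68, 2))
emotion_labels = rng.choice(["happy", "sad", "angry"], size=20)

X = np.array([geometric_features(lm) for lm in landmark_sets])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, emotion_labels)
```

In practice the angles between selected landmark pairs, or normalized distances relative to face size, would typically be added to make the features more robust.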

2.2. Appearance Feature Methods

Appearance feature methods extract facial information by analyzing textures such as wrinkles and local intensity patterns. Approaches such as local grayscale distribution variation (LGDV) and texture variation within small image patches are used. Traditional descriptors such as Local Binary Patterns (LBP) and the Scale-Invariant Feature Transform (SIFT) are often employed for feature extraction. Zhao and Pietikäinen's research confirmed the robustness of LBP to lighting changes, but its adaptability to expression variations is limited [8].
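For illustration, the following is a minimal LBP feature-extraction sketch using scikit-image's local_binary_pattern; the image size and LBP parameters are assumptions, not the exact configuration used in [8].

```python
# Minimal sketch: uniform LBP codes pooled into a normalized histogram over a face image.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_face: np.ndarray, points: int = 8, radius: float = 1.0) -> np.ndarray:
    """Compute uniform LBP codes and return a normalized histogram as the feature vector."""
    codes = local_binary_pattern(gray_face, points, radius, method="uniform")
    n_bins = points + 2                                   # uniform patterns + one non-uniform bin
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

# Toy 48x48 grayscale face; in practice this would be a cropped, aligned face image.
face = np.random.default_rng(0).integers(0, 256, size=(48, 48)).astype(np.uint8)
features = lbp_histogram(face)   # this vector can be fed to an SVM or other classifier
```

A common refinement is to split the face into a grid of patches, compute one histogram per patch, and concatenate them, which preserves some spatial information.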

2.3. Deep Learning Methods

In recent years, deep learning, particularly convolutional neural networks (CNNs), has led to breakthroughs in facial expression recognition. CNNs automatically learn high-level image features through multi-layer convolution operations, and they handle large-scale and diverse image data. This has made CNNs extremely effective in tasks involving complex facial expressions with varying poses, lighting conditions, or occlusion. CNNs have contributed significantly to progress in the field [9]. Deep learning has also benefited from transfer learning, where models achieve good performance on small datasets by learning from larger datasets, thus overcoming data scarcity problems [10]. Attention mechanisms have also been introduced to enhance CNN performance by focusing on important regions of the face [11].
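As an illustration of the kind of architecture involved, the following is a minimal PyTorch sketch of a small CNN classifier for 48x48 grayscale face images with seven expression classes; it is an assumed toy architecture, not one of the networks evaluated later in this paper.

```python
# Minimal sketch: a small CNN that maps 48x48 grayscale faces to 7 expression logits.
import torch
import torch.nn as nn

class SmallFERNet(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48 -> 24
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24 -> 12
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 12 -> 6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 6 * 6, 256), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = SmallFERNet()
logits = model(torch.randn(8, 1, 48, 48))   # batch of 8 dummy faces
print(logits.shape)                         # torch.Size([8, 7])
```

Transfer learning would typically replace this hand-built backbone with a network pretrained on a large image dataset and fine-tune only the final layers, which is how good performance is obtained on small expression datasets.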

3. Experiments and Analysis

To compare geometric features, appearance features, and deep learning methods, this paper conducts experiments and provides analysis. Several public datasets were used, and the paper employed common evaluation metrics to assess the performance of these methods.

3.1. Datasets

The experiments used the following three public facial expression datasets:

1. FER2013: This dataset contains a wide variety of facial images covering seven basic emotion categories and includes variations in lighting and pose [12]; a minimal loading sketch is shown after this list.

2. CK+: The Extended Cohn-Kanade (CK+) dataset contains image sequences of facial expressions at different intensity levels [13].

3. AffectNet: This large dataset captures facial expressions in natural environments, with complex emotional circumstances [14].
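As an illustration of how such data is typically handled, the following is a minimal sketch of loading FER2013 from the commonly distributed fer2013.csv file; the file path and column names ("emotion", "pixels", "Usage") are assumptions based on the public release, not details given in this paper.

```python
# Minimal sketch: read fer2013.csv and reshape the pixel strings into 48x48 images.
import numpy as np
import pandas as pd

df = pd.read_csv("fer2013.csv")                            # path is an assumption
pixel_rows = [np.array(px.split(), dtype=float) for px in df["pixels"]]
images = np.stack(pixel_rows).astype(np.uint8).reshape(-1, 48, 48)   # grayscale faces
labels = df["emotion"].to_numpy()                          # integer labels 0-6

train_mask = df["Usage"] == "Training"                     # official split column
print(images[train_mask].shape, labels[train_mask].shape)
```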

3.2. Evaluation Metrics

The paper used the following metrics to compare the performance of the methods:

Accuracy: The proportion of correctly identified expressions.

Recall: The proportion of samples of a given category that are correctly identified.

F1-score: The harmonic mean of precision and recall, useful for assessing performance on imbalanced datasets.
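For reference, these metrics can be computed with scikit-learn as in the following minimal sketch; the toy labels and the macro-averaging choice are assumptions, since the paper does not state which averaging scheme was used.

```python
# Minimal sketch: accuracy, macro-averaged recall, and macro-averaged F1 on toy labels.
from sklearn.metrics import accuracy_score, f1_score, recall_score

y_true = ["happy", "sad", "angry", "happy", "sad", "happy"]
y_pred = ["happy", "sad", "happy", "happy", "angry", "happy"]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Recall (macro):", recall_score(y_true, y_pred, average="macro"))
print("F1 (macro):", f1_score(y_true, y_pred, average="macro"))
```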

3.3. Experimental Results

Table 1 shows the experimental results of the geometric feature, appearance feature, and deep learning methods:

Table 1. Experimental results of the three facial expression recognition methods on the public datasets.

Method                                    Dataset      Accuracy    Recall
LBP-based geometric feature method        FER2013      82%         78%
SIFT-based appearance feature method      CK+          85%         81%
VGGNet deep learning method               AffectNet    92%         88%

3.4. Experimental Results Analysis

On the FER2013 dataset, the LBP-based geometric feature method achieved an accuracy of 82% and a recall of 78% [15]. These methods perform well under minimal lighting changes and simple backgrounds, but their limitations become evident with more complex facial expressions, as reflected in the lower recall on the diverse FER2013 dataset. The SIFT-based appearance feature method achieved better results, with an accuracy of 85% and a recall of 81% on the CK+ dataset, indicating its ability to handle pose and lighting variations [16]. However, appearance-based methods are computationally expensive, which limits their real-time application. Deep learning methods outperformed the traditional methods, with VGGNet achieving the highest accuracy (92%) and recall (88%) on the AffectNet dataset [17]. The complexity of expressions and natural environments in AffectNet was handled effectively by CNNs, which learn both low- and high-level features automatically.

4. Conclusion

This article reviews different methods of facial expression recognition technology and compares experimental results. Deep learning methods, particularly CNN-based approaches, outperform traditional geometric and appearance-based methods in complex and diverse environments. While traditional methods still have specific use cases, especially in low-resource settings, deep learning is the most promising direction for advancing facial expression recognition in real-world applications.


References

[1]. Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor, J. G. (2001). Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine, 18(1), 32-80.

[2]. McDuff, D., El Kaliouby, R., Kodra, E., LaFrance, M., & Picard, R. W. (2016). AFFDEX SDK: A cross-platform real-time multi-face expression recognition toolkit. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 3723-3730). ACM.

[3]. Girard, J. M., Cohn, J. F., Mahoor, M. H., Mavadati, S. M., & Rosenwald, D. P. (2014). Detecting depression severity from vocal prosody. IEEE Transactions on Affective Computing, 5(4), 384-394.

[4]. Zhao, G., & Pietikäinen, M. (2007). Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 915-928.

[5]. Ekman, P., & Friesen, W. V. (1978). Facial action coding system (FACS): Manual. Consulting Psychologists Press.

[6]. Tang, Y., Mu, J., Liu, W., & Luo, Z. (2020). Deep learning for predicting emotions from text. Journal of Computational Science, 45, 101164.

[7]. Soleymani, M., Chanel, G., Kierkels, J. J. M., & Pun, T. (2012). A multimodal database for affect recognition and implicit tagging. IEEE Transactions on Affective Computing, 3(1), 42-55.

[8]. Fasel, B., & Luettin, J. (2003). Automatic facial expression analysis: A survey. Pattern Recognition, 36(1), 259-275.

[9]. Ahonen, T., Hadid, A., & Pietikäinen, M. (2006). Face description with local binary patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(12), 2037-2041.

[10]. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91-110.

[11]. Wang, Z., Wu, Z., Peng, X., & Wang, Y. (2018). A comprehensive survey on face image analysis: From traditional methods to deep learning. arXiv preprint arXiv:1804.08399.

[12]. Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2009). A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1), 39-58.

[13]. Liu, P., Han, S., Meng, Z., & Tong, Y. (2013). Emotion recognition using hidden Markov models from facial expressions. Journal of Image and Vision Computing, 31(10), 848-858.

[14]. Ben-David, S., Blitzer, J., Crammer, K., & Pereira, F. (2014). Understanding machine learning: From theory to algorithms. Cambridge University Press.

[15]. Calvo, R. A., & D’Mello, S. (2010). Affect detection: An interdisciplinary review of models, methods, and their applications. IEEE Transactions on Affective Computing, 1(1), 18-37.

[16]. Li, S. Z., & Jain, A. K. (Eds.). (2011). Handbook of face recognition. Springer.

[17]. Martinez, B., Valstar, M., & Pantic, M. (2017). Learning deep features for facial expression recognition with confidence estimation. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017) (pp. 298-305). IEEE.


Cite this article

Jian,R. (2024). Research Advances in Facial Expression Recognition Technology. Applied and Computational Engineering,80,115-118.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.



© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.