Research on Artificial Intelligence and Face Recognition

Xiyun Yao

doi:10.54254/2755-2721/2024.17912

1. Introduction

Since the Dartmouth Conference in 1956, the field of artificial intelligence has developed rapidly, particularly in computer vision, where remarkable achievements have been attained. Face recognition technology, as an important branch of computer vision, plays an increasingly critical role in daily life. From facial recognition payments to facial recognition access control gates, the technology is becoming more widespread, and this spread imposes higher requirements on its accuracy, sensitivity, and environmental adaptability. This paper introduces the research background, significance, and current status of face recognition. It also summarizes and analyzes the mainstream methods in the field, including geometric features, PCA, elastic graph matching, LHD, and SVM. By comparing these methods, the paper reveals their respective advantages and limitations. Finally, the paper looks ahead to future research directions of face recognition technology, aiming to provide a reference for researchers and engineers in related fields. The significance of this paper lies not only in providing ideas for technical innovation but also in examining the far-reaching impact of face recognition on social security, personal privacy protection, and intelligent life experience.

2. An overview of face recognition

2.1. Background of Face Recognition

Facial recognition technology, a significant research area within computer vision and pattern recognition, identifies or verifies individuals by analyzing their facial features. Recent progress in image processing and deep learning has greatly enhanced the accuracy and applicability of facial recognition systems. The fundamental workflow includes image acquisition, face detection, image processing, feature extraction, face matching, and result output. First, the computer captures high-quality facial images via specialized imaging devices and locates the face within these images for further analysis. This paper focuses on the steps that follow successful image acquisition. Appropriate image processing techniques are chosen according to the selected facial recognition method, and relevant features are then extracted from the processed images. The extracted features are compared with those stored in a database to obtain an identification or verification result, which is finally presented as output. Although the application scope of facial recognition technology is continuously expanding, its widespread use has also triggered critical discussions on privacy and ethical issues. Striking a balance between technological advancement and individual privacy rights has become an urgent challenge for society.
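The matching step at the end of this workflow can be illustrated with a minimal sketch. It assumes that each face has already been reduced to a numeric feature vector, that similarity is measured by Euclidean distance, and that a fixed rejection threshold is used; the names `identify`, `gallery`, and `threshold`, and the sample values, are hypothetical and do not come from any of the cited methods.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(probe, gallery, threshold):
    """Return (label, distance) for the closest enrolled vector.

    The label is None when even the closest vector is farther away than
    the threshold, i.e. the probe is treated as an unknown person.
    """
    best_label, best_dist = None, float("inf")
    for label, vector in gallery.items():
        dist = euclidean(probe, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist > threshold:
        return None, best_dist
    return best_label, best_dist

# Made-up enrolled feature vectors (assumption, for illustration only).
gallery = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}

print(identify([0.12, 0.88, 0.31], gallery, threshold=0.2))
print(identify([5.0, 5.0, 5.0], gallery, threshold=0.2))
```

The threshold controls the trade-off between falsely accepting strangers and falsely rejecting enrolled users, so in practice it would have to be tuned on validation data.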

2.2. Significance of face recognition

Facial recognition technology has important social and economic significance.

The application of facial recognition in public security can effectively enhance the security of surveillance systems, allowing for identity verification, tracking fugitives, patrolling and inspection, and risk warning[1].

In financial payment and mobile phone unlocking, facial recognition technology provides a convenient means of identity verification that replaces traditional passwords and fingerprint identification, improving both security and user experience. According to official data from Alipay, the accuracy rate of the facial recognition payment function in the international open test was as high as 99.5%[2]. However, certain security risks remain, and the safety risks of facial recognition are progressive, concealed, coupled, and uncertain in character[3]. How to reduce these risks is one of the important current research directions of facial recognition technology.

All in all, face recognition technology offers significant benefits but also poses important ethical challenges that need to be carefully managed.

2.3. Development status of face recognition

Currently, facial recognition has many practical applications in various fields, including but not limited to security surveillance, financial services, and attendance systems. There are also many software applications for facial recognition on the market, such as FaceFirst, SenseTime, CompreFace, Trueface, and Clarifai. However, the methods for facial recognition are relatively limited. The current mainstream methods in face recognition include various technologies such as geometric features, PCA, elastic graph matching, LHD, and SVM. The main development direction of face recognition in the future may include face databases for complex moving objects, face databases for similar faces, and face databases for fuzzy features[4]. These directions place higher demands on the speed and accuracy of computer face recognition in extreme conditions such as high density, environmental interference, and similar features.

3. Mainstream methods of face recognition

3.1. Geometric feature extraction

The extraction of geometric features begins with morphological filtering to preprocess the image and eliminate noise. The image is then binarized to facilitate the extraction of feature points, which significantly mitigates the effects of varying illumination. Next, eye positions within the facial image are identified using a cross-correlation transformation. This algorithm assesses potential circular structures in the image and selects a few of the brightest spots as candidate eye locations[5]. Because the vertical coordinates of both eyes are nearly identical in frontal face images, this property is used to further validate their positions. Then, based on the proportional relationships among facial features, the regions corresponding to the facial organs are defined; the coefficients and displacements involved may need to be adjusted for specific scenarios. Feature points are then identified through a projection map derived from the region points. If the features are unclear or indistinct, the position and size of the organ regions are modified until an optimal point projection map is achieved; this constitutes a parameter tuning phase. Once all steps have been completed, the feature points form a feature vector that can be compared against those stored in a sample database to determine the recognition outcome.

This approach demonstrates remarkable effectiveness even amidst significant interference or distortions resulting from film degradation in test images. Additionally, due to its use of feature vectors that maintain invariance under size changes, rotation, and translation shifts, it exhibits strong adaptability to variations in facial images. Nonetheless, one notable limitation exists: images must be captured from a frontal viewpoint for accurate results.
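The invariance property mentioned above can be checked with a small sketch. The source does not specify how the feature vector is built, so the sketch assumes one simple choice: all pairwise distances between feature points, divided by the distance between the two eyes. The landmark coordinates and the function names `ratio_features` and `similarity_transform` are hypothetical.

```python
import math
from itertools import combinations

def ratio_features(points):
    """Pairwise landmark distances divided by the inter-ocular distance.

    points[0] and points[1] are assumed to be the two eye centres.
    """
    base = math.dist(points[0], points[1])
    return [math.dist(p, q) / base for p, q in combinations(points, 2)]

def similarity_transform(points, scale, angle, dx, dy):
    """Scale, rotate (angle in radians), and translate a list of 2D points."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + dx, scale * (s * x + c * y) + dy)
            for x, y in points]

# Made-up landmarks: left eye, right eye, nose tip, mouth centre.
landmarks = [(30.0, 40.0), (70.0, 40.0), (50.0, 60.0), (50.0, 80.0)]
moved = similarity_transform(landmarks, scale=1.7, angle=0.4, dx=12.0, dy=-5.0)

original_features = ratio_features(landmarks)
moved_features = ratio_features(moved)

# The largest difference is at floating-point rounding level.
print(max(abs(a - b) for a, b in zip(original_features, moved_features)))
```

Distance ratios are unchanged by in-plane scaling, rotation, and translation, but not by out-of-plane head rotation, which is consistent with the frontal-view limitation noted above.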

3.2. PCA based on wavelet transform and artificial neural network

The PCA-based face recognition method using wavelet transform and an artificial neural network is sensitive to lighting and insensitive to position and pose, so only lighting compensation is performed as preprocessing. The grayscale histogram is used to quantify the gray values of the image, and the mean and variance of the image are defined. After lighting compensation, noise is removed. However, lighting compensation also partially removes differences in skin tone along with differences in lighting, which is a negative side effect. Next, the image is decomposed into subimages of different frequencies, which helps with feature extraction. The features include but are not limited to skin texture and facial shape. PCA is then used to reduce dimensionality and select features. It transforms the original data into a new coordinate system through a linear transformation, so that the first principal component direction has the maximum variance, the second principal component direction has the second-highest variance, and so on[6]. These features differentiate faces as much as possible while reducing unnecessary computation. The extracted features can further be used to construct feature faces (eigenfaces) for subsequent face recognition and verification. The artificial neural network is then trained, evaluated, and optimized on the data set repeatedly until satisfactory recognition results are achieved. Finally, an unidentified face image is passed through the same preprocessing and feature extraction steps and input into the trained neural network to obtain the output result.
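The decomposition into subimages of different frequencies can be sketched as a single level of a 2D wavelet transform. The source does not name a wavelet family, so the Haar wavelet with an averaging normalization is assumed here; the function name `haar_level` and the toy image are hypothetical.

```python
def haar_level(image):
    """One level of a 2D Haar decomposition with averaging normalization.

    `image` is a list of rows of gray values with even height and width.
    Returns four half-size subimages: (approx, horiz, vert, diag).
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            tl, tr = image[r][c], image[r][c + 1]          # top-left, top-right
            bl, br = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((tl + tr + bl + br) / 4)  # low-frequency average of the 2x2 block
            h_row.append((tl - tr + bl - br) / 4)  # left column minus right column
            v_row.append((tl + tr - bl - br) / 4)  # top row minus bottom row
            d_row.append((tl - tr - bl + br) / 4)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag

# Toy 4x4 image made of four constant 2x2 blocks.
image = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
approx, horiz, vert, diag = haar_level(image)
print(approx)  # [[10.0, 20.0], [30.0, 40.0]]
```

The low-frequency subimage has a quarter of the original pixels, which is the source of the dimensionality reduction that shortens neural network training; applying the function again to `approx` would give a further level.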

This method can tolerate a certain degree of position and pose change and can even handle partially damaged sample images. The wavelet transform greatly reduces the dimensionality of the image, which significantly reduces the training time of the neural network. However, because the feature space is extracted from the face vectors of the entire face database, expanding the database is difficult. When new face images are added, the feature space must be recomputed, the feature vectors must be extracted again, and the neural network must be retrained.
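The dependence of the feature space on the whole database can be seen from the standard PCA formulation, given here as a sketch rather than as the exact procedure of the cited work. The symbols are assumptions introduced for illustration: $x_i$ is the $i$-th of $n$ training images flattened into a vector of dimension $d$, $\mu$ is the mean face, $C$ is the covariance matrix, and $U_m$ holds the $m$ leading eigenvectors.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
C = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top}

C\,u_k = \lambda_k u_k ,
\qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0

y = U_m^{\top}(x - \mu) ,
\qquad
U_m = [\,u_1, \dots, u_m\,]
```

Because both $\mu$ and the eigenvectors $u_k$ are computed from all $n$ images, adding images changes them, so every stored feature vector $y$ has to be projected again and the network retrained on the new vectors.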

3.3. Elastic graph matching

The elastic graph matching method first requires the identification of key facial feature points, which can be selected manually or detected automatically. Typical feature points include the inner and outer corners of the eyes, the tip of the nose, and the corners of the mouth. A graph is then constructed from these feature points, with each node representing a facial feature point and carrying a local feature descriptor (such as Gabor wavelet coefficients)[7]. The local features are extracted using techniques such as Gabor filtering and together form a comprehensive feature vector. Next, the graph of the target face is matched against the model graphs stored in a database. Optimization algorithms such as dynamic programming and simulated annealing are employed to minimize the discrepancies between feature points and feature vectors in order to identify an optimal match. Finally, based on the result of the graph matching, both the identity of the recognized face and the associated confidence level can be determined.

The advantages of elastic graph matching include robust performance against local deformations and rotations and effective accommodation of variations in facial expression and posture. However, the method also has drawbacks: it entails high computational complexity, requires substantial manual annotation effort, and has difficulty handling global deformations.
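The quantity that such a matching optimizes can be sketched as a score that rewards similar node descriptors and penalizes distorted edges. The specific form below (mean normalized dot product of descriptors minus a weighted mean relative edge distortion) is an assumption for illustration, not necessarily the cost used in the cited work, and the weight `lam`, the descriptor values, and the node coordinates are made up. The sketch only scores a given node correspondence; the search over node positions is not shown.

```python
import math

def jet_similarity(j1, j2):
    """Normalized dot product of two descriptors (e.g. Gabor magnitudes)."""
    dot = sum(a * b for a, b in zip(j1, j2))
    norm = math.sqrt(sum(a * a for a in j1) * sum(b * b for b in j2))
    return dot / norm if norm else 0.0

def graph_similarity(model, probe, edges, lam=0.5):
    """Mean descriptor similarity minus lam times mean relative edge distortion."""
    n = len(model["jets"])
    jet_term = sum(jet_similarity(model["jets"][i], probe["jets"][i])
                   for i in range(n)) / n
    distortion = 0.0
    for i, j in edges:
        # Edge vectors between the same pair of nodes in both graphs.
        mx = model["nodes"][j][0] - model["nodes"][i][0]
        my = model["nodes"][j][1] - model["nodes"][i][1]
        px = probe["nodes"][j][0] - probe["nodes"][i][0]
        py = probe["nodes"][j][1] - probe["nodes"][i][1]
        distortion += ((px - mx) ** 2 + (py - my) ** 2) / (mx ** 2 + my ** 2)
    return jet_term - lam * distortion / len(edges)

# Hypothetical three-node graphs: two eye corners and the nose tip.
model = {"nodes": [(30, 40), (70, 40), (50, 60)],
         "jets": [[1.0, 0.2, 0.1], [0.9, 0.3, 0.1], [0.2, 1.0, 0.5]]}
deformed = {"nodes": [(32, 41), (69, 43), (50, 66)],
            "jets": [[0.9, 0.3, 0.1], [0.8, 0.3, 0.2], [0.3, 0.9, 0.5]]}
edges = [(0, 1), (0, 2), (1, 2)]

print(graph_similarity(model, model, edges))     # identical graphs score highest
print(graph_similarity(model, deformed, edges))  # deformation lowers the score
```

Because the penalty is computed on edge vectors, shifting the whole graph leaves the score unchanged, while local stretching is penalized in proportion to `lam`; this is what makes the graph elastic rather than rigid.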

3.4. Line segment Hausdorff distance

The improved line segment Hausdorff distance (LHD) method begins with preprocessing steps such as grayscale conversion, histogram equalization, and normalization, which make the processed images more resistant to variations in lighting, pose, and other interference. A face detection algorithm is then employed to locate the facial region within the image; options for this step include Haar feature cascade classifiers, HOG+SVM methods, and MTCNN[8]. Following detection, an active shape model (ASM) is used to align the detected face to a standardized pose and size. Line segment edge maps are then extracted, using the ASM alignment to obtain more accurate results. Next, different weights are assigned to the line segments according to the importance of the facial region in which they lie. The improved LHD is then computed between the two facial edge maps, and this value serves as the measure for comparing faces. By comparing the LHD values obtained against known faces, the identity of an unknown individual can be determined. Finally, the recognition result, including the identity, face category, and confidence level, is reported.

In contrast to the traditional method, the improved LHD approach uses ASM-derived alignment when extracting line edge maps and assigns weights according to regional importance within the face. These enhancements significantly improve both accuracy and robustness, but they add computational complexity that may reduce efficiency. Furthermore, because of its limited sensitivity to expressions and poses, the method may not be suitable for detailed recognition involving such attributes.
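The underlying distance can be illustrated with the classical Hausdorff distance between two point sets. This is a deliberately simpler stand-in: the LHD described above operates on line segments and, in its improved form, on weighted segments, neither of which is reproduced here. The function names and sample points are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two toy edge maps represented as point sets; the second has one extra point.
edge_a = [(0.0, 0.0), (1.0, 0.0)]
edge_b = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]

print(directed_hausdorff(edge_a, edge_b))  # 0.0: every point of edge_a is in edge_b
print(hausdorff(edge_a, edge_b))           # 3.0: the extra point is 3 away from edge_a
```

The example shows why the plain distance is sensitive to a single stray point; weighting segments by facial region, as in the improved method, is one way to control how much each part of the edge map contributes.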

3.5. Support Vector Machines

The support vector machine (SVM) method first examines the samples and removes outliers, so that only normal samples are used for model training. This process embodies the concept of outlier detection. A one-class SVM then attempts to estimate the boundary that delineates the positive sample data. In contrast to approaches that directly estimate the sample distribution, this method requires fewer training samples and does not need to know the actual distribution of the data. The decision boundary is established from the identified support vectors; in a one-class SVM, the points nearest to the boundary are designated as support vectors[9]. A kernel function is then selected to map the data into a high-dimensional space, which simplifies the identification of a boundary that separates positive samples from outliers in that space. The next step involves solving a multi-objective optimization problem that minimizes the distances from the samples to the decision boundary while maximizing the distance from the boundary to its nearest sample point. Typically, this problem is reformulated as a dual problem through Lagrange multipliers and solved in that form.

The SVM approach demonstrates robust capabilities in processing high-dimensional datasets and is particularly well suited to applications involving complex data such as facial images. It exhibits strong generalization ability, mitigates the risk of overfitting, and remains resilient to variations in lighting conditions, poses, and expressions. Nevertheless, it presents challenges in parameter selection and in handling nonlinear problems: parameter tuning can be intricate, and achieving satisfactory results in more complicated nonlinear scenarios remains difficult.
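The text does not write out the optimization problem. One widely used form of the one-class SVM, the $\nu$-formulation, is sketched below under the assumption that this is the variant intended. The symbols are introduced here for illustration: $\Phi$ is the feature map induced by the kernel $k$, $\rho$ is the offset of the boundary, $\xi_i$ are slack variables, $\nu \in (0, 1]$ bounds the fraction of training points treated as outliers, and $\alpha_i$ are the Lagrange multipliers.

```latex
% Primal problem
\min_{w,\;\xi,\;\rho}\quad \frac{1}{2}\lVert w\rVert^{2}
  + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho
\qquad \text{s.t.}\quad
  \langle w, \Phi(x_i)\rangle \ge \rho - \xi_i,\quad \xi_i \ge 0

% Dual problem obtained through Lagrange multipliers
\min_{\alpha}\quad \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
  \alpha_i \alpha_j\, k(x_i, x_j)
\qquad \text{s.t.}\quad
  0 \le \alpha_i \le \frac{1}{\nu n},\quad \sum_{i=1}^{n}\alpha_i = 1

% Decision function
f(x) = \operatorname{sgn}\!\Big(\sum_{i=1}^{n}\alpha_i\, k(x_i, x) - \rho\Big)
```

Only training points with $\alpha_i > 0$ contribute to $f(x)$; these are the support vectors. The dual depends on the data only through $k(x_i, x_j)$, which is why the high-dimensional mapping never has to be computed explicitly.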

4. Conclusion

The current mainstream face recognition methods encompass geometric features, PCA combined with wavelet transform and an artificial neural network, elastic graph matching, LHD, and SVM.

The method relying on geometric features extracts features based on the positions and geometric relationships of feature points such as the eyes, nose, and mouth in the face image. It is sensitive to pose and expression changes but can be highly effective when these variables are controlled. It depends on the keypoint information of the face, is robust to local changes, and is suitable for initial recognition and for situations with limited resources.

The method combining wavelet transform, PCA, and an artificial neural network extracts the main features of the face image through principal component analysis, constructs the feature face space, and trains a neural network to make the recognition decision. It is sensitive to lighting changes but can tolerate certain pose and expression changes. By reducing dimensionality, it improves computational efficiency and is suitable for large-scale recognition systems.

The method based on elastic graph matching aligns and compares face images through elastic matching. It adapts to local deformations and distortions and has some tolerance for pose changes, but it is computationally expensive. Because it precisely matches contours and local features, it is suitable for recognition under varying facial expressions and poses and for recognizing non-rigid objects.

The LHD method dispenses with common face feature extraction and instead uses a shape-based distance measure, specifically the Hausdorff distance between curves or contours. It is robust to noise and to changes in image resolution.

The support vector machine is a supervised learning algorithm used for classification and regression analysis. In face recognition, SVM can be used to find the optimal decision boundary in the feature space. This method is suitable for medium-sized recognition tasks and has good classification performance.

The general flow of the different methods is similar, but each optimizes the basic flow in its own way. In actual applications, the most appropriate method should be selected based on specific needs and conditions.

5. Prospect of the future

The potential of facial recognition technology to enhance the quality of human life is substantial, and significant opportunities for further progress remain. Firstly, algorithms need to be further refined to handle a variety of complex conditions, including motion, lighting variations, diverse viewing angles, and occlusion. Through in-depth research on deep learning and computer vision methods, both the accuracy and the speed of recognition can be progressively improved. Secondly, application domains should be expanded: customized facial recognition techniques need to be developed for sectors such as finance, security, and social media to achieve optimal outcomes. Lastly, there is an urgent need for robust privacy protection measures. By integrating decentralized technologies with advanced encryption strategies, the security of user data can be enhanced to prevent misuse, so that the evolution of facial recognition technology also safeguards fundamental user rights[10].

References

[1]. Fu Feng, Wang Hanwei. Practical Inspection and Theoretical Response to Face Recognition as Criminal Evidence -- Taking Criminal Judgment Documents from 2013 to 2022 as Research Samples [J]. Evidence Science,2024,32(04):451-467.

[2]. Huang Chunmei. Third Party payment System based on Face recognition [D]. Yangzhou University,2019.

[3]. Ma Shishun. Civil face recognition security risk prevention and control research [D]. People's Public Security University of China,2023. DOI: 10.27634/d.cnki.gzrgu.2023.000016.

[4]. LIANG Yuan-Kai. Characteristics and Direction of face database development [J]. Computer and Networks,2021,47(04):64-67.

[5]. Zhang Jun, He Xin, Li Jiegu. Face recognition method based on facial geometric feature point extraction [J]. Infrared and Laser Engineering,1999,(04):40-43.

[6]. HU L. Research on PCA face recognition method based on wavelet transform and artificial neural network [D]. Soochow University,2002.

[7]. ZHANG Hai-long. Research on elastic bunch graph matching algorithm for face recognition [D]. Northeastern University,2008.

[8]. DU Cheng, SU Guangda, Lin Xinggang, et al. Face recognition method based on improved line segment Hausdorff distance [J]. Photoelectron · Laser,2005,(01):89-93.

[9]. WANG Yixin. Research on face detection and recognition system based on Support Vector Machine technology [D]. National University of Defense Technology,2003.

[10]. Zhao. Theory of legal regulation of the face recognition [D]. Shanxi University of Finance and Economics,2023. DOI: 10.27283/d.cnki.gsxcc.2023.000601.

Cite this article

Yao,X. (2024). Research on Artificial Intelligence and Face Recognition. Applied and Computational Engineering,112,79-84.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 5th International Conference on Signal Processing and Machine Learning

ISBN: 978-1-83558-747-8 (Print) / 978-1-83558-748-5 (Online)

Editor: Stavros Shiaeles

Conference website: https://2025.confspml.org/

Conference date: 12 January 2025

Series: Applied and Computational Engineering

Volume number: Vol.112

ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.