
License plate Chinese character recognition based on ViT model
1 South China University of Technology
* Author to whom correspondence should be addressed.
Abstract
Transformers are now widely used in computer vision. Much of the related literature shows that the model's advantages, such as a larger receptive field and global context, are increasingly evident in image processing. However, despite the transformer's popularity, whether it can match the convolutional neural network (CNN) in performance remains an open question that requires further study. This paper uses the most basic vision transformer (ViT) model to classify and recognize Chinese characters that are frequently used in transportation and logistics, and compares it with two classical CNN models. The results show that the transformer clearly outperforms the traditional CNN structures, with a final character recognition accuracy of up to 98.66%. This demonstrates the strong potential of the transformer in computer vision, as well as its high reliability and generalization ability.
Keywords
Chinese characters, vision transformer, convolutional neural network.
Cite this article
Zhang, X. (2023). License plate Chinese character recognition based on ViT model. Theoretical and Natural Science, 19, 1-5.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
About volume
Volume title: Proceedings of the 2nd International Conference on Computing Innovation and Applied Physics
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.