License plate Chinese character recognition based on ViT model

Xiaoyu Zhang

doi:10.54254/2753-8818/19/20230458

Research Article

Open access

Published on 8 December 2023

Zhang, X. (2023). License plate Chinese character recognition based on ViT model. Theoretical and Natural Science, 19, 1-5.

Xiaoyu Zhang *, ¹

¹ South China University of Technology

* Author to whom correspondence should be addressed.

Abstract

Transformers are now widely used in computer vision. A growing body of literature shows that their advantages in image processing, such as a larger receptive field and global context modeling, are gradually emerging. However, despite the transformer's popularity, whether it can compete with the convolutional neural network (CNN) in performance remains an open question that requires further study. This paper uses the most basic structural model of the Vision Transformer (ViT) to classify and recognize Chinese characters that are frequently used in transportation and logistics, and compares it with two classical CNN models. The results show that the transformer clearly outperforms the traditional CNN structures, reaching a final character recognition accuracy of up to 98.66%. This demonstrates the strong potential of the transformer in computer vision, as well as its high reliability and generalization ability.
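As a rough illustration of the first step a basic ViT classifier performs, the sketch below splits a character image into fixed-size patches that would then be embedded as a token sequence. It is a minimal sketch under stated assumptions, not the paper's implementation: the 32x32 crop size, the 16x16 patch size, the dummy pixel values, and the helper name `image_to_patches` are all hypothetical and do not come from the abstract.

```python
def image_to_patches(image, patch_size):
    """Split a 2D grayscale image (list of rows) into flattened square patches.

    Patches are returned in row-major order, the order in which a
    ViT-style model would read them as a token sequence.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Assumed sizes for illustration only (not taken from the paper).
SIDE = 32
PATCH = 16

# Dummy pixel values standing in for a cropped license plate character.
image = [[(r * SIDE + c) % 256 for c in range(SIDE)] for r in range(SIDE)]

patches = image_to_patches(image, PATCH)
print(len(patches), len(patches[0]))
```

With these assumed sizes the image yields 4 patches of 256 values each. In a full ViT, each flattened patch would be linearly projected, combined with a position embedding, and passed through transformer encoder layers before classification; those stages are omitted here.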

Keywords

Chinese characters, vision transformer, convolutional neural network.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 2nd International Conference on Computing Innovation and Applied Physics

Conference website: https://www.confciap.org/

ISBN: 978-1-83558-203-9 (Print) / 978-1-83558-204-6 (Online)

Conference date: 25 March 2023

Editors: Marwan Omar, Roman Bauer

Series: Theoretical and Natural Science

Volume number: Vol.19

ISSN: 2753-8818 (Print) / 2753-8826 (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.