
Enhancing Small Target Detection in Aerial Imagery Based on YOLOv8
- 1 School of Integrated Circuits, Guangdong University of Technology, Guangzhou 510006, China; Advanced Manufacturing School, Guangdong Polytechnic of Environmental Protection Engineering, Guangzhou 510500, China
- 2 School of Integrated Circuits, Guangdong University of Technology, Guangzhou 510006, China
* Author to whom correspondence should be addressed.
Abstract
Among the numerous deep learning-based object detection algorithms, the YOLO (You Only Look Once) series has become a preferred choice for many detection scenarios thanks to its fast detection speed, high accuracy, and strong adaptability. However, its performance remains suboptimal on unconventional images such as aerial imagery, which often features complex content and small targets. To overcome this limitation, this work introduces a novel model called YOLOv8-TDE (Tiny Detection Enhanced). First, to differentiate key features from noise more effectively and to capture multi-scale target features comprehensively, the feature extraction network is improved by employing pooling convolution kernels of various sizes and incorporating a lightweight attention mechanism. Second, the GeoIoU (Geometric Intersection over Union) loss function is introduced to reduce sensitivity to aspect ratio and center-point distance, addressing the limitation of the original CIoU loss, which is overly sensitive to small changes in these parameters. Finally, a novel detection head, the FAD Head, is proposed; it dynamically generates detection-head parameters from the input image's features, enabling better feature extraction for targets of different sizes in complex scenes and improving the model's adaptability, detection accuracy, and stability across scenarios. Experiments on the VisDrone2019 dataset show that the proposed model outperforms the original YOLOv8n, achieving a 15.5% improvement in mAP@0.5 and a 17.6% improvement in mAP@0.5:0.95.
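To see why the abstract calls CIoU "overly sensitive" to aspect ratio and center-point distance, the standard CIoU definition can be written out directly. The sketch below implements the usual published formula (IoU minus a normalized center-distance term minus an aspect-ratio penalty); it is not the paper's GeoIoU, whose exact form is not given here.

```python
import math

def ciou(box_a, box_b):
    """Standard Complete IoU (CIoU) for axis-aligned boxes given as (cx, cy, w, h).

    CIoU = IoU - rho^2 / c^2 - alpha * v, where rho is the center distance,
    c the diagonal of the smallest enclosing box, and v an aspect-ratio penalty.
    """
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2

    # Intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    iou = inter / union

    # Normalized center-point distance (rho^2 / c^2).
    rho2 = (box_a[0] - box_b[0]) ** 2 + (box_a[1] - box_b[1]) ** 2
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2

    # Aspect-ratio penalty v: for tiny boxes, even a small w/h difference
    # produces a large penalty, which is the sensitivity the paper targets.
    v = (4 / math.pi ** 2) * (math.atan(box_a[2] / box_a[3])
                              - math.atan(box_b[2] / box_b[3])) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return iou - rho2 / c2 - alpha * v
```

Two same-area boxes sharing a center but differing in aspect ratio already receive a CIoU well below their plain IoU, illustrating how the penalty terms dominate for small targets.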
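The first contribution (pooling kernels of several sizes fused under a lightweight attention mechanism) can be illustrated with a toy sketch. The kernel sizes (3, 5, 7), the softmax-over-global-means "attention", and the plain-Python grids are illustrative assumptions, not the authors' exact module.

```python
import math

def max_pool_same(feat, k):
    """Max-pool a 2D grid with odd kernel k, stride 1, 'same' padding."""
    h, w = len(feat), len(feat[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            out[i][j] = max(feat[ii][jj]
                            for ii in range(max(0, i - r), min(h, i + r + 1))
                            for jj in range(max(0, j - r), min(w, j + r + 1)))
    return out

def multi_scale_pool_with_attention(feat, kernels=(3, 5, 7)):
    """Pool at several kernel sizes, then fuse the pooled maps with softmax
    weights derived from each map's global mean (a toy lightweight attention)."""
    pooled = [max_pool_same(feat, k) for k in kernels]
    means = [sum(sum(row) for row in p) / (len(p) * len(p[0])) for p in pooled]
    exps = [math.exp(m - max(means)) for m in means]
    weights = [e / sum(exps) for e in exps]
    h, w = len(feat), len(feat[0])
    fused = [[sum(wt * p[i][j] for wt, p in zip(weights, pooled))
              for j in range(w)] for i in range(h)]
    return fused, weights
```

The point of the structure is that small kernels preserve fine detail for tiny targets while large kernels gather context, and the attention weights let the fusion lean toward whichever scale is informative for the given input.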
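The FAD Head's key idea, generating detection-head parameters from the input's own features, follows the general pattern of dynamic networks. The minimal sketch below shows that pattern only: a global-average descriptor feeds a tiny generator that emits per-channel scales. The generator shape, sigmoid squashing, and per-channel rescaling are assumptions for illustration, not the paper's head design.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dynamic_channel_scales(feat, gen_weights, gen_bias):
    """Generate per-channel parameters conditioned on the input itself.

    feat: list of C channels, each an HxW list of lists.
    gen_weights: CxC matrix of the parameter generator; gen_bias: length C.
    Returns the rescaled feature map and the generated scales in (0, 1).
    """
    # Descriptor: one global-average value per channel.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Input-conditioned parameters: a tiny linear layer plus sigmoid.
    scales = [sigmoid(sum(w * d for w, d in zip(wrow, desc)) + b)
              for wrow, b in zip(gen_weights, gen_bias)]
    # Apply the generated parameters (here: per-channel rescaling).
    scaled = [[[s * v for v in row] for row in ch]
              for ch, s in zip(feat, scales)]
    return scaled, scales
```

Because the scales are computed from the input rather than fixed at training time, the head can emphasize different channels for different scenes, which is the adaptability property the abstract attributes to the FAD Head.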
Keywords
YOLOv8, UAV, small target detection, attention mechanism
Cite this article
Huang, C.; Tan, B. (2025). Enhancing Small Target Detection in Aerial Imagery Based on YOLOv8. Applied and Computational Engineering, 108, 209-223.
Data availability
The datasets used and/or analyzed during the current study are available from the authors upon reasonable request.
Disclaimer/Publisher's Note
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
About volume
Volume title: Proceedings of the 5th International Conference on Signal Processing and Machine Learning
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see Open access policy for details).