Marine Life Identification System Based on Deep Learning

Shengrong Wang

doi:10.54254/2755-2721/106/20241293

1. Introduction

Marine ecological security governance has become a common focus of international attention. On August 24, 2023, the Japanese government began discharging contaminated water from the Fukushima nuclear disaster into the sea, raising global concerns about marine ecological security and underscoring the significant challenges in its governance. Currently, research on marine life identification mainly applies computer vision and machine learning technologies, which can process and analyze vast amounts of marine life image data. However, existing systems still have gaps, including insufficient recognition accuracy for organisms in the deep sea and in complex environments, as well as difficulty in keeping up to date with new species and classifying them accurately. Therefore, this article develops a marine life intelligent recognition system that integrates deep learning and data analysis technologies to improve the accuracy of automatic identification and classification of marine organisms. This research focuses on enhancing the recognition accuracy of marine organisms in complex marine environments, especially in deep-sea areas. The system can take existing images of marine life from devices such as underwater cameras and unmanned submersibles, and it can also acquire images in real time through connected live cameras. A YOLOv5 model is trained to recognize and classify marine life. Finally, PySide6 is used to create a graphical user interface for real-time data processing and visual presentation. This study aims to enhance the intelligent recognition capabilities for marine life, with potential benefits for scientific research, fisheries management, and environmental protection. It is intended to provide more accurate biological data, supporting long-term monitoring and conservation of marine ecosystems, while also offering experience and data support for the development of future technologies.

2. Marine Life Identification System

2.1. Concept of Marine Life Identification System

The marine life identification system utilizes sensors, cameras, and machine learning technologies to identify marine organisms. By capturing images, the system analyzes features such as shape, texture, and other characteristics to confirm the species [1]. Applications include ecological monitoring, fisheries management, protected area surveillance, and scientific research, aiding researchers in understanding species distribution, behavioral patterns, and the impact of environmental changes on organisms. The dataset used in this study consists of 4,670 images, with 4,480 images allocated for training, 127 for validation, and 63 for testing.

The images in the dataset cover a variety of marine organisms, such as fish, jellyfish, penguins, dolphins, sharks, starfish, and stingrays, and each category instance is accurately labeled to ensure the accuracy of the training process.

Due to the size restrictions of the YOLOv5 algorithm on input images, it is necessary to resize all images to the same dimensions. To minimize distortion of the images while maintaining detection accuracy as much as possible, the images were resized to 640x640 pixels while preserving their original aspect ratios. Additionally, data augmentation techniques such as random rotation, scaling, cropping, and color transformations were applied to enhance the model’s generalization and robustness, expanding the dataset and reducing the risk of overfitting [2].
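The text does not state how the aspect ratio is preserved during resizing. A common approach, assumed here, is "letterbox" resizing: scale the image so that its longer side fits the target, then pad the shorter side evenly. The sketch below computes only the geometry; the function name `letterbox_geometry` is hypothetical and not taken from the system's code.

```python
def letterbox_geometry(width, height, target=640):
    """Compute the resized size and padding for a letterbox resize.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom) so that
    the scaled image plus padding is exactly target x target pixels.
    """
    # One scale factor for both axes keeps the aspect ratio unchanged.
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)

    # Split the remaining pixels between the two sides of each axis.
    pad_w = target - new_w
    pad_h = target - new_h
    pad_left = pad_w // 2
    pad_top = pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


print(letterbox_geometry(1280, 720))
```

For a 1280x720 frame, the image is scaled to 640x360 and 140 rows of padding are added above and below, so no content is stretched.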

2.2. Implementation Significance

(1) Ecological Protection and Biodiversity Conservation

Species monitoring: An intelligent identification system can assist scientists in real-time monitoring of the types and quantities of marine life. By tracking the distribution and changes of specific species, researchers can better understand the ecological needs and living conditions of these species, thereby formulating targeted conservation measures [3].

Environmental change assessment: The system can detect changes in biological communities, thereby aiding in the evaluation of the health of marine environments. This is crucial for identifying and addressing issues such as marine pollution, climate change, and habitat destruction.

(2) Fisheries Management and Sustainable Development

Fishery monitoring: The intelligent identification system identifies and records catches in real-time aboard fishing boats, helping fisheries managers monitor fishery resource harvesting. This helps prevent overfishing and protects fish populations.

Combating illegal fishing: The system can be used in conjunction with satellite positioning technology to identify and track illegal fishing activities, thereby helping law enforcement agencies maintain the sustainability of marine resources.

(3) Scientific Research and Data Collection

Behavioral research: By analyzing the behavior and activity patterns of marine life, scientists can obtain valuable data on animal migration, reproduction, and feeding habits. This deepens understanding of the dynamics and complexity of marine ecosystems [4].

Data storage and sharing: The intelligent recognition system can store and process biological data on a large scale and share it with researchers worldwide through the network, promoting interdisciplinary and international cooperation.

(4) Public Education and Awareness Raising

Science popularization: Through real-time data and visual information provided by the marine biological intelligent identification system, the public can more easily understand and pay attention to marine life and their habitats, enhancing environmental awareness.

Virtual display: Some systems can also be combined with Virtual Reality (VR) or Augmented Reality (AR) technologies to create immersive educational experiences, attracting more people to participate in ocean conservation activities.

(5) Technological Innovation and Application

Cross-domain applications: The advancement of intelligent recognition technology can drive the development of other fields, such as robotics, artificial intelligence, and computer vision. Breakthroughs in these technologies in turn contribute to improving the accuracy and efficiency of recognition systems [5].

Business opportunities: Developing and applying intelligent recognition systems can bring new business opportunities to related enterprises, such as equipment manufacturing, data analysis services, and consulting.

3. System Design Ideas

3.1. Language Environment

During the development of the marine organism identification system, the main language environment and libraries include:

(1) Programming language: Python, with its extensive third-party libraries, significantly enhances development efficiency.

(2) Image processing: OpenCV is used for image processing and computer vision tasks.

(3) Object detection: YOLOv5 is currently one of the most widely used deep-learning-based object detection algorithms. It incorporates modules such as Feature Pyramid Networks (FPN) and Spatial Pyramid Pooling (SPP) on top of convolutional neural networks, thus achieving a balance between high accuracy and fast detection speed [6].

(4) Graphical user interface: The system uses PySide6 as the GUI library, providing an intuitive and user-friendly interface.

(5) Data storage and management: SQLite is used for database operations and data storage [7].

3.2. Model Advantages

The core idea of the YOLOv5 algorithm is to treat object detection as a regression problem: the network directly predicts the center coordinates and sizes of objects from the feature maps, expressed as offsets relative to preset anchor boxes. Additionally, YOLOv5 employs the SPP (Spatial Pyramid Pooling) feature extraction method, which extracts multi-scale features at little additional computational cost, enhancing detection performance. The YOLOv5 network structure consists of Input, Backbone, Neck, and Prediction components. The Input part is the input end of the network and uses Mosaic data augmentation, which randomly crops several input images and splices them together. The Backbone is the part of the network that extracts features, and its feature extraction capability directly affects overall network performance. During the feature extraction phase, YOLOv5 uses the CSPNet (Cross Stage Partial Network) structure, which divides the input feature map into two parts: one is processed through a series of convolutional layers, while the other bypasses that stack of layers. Finally, the two feature maps are fused. This design gives the network stronger nonlinear expression capabilities, allowing it to better handle complex backgrounds and diverse objects in object detection tasks. In the Neck phase, the model uses a sequence of convolutions and C3 blocks to fuse feature maps from different scales. In the Prediction phase, the model predicts the center coordinates and size information of the objects based on the fused feature maps [8].

The specific advantages of YOLOv5 include:

(1) Fast speed: YOLOv5 can achieve a detection speed of 140 FPS, making it suitable for real-time detection.

(2) High accuracy: YOLOv5 also shows improvement in accuracy, with an mAP of 83.8%, outperforming YOLOv3 on some datasets, especially in the detection of small objects [9].

(3) Easy to train: YOLOv5 embeds the calculation of initial anchor values into the code, adaptively calculating the optimal initial anchor values for different training sets during each training, making training simpler.

(4) Lightweight: The YOLOv5 model is smaller and has fewer parameters, with a model size of only 27 MB, making it suitable for mobile and embedded devices [10,11].

(5) Scalability: YOLOv5 is implemented based on PyTorch, making it easy to extend and customize for different application scenarios.

(6) Multi-task support: The YOLOv5 model supports multi-task learning, allowing multiple object detection tasks to be performed simultaneously.

(7) Visualization: The YOLOv5 model provides visualization tools to help users better understand the output results of the model.

3.3. System Architecture

When building the marine organism identification system, the core of the system design philosophy is to provide an efficient and user-friendly platform capable of real-time detection and identification of marine organisms. The system integrates modular design from modern software engineering with advanced technologies in computer vision and deep learning.

The system design employs a three-tier architecture pattern (control layer - processing layer - interface layer) to ensure high cohesion and low coupling among the various components. Each layer has its distinct responsibilities and functions, working together to deliver a comprehensive solution.

The processing layer is primarily responsible for backend computational tasks, including image preprocessing, model inference, and post-processing. By utilizing the YOLOv5 Detector class, the system leverages advanced deep learning models to analyze video streams or images and identify marine organisms. The design of this layer considers computational efficiency and accuracy to ensure the system can respond quickly and provide reliable detection results.

The interface layer focuses on user interaction, featuring a clear and intuitive interface that simplifies operations. The interface includes a real-time video display window, a detection results display area, and necessary control buttons, such as start and pause detection [12]. Users can interact with the system through these interface elements, watch detection results in real-time, and control the detection process. The purpose of this layer is to reduce the difficulty of use for the user and provide a pleasant user experience.

The control layer acts as a bridge, responsible for coordinating the interaction between the interface layer and the processing layer. The slot functions of the MainWindow class and other methods respond to user input, controlling the processing of video streams and the updating of visual results. This layer ensures direct response to user operations and converts user commands into control commands for the model and media processor.
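The division of responsibilities among the three layers can be sketched in plain Python as follows. This is only an illustration under simplifying assumptions: the classes `Detector`, `View`, and `Controller` and all of their methods are hypothetical stand-ins, the detector is a stub that filters precomputed scores instead of running YOLOv5, and no PySide6 widgets are involved.

```python
class Detector:
    """Processing layer: stands in for the YOLOv5 detector (stubbed here)."""

    def __init__(self, conf_threshold=0.25):
        self.conf_threshold = conf_threshold

    def detect(self, frame):
        # A real implementation would run model inference on an image.
        # In this sketch a "frame" is already a list of (label, score) pairs.
        return [(label, score) for label, score in frame
                if score >= self.conf_threshold]


class View:
    """Interface layer: stands in for the result display area."""

    def __init__(self):
        self.rows = []

    def show_results(self, detections):
        self.rows = [f"{label}: {score:.2f}" for label, score in detections]


class Controller:
    """Control layer: turns user actions into calls on the other layers."""

    def __init__(self, detector, view):
        self.detector = detector
        self.view = view
        self.running = False

    def on_start_clicked(self):
        self.running = True

    def on_pause_clicked(self):
        self.running = False

    def on_new_frame(self, frame):
        # Frames are only processed while detection is running.
        if self.running:
            self.view.show_results(self.detector.detect(frame))


view = View()
controller = Controller(Detector(conf_threshold=0.5), view)
controller.on_start_clicked()
controller.on_new_frame([("shark", 0.91), ("starfish", 0.32)])
print(view.rows)
```

Because the view and the detector never reference each other, either one could be replaced (for example, by a different model) without changing the other, which is the low-coupling property the architecture aims for.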

3.4. Evaluation Metrics

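Confidence is the score, between 0 and 1, that the model assigns to each predicted box to indicate how certain it is that the box contains an object of the predicted class. Detections whose score falls below the confidence threshold are discarded.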

Precision measures the proportion of samples predicted as positive cases that are actually positive cases, indicating how many of the samples identified as positive are indeed true positives.

Mean average precision (mAP) is an important indicator for measuring the performance of detection models, which considers precision and recall under different confidence thresholds. An improvement in mAP reflects enhanced overall detection performance. The model demonstrates good detection performance even under stricter Intersection over Union (IoU) thresholds.

F1 score is the harmonic mean of precision and recall, providing a balanced view of how many predictions are correct and how many true objects are found. By observing the F1 scores of the model at various confidence thresholds, we can gain a more comprehensive understanding of its performance.
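For reference, the metrics above can be written using their standard definitions. The symbols are introduced here and do not appear in the original text: TP, FP, and FN are the numbers of true positives, false positives, and false negatives, P and R are precision and recall, AP_i is the average precision of class i, and N is the number of classes.

```latex
\begin{align}
P &= \frac{TP}{TP + FP}, &
R &= \frac{TP}{TP + FN}, \\
F_1 &= \frac{2PR}{P + R}, \\
AP &= \int_0^1 P(R)\,\mathrm{d}R, &
\mathrm{mAP} &= \frac{1}{N}\sum_{i=1}^{N} AP_i .
\end{align}
```

As a small worked example, with TP = 8, FP = 2, and FN = 2, both precision and recall equal 8/10 = 0.8, so the F1 score is also 0.8.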

IoU is a metric that measures the degree of overlap between two regions, such as a predicted bounding box and a ground-truth bounding box. It is often employed in tasks such as object detection and semantic segmentation within the field of deep learning [13].

After obtaining the predicted box locations from the model output, the IoU can be computed between the predicted box and the ground-truth bounding box as the area of their intersection divided by the area of their union. The IoU value ranges from 0 to 1, where 0 indicates that the two boxes do not intersect, and 1 signifies that the two boxes coincide exactly.
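The IoU computation can be sketched as below. The box format (x1, y1, x2, y2), with the top-left and bottom-right corners, and the function name `box_iou` are assumptions made for this illustration.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the doubly counted intersection.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


print(box_iou((0, 0, 10, 10), (0, 0, 10, 10)))    # identical boxes
print(box_iou((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes
```

Identical boxes give an IoU of 1 and disjoint boxes give 0, matching the two extremes described above.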

In the interface, changing the value in the input box below Confidence or IoU moves the corresponding slider at the same time. Conversely, moving the slider also updates the input box value. Adjustments to the Confidence or IoU values are synchronized with the model's configuration, altering the detection confidence threshold and IoU threshold.
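How the two thresholds typically act on raw detections can be sketched as follows: boxes below the confidence threshold are dropped, and non-maximum suppression (NMS) then removes any box whose IoU with a higher-scoring kept box exceeds the IoU threshold. This is a simplified, class-agnostic sketch rather than the system's actual post-processing; the function names and the default values 0.25 and 0.45 are illustrative assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_threshold=0.25, iou_threshold=0.45):
    """Apply a confidence filter, then greedy NMS.

    detections: list of (box, score) pairs.
    Returns the kept pairs, sorted by descending score.
    """
    # Step 1: discard low-confidence boxes and rank the rest by score.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )

    # Step 2: keep a box only if it does not overlap a better box too much.
    kept = []
    for box, score in candidates:
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


raw = [
    ((0, 0, 10, 10), 0.90),
    ((1, 1, 11, 11), 0.80),    # overlaps the first box heavily
    ((50, 50, 60, 60), 0.60),
    ((0, 0, 5, 5), 0.10),      # below the confidence threshold
]
print(filter_detections(raw))
```

Raising the confidence threshold yields fewer but more certain detections, while raising the IoU threshold lets more overlapping boxes survive suppression.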

4. Expected Implementation Effects

The YOLOv5 object detection model is trained on the dataset, and the PySide6 library is employed to develop the display interface. With deep learning algorithms, the system achieves object detection and recognition of marine life from images, videos, and live camera feeds. Additionally, the system supports result visualization and the export of detection results as images or videos. The features supported by this system also include: importing and initializing the trained marine life model; adjusting the confidence and IoU thresholds; uploading, detecting, visualizing, and ending detection with cameras; and a list of detected targets with location information.

These features are described in more detail below.

Firstly, the system provides registration and login management functions based on SQLite. Users need to register through the registration interface when using the system for the first time. After they enter a username and password, the system stores this information in the SQLite database. After successful registration, users can log in by entering their username and password through the login interface. This design restricts access to registered users and leaves room for adding more personalized features in the future.
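A minimal sketch of such a registration and login flow with Python's built-in `sqlite3` module is shown below. The table layout, the function names, and the choice to store a salted PBKDF2 hash instead of the plain password are assumptions made for this illustration; the original text does not describe how credentials are stored.

```python
import hashlib
import hmac
import os
import sqlite3


def init_db(conn):
    # Hypothetical schema: one row per user, with a per-user random salt.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "username TEXT PRIMARY KEY, salt BLOB NOT NULL, pw_hash BLOB NOT NULL)"
    )


def _hash_password(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn, username, password):
    """Return True on success, False if the username is already taken."""
    salt = os.urandom(16)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (username, salt, pw_hash) VALUES (?, ?, ?)",
                (username, salt, _hash_password(password, salt)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def login(conn, username, password):
    """Return True only if the user exists and the password matches."""
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Constant-time comparison of the stored and recomputed hashes.
    return hmac.compare_digest(stored, _hash_password(password, salt))


conn = sqlite3.connect(":memory:")  # in-memory database for the example
init_db(conn)
print(register(conn, "alice", "s3cret"))
print(login(conn, "alice", "s3cret"))
print(login(conn, "alice", "wrong"))
```

Parameterized queries (the `?` placeholders) keep user input out of the SQL text, and the primary key on `username` prevents duplicate registrations.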

Secondly, on the main interface, the system supports image, video, real-time camera, and batch file input functions. Users can select the image or video for underwater target detection by clicking the corresponding button, or activate the camera for real-time detection. During the detection process, the system will display the detection results in real-time and store the detection records in the database.

In addition, the system provides one-click model replacement. Users can select a different model for detection by clicking the "Change Model" button on the interface. The dataset attached to the system can also be used to retrain the model to meet the detection needs of users in different scenarios.

Finally, to provide a more personalized user experience, the system supports interface modification, where users can customize interface elements such as icons and text. For example, users can choose different styles of icons according to their preferences and modify the text descriptions in the interface.

5. Conclusion

Marine organisms are an important part of the Earth's ecosystem, maintaining the balance of marine ecology. The marine organism identification system developed in this article identifies and classifies marine organisms by combining technologies such as artificial intelligence, computer vision, machine learning, and data analysis, with the goal of enhancing the accuracy and efficiency of marine organism identification. The system can play a significant role in marine research, ecological monitoring, fishery management, and environmental protection. However, challenges remain due to low brightness and blurred scenes in underwater observation environments, which result in poor-quality photos and videos. Consequently, the accuracy of identification needs improvement, and only a limited number of types of marine organisms are currently recognized. Future research will focus on in-depth analysis and exploration of the characteristics of the underwater environment, and on the development of a more efficient, accurate, and comprehensive model for marine organism identification.

References

[1]. BANAN A, NASIRI A, TAHERI-GARAVAND A. Deep learning-based appearance features extraction for automated carp species identification[J]. Aquacultural Engineering, 2020, 89.

[2]. Sixuwuxian. (2024) Build a marine animal detection system using deep learning. https://www.zhihu.com/question/429203831.

[3]. Jiao L, Zhang F, Liu F, et al. A Survey of Deep Learning-Based Object Detection[J]. IEEE Access, 2019, 7: 128837–128868.

[4]. Hai H A, Qt B, Jl A, et al. A review on underwater autonomous environmental perception and target grasp, the challenge of robotic organism capture[J]. Ocean Engineering, 2020, 195:106644.

[5]. XU F Q, DONG P, WANG H B, et al. Intelligent detection and autonomous capture system of seafood based on underwater robot[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(12): 2393-2402.

[6]. Shi T, Liu H X. A Safety Helmet Wearing Detection Method Based on Improved YOLOv5[J/OL]. Journal of Tianjin University of Technology. https://link.cnki.net/urlid/12.1374.N.20240914.1439.066

[7]. BestSongC. (2023) Marine life target detection system based on YOLOv8 model (PyTorch+Pyside6+YOLOv8 model). https://blog.csdn.net/sc1434404661.

[8]. QU Z, GAO L Y, WANG S Y, et al. An improved YOLOv5 method for large objects detection with multi-scale feature cross-layer fusion network[J]. Image and Vision Computing, 2022, 125: 1-12.

[9]. NIU H Q, OU O, RAO S S, et al. Small object detection method based on improved YOLOv3 in remote sensing image[J]. Computer Engineering and Applications, 2022, 58(13): 241-248.

[10]. Dong X, Yan S, Duan C. A lightweight vehicles detection network model based on YOLOv5[J]. Engineering Applications of Artificial Intelligence: The International Journal of Intelligent Real-Time Automation, 2022, 113: 1-13.

[11]. Yeh C H, Lin C H, Kang L W, et al. Lightweight Deep Neural Network for Joint Learning of Underwater Object Detection and Color Conversion[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021: 99.

[12]. Fang W, Wang L, Ren P. Tinier-YOLO: A real-time object detection method for constrained environments[J]. IEEE Access, 2019, 8: 1935-1944.

[13]. Zheng Z, Wang P, Liu W, et al. Distance-IoU loss: Faster and better learning for bounding box regression[C]//Proceedings of the AAAI conference on artificial intelligence. 2020.

Cite this article

Wang, S. (2024). Marine Life Identification System Based on Deep Learning. Applied and Computational Engineering, 106, 174-179.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 2nd International Conference on Machine Learning and Automation

ISBN: 978-1-83558-707-2 (Print) / 978-1-83558-708-9 (Online)

Editor: Mustafa ISTANBULLU

Conference website: https://2024.confmla.org/

Conference date: 21 November 2024

Series: Applied and Computational Engineering

Volume number: Vol.106

ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.