Intelligent medical detection and diagnosis assisted by deep learning

Jingxiao Tian; Hanzhe Li; Yaqian Qi; Xiangxiang Wang; Yuan Feng

doi:10.54254/2755-2721/64/20241356

1. Introduction

According to the World Health Organization (WHO), millions of patients each year experience delays while waiting for diagnosis and treatment, resulting in significant health losses. Intelligent assisted diagnosis systems can greatly shorten diagnosis time and help doctors make diagnosis and treatment decisions more quickly, reducing both patient waiting time and the waste of medical resources. With the continuous development of digital technology and artificial intelligence, the intelligent assisted diagnosis system has become an important innovation in the medical field [1]. Its core purpose is to integrate artificial intelligence models, built with advanced technologies such as deep learning, into medical devices or software so that doctors can make more accurate diagnosis and treatment decisions, thereby improving medical standards and efficiency. As an important part of basic medicine, pathology undertakes the key task of disease diagnosis and treatment. According to data from the American Cancer Society, the accuracy of traditional pathology diagnosis is between 85% and 90%, and the introduction of an intelligent assisted diagnosis system can raise diagnostic accuracy to more than 95%. This means more accurate disease diagnosis and more effective treatment. Digital pathology combined with artificial intelligence brings new opportunities and challenges to pathology. This paper explores how intelligent assisted diagnosis systems can use advanced artificial intelligence technology to improve the accuracy and efficiency of pathological diagnosis, thereby providing critical information and support for the individualized treatment of cancer patients [2]. By introducing and analyzing intelligent assisted diagnosis systems, we aim to clarify their significance and potential value in the medical field and to promote the progress and application of medical science and technology.

2. Related work

2.1. The development of intelligent auxiliary diagnosis system

An artificial intelligence assisted diagnosis system is an "intelligent medical assistant" developed on the basis of artificial intelligence theory. It can assist doctors with diagnostic decisions such as locating focal areas, screening for disease and selecting treatment plans. A successful AI-assisted diagnostic system can reach diagnostic capabilities comparable to those of an expert [3-4]. A high-quality AI medical assisted diagnosis system is of great significance for improving doctors' work efficiency, reducing the rates of misdiagnosis and missed diagnosis, alleviating the shortage of medical resources and improving the quality of medical services. In view of this medical value, the Chinese government attaches great importance to the development of AI medical assisted diagnosis and treatment technology. In December 2017 [5], the Ministry of Industry and Information Technology issued the Three-Year Action Plan to Promote the Development of a New Generation of Artificial Intelligence Products (2018-2020), proposing to accelerate the production and clinical introduction of medical image-assisted diagnosis and treatment systems. The "13th Five-Year Plan" series of policies has created a favorable policy environment for the productization and clinical application of AI medical assisted diagnosis and treatment.

2.2. Intelligent auxiliary diagnostic systems and clinical pathology

An artificial intelligence medical assisted diagnosis system is an intelligent system built on deep learning, neural networks and other artificial intelligence technologies. It learns from massive data samples to acquire diagnostic ability and ultimately supports doctors' diagnostic decision-making. Among such systems, AI image-assisted diagnosis and treatment is the current focus of industry research. Drawing on massive medical image data such as X-ray, CT, ultrasound and MR, together with deep learning, image segmentation, data mining and other technologies, it accurately identifies and quantifies disease lesions to provide doctors with a basis for diagnosis.

At present, imaging AI and pathological AI have been successfully applied to disease screening, prediction and diagnosis. Researchers in several fields have constructed a large-scale CT dataset covering novel coronavirus pneumonia (COVID-19) [6], common pneumonia and normal controls, and have developed a CT-based AI diagnostic system to help diagnose COVID-19 accurately. However, laboratory tests have significant advantages over imaging and pathology. Laboratory testing is simple, fast and low-cost, and doctors can achieve good diagnosis and treatment results by analyzing a patient's test data. In addition, routine blood, blood biochemistry, urine and stool test results directly reflect the physiological and pathological changes caused by disease. Commonly used clinical test data are sufficiently valid and stable and have undergone large-scale clinical practice and evaluation, so they can give clinicians more comprehensive guidance and suggestions for disease diagnosis and treatment [7].

2.3. Artificial intelligence diagnostic system and early lung cancer

Lung cancer is the malignant tumor with the highest incidence and mortality in the world. Early diagnosis and treatment are the most effective means of improving clinical outcomes. With the popularization of CT examination [8], a large number of people with pulmonary nodules have been identified clinically. This provides a key screening population for detecting early lung cancer, but because the nature of a lung nodule is difficult to determine, it also increases patients' psychological burden and leads to excessive diagnosis and treatment. AI-assisted diagnosis of pulmonary nodules addresses this clinical problem and greatly improves the accuracy of qualitative diagnosis. The AI server is embedded in the hospital's network management system and connected to the picture archiving and communication system (PACS) used for CT examination. When a patient undergoes a chest CT scan, the AI automatically retrieves the patient's image information through PACS [9-10], computes the auxiliary diagnosis, and transmits the result to the doctor's film reading terminal, usually within a few seconds. AI-assisted diagnosis of small pulmonary nodules does not place high demands on CT hardware: spiral CT with 16 rows or more, a plain scan, and thin slices (1.5 mm) are sufficient. For each pulmonary nodule, the AI-assisted diagnosis reports its location, size (including longest diameter, maximum cross-sectional area and volume), nature (such as pure ground-glass or solid), and malignant probability (0% to 100%). All screened small nodules are then ordered by malignant probability to produce a standard "artificial intelligence assisted lung cancer diagnosis analysis report" for doctors' reference.
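
As a rough illustration of the report structure described above, the sketch below models each nodule as a record and orders the records by malignant probability. The class name `NoduleFinding`, its field names, and the sample values are hypothetical; they are not taken from any specific AI product or from patient data.

```python
from dataclasses import dataclass


@dataclass
class NoduleFinding:
    """One nodule entry in a hypothetical AI-assisted report."""
    location: str                  # e.g. lung lobe or segment
    longest_diameter_mm: float
    volume_mm3: float
    nature: str                    # e.g. "pure ground-glass" or "solid"
    malignant_probability: float   # expressed in [0.0, 1.0]


def build_report(findings):
    """Return the findings ordered from most to least likely malignant."""
    for finding in findings:
        if not 0.0 <= finding.malignant_probability <= 1.0:
            raise ValueError("malignant_probability must lie in [0, 1]")
    return sorted(findings, key=lambda f: f.malignant_probability, reverse=True)


# Illustrative values only.
findings = [
    NoduleFinding("right upper lobe", 6.0, 110.0, "pure ground-glass", 0.15),
    NoduleFinding("left lower lobe", 12.5, 950.0, "solid", 0.72),
]
report = build_report(findings)
```

Sorting in descending order places the nodules that most need attention at the top of the report.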

3. Methodology

Because CT images contain large amounts of data and tumor boundaries are often blurred, tumor segmentation and classification are difficult. Machine learning makes the diagnosis process easier and more deterministic. This work introduces an automatic lung cancer detection method that aims to increase accuracy and yield and to decrease diagnosis time. The main objective is to detect cancerous lung nodules in a given input lung image and to predict lung cancer using deep learning more efficiently than existing methods.

3.1. Proposed methodology

We will employ a two-step approach to preprocess our raw data and enhance its usability for further analysis:

First, histogram equalization: we use histogram equalization to enhance the contrast of our images. By spreading out the most frequent intensity values, histogram equalization stretches the intensity range of the image and increases its global contrast. This step is crucial for improving the overall quality and interpretability of the images, especially when the usable data is represented by closely clustered intensity values.

Second, threshold segmentation: after histogram equalization, we apply threshold segmentation. Thresholding is a form of image segmentation in which pixels are converted into a binary format, typically black and white. This conversion simplifies the image and makes it easier to identify areas of interest while disregarding irrelevant portions. By applying threshold segmentation, we aim to isolate and highlight specific regions within the images, making them more amenable to subsequent analysis and interpretation.

With this two-step preprocessing, we anticipate clear improvements in the clarity, contrast, and interpretability of the raw data, which in turn should benefit the subsequent analysis.
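
The two preprocessing steps can be sketched in plain NumPy as shown below. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the threshold value of 127 are assumptions, since the paper does not give them. In practice, OpenCV's `cv2.equalizeHist` and `cv2.threshold` provide equivalent operations for 8-bit grayscale images.

```python
import numpy as np


def equalize_histogram(gray):
    """Histogram-equalize an 8-bit grayscale image (2-D uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF at the darkest intensity that occurs
    total = gray.size
    if total == cdf_min:           # constant image: nothing to stretch
        return gray.copy()
    # Map every intensity through the normalised CDF onto the range 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


def threshold_segment(gray, thresh=127):
    """Binarize: pixels above `thresh` become 255 (white), all others 0 (black)."""
    return np.where(gray > thresh, 255, 0).astype(np.uint8)


# Tiny synthetic example with closely clustered intensities.
image = np.array([[100, 101], [102, 103]], dtype=np.uint8)
equalized = equalize_histogram(image)   # values spread over the full 0..255 range
mask = threshold_segment(equalized)     # binary black-and-white mask
```

In the toy input, four neighbouring intensities are spread across the full 8-bit range before thresholding, which is the contrast-stretching effect described above.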

Figure 1. Adenocarcinoma unprocessed image

- The first subplot (`plt.subplot(1, 2, 1)`) displays the raw image fetched from a directory specified by `RAW_DIR`. The image is loaded using OpenCV's `cv2.imread` function and is plotted using `plt.imshow` (Figure 1).

- The second subplot (`plt.subplot(1, 2, 2)`) displays the histogram-equalized image fetched from a directory specified by `DEST_DIR`. As with the first subplot, the image is loaded and plotted.

- `plt.suptitle('Unprocessed vs Processed image')` adds a super title to the overall figure, providing context for the comparison.

- `plt.show()` displays the entire figure containing both subplots.
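
The plotting steps listed above can be assembled roughly as follows. This is a sketch, not the authors' script: the helper names `load_rgb` and `compare_images`, the `show` flag, the subplot titles, and the example file name are assumptions, while `RAW_DIR` and `DEST_DIR` are the directory variables named in the text.

```python
import matplotlib.pyplot as plt


def load_rgb(path):
    """Load an image with OpenCV and convert BGR to RGB for matplotlib."""
    import cv2  # imported here so the plotting helper itself does not need OpenCV
    image = cv2.imread(path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(path)
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)


def compare_images(raw, processed, show=True):
    """Plot the raw image next to its histogram-equalized version."""
    fig = plt.figure(figsize=(10, 5))
    plt.subplot(1, 2, 1)
    plt.imshow(raw, cmap="gray")
    plt.title("Unprocessed")
    plt.axis("off")
    plt.subplot(1, 2, 2)
    plt.imshow(processed, cmap="gray")
    plt.title("Processed")
    plt.axis("off")
    plt.suptitle("Unprocessed vs Processed image")
    if show:
        plt.show()
    return fig


# Example usage (the file name is hypothetical):
# import os
# raw = load_rgb(os.path.join(RAW_DIR, "example.png"))
# processed = load_rgb(os.path.join(DEST_DIR, "example.png"))
# compare_images(raw, processed)
```

Converting from BGR to RGB matters because `cv2.imread` returns channels in BGR order, whereas `plt.imshow` expects RGB.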

3.2. Data Pre-processing

First, we load the augmented dataset from the specified directory. Data augmentation techniques include rotation, translation, cropping, flipping, etc. These operations increase the diversity and richness of the data, thereby improving the generalization ability and robustness of the model. Next, we resize the images to fit the input requirements of the model. After resizing [11], we organize the images into batches, each containing a fixed number of image samples. The batched image data is used to train and evaluate the model. During training, the model takes samples from the training data batch by batch and updates its parameters according to the loss function. During evaluation, the model takes samples from the validation or test data batch by batch and computes the evaluation metrics [12]. This process is iterated until the model reaches the desired level of performance or training reaches a specified number of epochs.
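
A compact NumPy sketch of this augment, resize, and batch pipeline is given below. It is illustrative only: the function names, the 64x64 target size, the batch size, and the synthetic images are assumptions, and only flipping and 90-degree rotation are shown out of the augmentations listed above.

```python
import numpy as np


def augment(image, rng):
    """Randomly flip the image left-right and rotate it by a multiple of 90 degrees."""
    if rng.random() < 0.5:
        image = np.fliplr(image)
    return np.rot90(image, k=int(rng.integers(0, 4)))


def resize_nearest(image, size):
    """Resize an image array to (height, width) using nearest-neighbour sampling."""
    height, width = image.shape[:2]
    new_h, new_w = size
    rows = np.arange(new_h) * height // new_h
    cols = np.arange(new_w) * width // new_w
    return image[rows][:, cols]


def iter_batches(images, labels, batch_size):
    """Yield consecutive (images, labels) batches; the last one may be smaller."""
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        yield images[start:stop], labels[start:stop]


rng = np.random.default_rng(seed=0)
# Synthetic stand-ins for CT slices; real data would be loaded from disk.
raw_images = [rng.integers(0, 256, size=(50, 60), dtype=np.uint8) for _ in range(5)]
labels = np.array([0, 1, 0, 1, 1])

TARGET_SIZE = (64, 64)  # assumed model input size
dataset = np.stack([resize_nearest(augment(img, rng), TARGET_SIZE) for img in raw_images])
batch_shapes = [x.shape for x, _ in iter_batches(dataset, labels, batch_size=2)]
```

Resizing after augmentation keeps every sample at the same shape even when a rotation swaps height and width, which is what allows the images to be stacked into batches.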

During training, we use three callback functions to monitor the performance of the model and adjust the learning rate: `ReduceLROnPlateau`, `ModelCheckpoint` and `EarlyStopping`. These callbacks help to optimize the training process and improve the performance and generalization ability of the model. First, the `ReduceLROnPlateau` callback monitors the loss on the validation set and reduces the learning rate when the loss stops improving. A smaller learning rate lets the model adjust its parameters in finer steps, which can help convergence and final performance. Second, the `ModelCheckpoint` callback saves the best model weights, checked at the end of each training epoch. This ensures that the best parameters obtained during training are not lost and can be reused later. Finally, the `EarlyStopping` callback monitors the performance of the model on the validation set and stops training if performance does not improve within a certain number of epochs. This helps to prevent overfitting and improves generalization. After training is complete, we evaluate the model on the validation set and obtain the corresponding evaluation metrics. Based on these results, we can judge how well the model has been trained and how well it is likely to perform on new data.
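
To make the interplay of the three callbacks concrete, the following framework-free sketch replays a list of validation losses through simplified versions of their logic. It is not the Keras implementation: the function name, the patience values, and the reduction factor are assumptions, and options such as `min_delta` or `cooldown` in the real callbacks are not modelled.

```python
def simulate_training(val_losses, lr=1e-3, factor=0.5, lr_patience=2, stop_patience=4):
    """Replay per-epoch validation losses through simplified callback logic."""
    best_loss = float("inf")
    best_epoch = None        # epoch whose weights a checkpoint would keep
    epochs_since_best = 0    # drives early stopping
    lr_wait = 0              # drives learning-rate reduction
    history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # ModelCheckpoint-like step: remember the best weights so far.
            best_loss, best_epoch = loss, epoch
            epochs_since_best = 0
            lr_wait = 0
        else:
            epochs_since_best += 1
            lr_wait += 1
            if lr_wait >= lr_patience:
                # ReduceLROnPlateau-like step: shrink the learning rate.
                lr *= factor
                lr_wait = 0
        history.append((epoch, loss, lr))
        if epochs_since_best >= stop_patience:
            # EarlyStopping-like step: give up after too many stale epochs.
            break
    return best_epoch, lr, history


# Made-up loss curve that improves for three epochs and then plateaus.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65]
best_epoch, final_lr, history = simulate_training(losses)
```

With this loss curve, the best weights come from the third epoch (index 2), the learning rate is halved twice during the plateau, and training stops before the last listed epoch is reached.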

3.3. Comparison between processed and unprocessed image

The evaluation results of the different models, including CNN, VGG16, VGG19, MobileNet, ResNet50, Xception, and InceptionV3, were collected and organized into a DataFrame [13]. This DataFrame contains the evaluation metrics Accuracy, Precision, Recall, AUC, and F1 score for each model. Each row represents a model, and each column represents a specific evaluation metric. The values indicate the performance of each model on the corresponding metric (Figure 2).
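
For reference, the metrics named above can be computed from predictions as in the sketch below. It assumes a binary labelling (1 for cancerous, 0 otherwise) and a 0.5 decision threshold, neither of which is specified in the paper; the function name and the toy labels and scores are likewise illustrative.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Compute accuracy, precision, recall, F1 and AUC for binary labels."""
    y_pred = [1 if score >= threshold else 0 for score in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # AUC as the fraction of (positive, negative) pairs ranked correctly;
    # tied scores count as half a correct pair.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    pairs = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
    auc = sum(pairs) / len(pairs) if pairs else float("nan")

    return {"Accuracy": accuracy, "Precision": precision,
            "Recall": recall, "AUC": auc, "F1": f1}


# Toy labels and scores for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
metrics = binary_metrics(y_true, y_score)
```

One such dictionary per model maps directly onto one row of the DataFrame described above.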

Figure 2. Comparison between processed and unprocessed image

A bar plot is then generated to visualize and compare the performance of different models across the evaluation metrics. Each model is represented by a bar, and the height of the bar indicates the value of the corresponding metric. Different colors are used to distinguish between different evaluation metrics. This visualization allows for a quick comparison of the performance of different models across multiple evaluation metrics, providing insights into the strengths and weaknesses of each model.
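
A grouped bar plot of this kind might be produced with pandas and matplotlib as sketched below. The helper name `plot_model_comparison`, the model labels `model_a` and `model_b`, and all scores are placeholders chosen for illustration; they are not the results reported in this paper.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_model_comparison(results, show=True):
    """Grouped bar plot: one group per model, one coloured bar per metric."""
    ax = results.plot(kind="bar", figsize=(10, 5), rot=0)
    ax.set_xlabel("Model")
    ax.set_ylabel("Score")
    ax.set_ylim(0.0, 1.0)
    ax.legend(title="Metric")
    if show:
        plt.show()
    return ax


# Placeholder scores for illustration only; NOT results from the paper.
results = pd.DataFrame(
    {
        "Accuracy": [0.50, 0.60],
        "Precision": [0.55, 0.65],
        "Recall": [0.45, 0.70],
        "AUC": [0.52, 0.68],
        "F1": [0.50, 0.67],
    },
    index=["model_a", "model_b"],
)

# plot_model_comparison(results)  # uncomment to display the figure
```

Because rows are models and columns are metrics, `DataFrame.plot(kind="bar")` groups the bars by model and assigns one colour per metric, matching the layout described above.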

3.4. Splitting the processed images

Based on the methodology described, the proposed approach aims to enhance the quality and interpretability of lung images for more efficient lung cancer detection using deep learning techniques. The key steps involve histogram equalization and threshold segmentation to preprocess the raw data, followed by data pre-processing and model training using convolutional neural networks (CNNs). Additionally, callback functions are employed during the training process to optimize model performance and prevent overfitting.

Figure 3. Histogram of training results

Evaluation of the trained models, including CNN, VGG16 [14], VGG19, MobileNet, ResNet50, Xception, and InceptionV3, revealed varying performance across the evaluation metrics. Among these models, VGG16, VGG19, and MobileNet achieved high accuracy, precision, recall, AUC, and F1 scores, indicating their effectiveness in lung cancer detection (Figure 3). However, the performance of CNN was comparatively lower across most metrics. The processed images exhibited clearer boundaries and enhanced contrast, leading to improved accuracy and reliability in lung cancer detection.

4. Conclusion

In conclusion, the development of intelligent auxiliary diagnosis systems, fueled by advancements in artificial intelligence technology, represents a significant milestone in healthcare. These systems, adept at disease screening, localization, and treatment planning, hold immense potential in improving diagnostic accuracy and efficiency [15]. As AI continues to evolve, it is imperative to foster collaboration between technology developers, healthcare providers, and policymakers to realize the full potential of AI-assisted diagnosis systems in improving global healthcare standards. In the realm of lung cancer diagnosis and treatment, AI-assisted integrated solutions offer a beacon of hope for early detection and personalized treatment. By leveraging deep learning algorithms and medical imaging data, these systems enable swift and accurate identification of pulmonary nodules, facilitating timely interventions and improved patient prognosis. As we embark on this journey towards AI-driven healthcare, collaboration, innovation, and regulatory support will be pivotal in realizing the transformative potential of AI-assisted diagnosis systems in combating lung cancer and advancing patient care.

References

[1]. Radhika, P. R., Rakhi AS Nair, and G. Veena. "A comparative study of lung cancer detection using machine learning algorithms." 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT). IEEE, 2019.

[2]. Shakeel, P. Mohamed, Mohd Aboobaider Burhanuddin, and Mohamad Ishak Desa. "Lung cancer detection from CT image using improved profuse clustering and deep learning instantaneously trained neural networks." Measurement 145 (2019): 702-712.

[3]. Asuntha, A., and Andy Srinivasan. "Deep learning for lung Cancer detection and classification." Multimedia Tools and Applications 79.11 (2020): 7731-7762.

[4]. Joshua, Eali Stephen Neal, Midhun Chakkravarthy, and Debnath Bhattacharyya. "An Extensive Review on Lung Cancer Detection Using Machine Learning Techniques: A Systematic Study." Rev. d'Intelligence Artif. 34.3 (2020): 351-359.

[5]. Alsinglawi, Belal, et al. "An explainable machine learning framework for lung cancer hospital length of stay prediction." Scientific reports 12.1 (2022): 1-10.

[6]. Yan, Sha, et al. "Computed Tomography Images under Deep Learning Algorithm in the Diagnosis of Perioperative Rehabilitation Nursing for Patients with Lung Cancer." Scientific Programming 2022 (2022).

[7]. Elnakib, Ahmed, Hanan M. Amer, and Fatma EZ Abou-Chadi. "Early lung cancer detection using deep learning optimization." (2020): 82-94.

[8]. Wang, Yong, et al. "Construction and application of artificial intelligence crowdsourcing map based on multi-track GPS data." arXiv preprint arXiv:2402.15796 (2024).

[9]. Zheng, Jiajian, et al. "The Random Forest Model for Analyzing and Forecasting the US Stock Market in the Context of Smart Finance." arXiv preprint arXiv:2402.17194 (2024).

[10]. Yang, Le, et al. "AI-Driven Anonymization: Protecting Personal Data Privacy While Leveraging Machine Learning." arXiv preprint arXiv:2402.17191 (2024).

[11]. Cheng, Qishuo, et al. "Optimizing Portfolio Management and Risk Assessment in Digital Assets Using Deep Learning for Predictive Analysis." arXiv preprint arXiv:2402.15994 (2024).

[12]. Zhu, Mengran, et al. "Utilizing GANs for Fraud Detection: Model Training with Synthetic Transaction Data." arXiv preprint arXiv:2402.09830 (2024).

[13]. Wu, Jiang, et al. "Data Pipeline Training: Integrating AutoML to Optimize the Data Flow of Machine Learning Models." arXiv preprint arXiv:2402.12916 (2024).

[14]. Yu, Hanyi, et al. "Machine Learning-Based Vehicle Intention Trajectory Recognition and Prediction for Autonomous Driving." arXiv preprint arXiv:2402.16036 (2024).

[15]. Huo, Shuning, et al. "Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research." arXiv preprint arXiv:2402.16038 (2024).

Cite this article

Tian, J.; Li, H.; Qi, Y.; Wang, X.; Feng, Y. (2024). Intelligent medical detection and diagnosis assisted by deep learning. Applied and Computational Engineering, 64, 120-125.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 6th International Conference on Computing and Data Science

ISBN: 978-1-83558-425-5 (Print) / 978-1-83558-426-2 (Online)

Editor: Alan Wang, Roman Bauer

Conference website: https://www.confcds.org/

Conference date: 12 September 2024

Series: Applied and Computational Engineering

Volume number: Vol.64

ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.