Exploring the application of machine learning for skin cancer image identification

Shuyue Chen

New Channel Jinan Jinqiu A-Level College,

Block C, Yuquan Shunshin Hotel, 68 Luoyuan Street, Lixia District, Jinan City

1625830192@qq.com

Abstract.

Cancer encompasses a broad spectrum of diseases marked by abnormal cell growth that can potentially spread throughout the body. Common types include breast, lung, colon, rectal, and skin cancers, among others. Early detection is crucial for effective treatment of skin cancer, which ranks among the most prevalent forms of cancer. However, traditional diagnostic methods are time-consuming and depend on the expertise of dermatologists. This research investigates the application of machine learning (ML) for identifying skin cancer from images, with the aim of improving early detection and diagnosis.

Various image preprocessing techniques, feature extraction methods, and ML algorithms are applied to a dataset of skin lesion images. The resulting models are assessed and compared on relevant metrics for their effectiveness in detecting skin cancer.

Preliminary findings indicate that ML algorithms can achieve high accuracy in skin cancer identification, potentially improving diagnostic efficiency and accessibility. However, the research also highlights obstacles such as imbalanced data and the need for model interpretability, which must be addressed for practical implementation.

This research contributes to the expanding knowledge on ML applications in healthcare, particularly in dermatology. It highlights the potential of ML in skin cancer identification and provides insights into the challenges and limitations that need to be overcome for successful implementation. The study emphasizes the necessity for future research to refine these techniques and enhance their clinical applicability.

Keywords: skin cancer, machine learning, image identification

1 Introduction

Cancer remains a significant global healthcare burden, with nearly 10 million deaths (excluding non-melanoma skin cancer) reported worldwide in 2020. [4] Prostate cancer in males, breast cancer in females, and lung cancer in both sexes rank among the most commonly diagnosed types of cancer. Skin cancer, which encompasses both malignant melanoma and non-melanoma skin cancer (NMSC), is also prevalent, and its incidence is increasing, especially among Caucasians. In the United States, the number of people who develop skin cancer each year far exceeds the number who develop any other type of cancer. [1]

Skin cancer encompasses a collection of disorders where abnormal cells in the skin proliferate uncontrollably, leading to the formation of tumors. Exposure to ultraviolet rays is the primary cause of these cancers. Detecting skin cancer early is essential for effective treatment, and specialists employ various techniques, including dermoscopy and biopsies, to assess the malignancy of skin lesions. However, manual diagnosis can be challenging and time-consuming, especially for individuals with underlying skin conditions. [5]

To address the challenges of manual diagnosis and limited access to expert dermatologists, computer-assisted diagnosis using machine learning (ML) has emerged as a potential solution. ML algorithms have shown promise in various medical applications, including cancer diagnosis, by improving prediction accuracy and automating classification tasks. [6] Convolutional neural networks (CNNs) in deep learning have now evolved into a powerful technique for image recognition and classification. This capability enables machine learning models to detect skin cancer in images with accuracy comparable to that of dermatologists. [7]

Despite the potential of ML for identifying skin cancer, problems of data imbalance, varying image quality, model interpretability, and ethical considerations remain unresolved. Understanding these challenges is essential for the practical implementation of ML in dermatology.

This research aims to explore the application of ML for skin cancer image identification, seeking to understand the performance of different ML algorithms in skin cancer identification and uncover the challenges and limitations of their implementation.

This research contributes to the expanding field of knowledge regarding machine learning applications in healthcare. It provides insights into the potential of ML in skin cancer identification and the challenges that need to be overcome for its successful implementation. The findings could inform future research and practice in dermatology.

2 Research Review

2.1 Background and Classification of Skin Cancer

Skin cancer originates in the skin's cells and is categorized into three primary types: basal cell carcinoma, squamous cell carcinoma, and melanoma. [54]

2.1.1 Basal Cell Carcinoma (BCC)

Basal Cell Carcinoma (BCC) is the most common type of skin cancer and is among the most commonly diagnosed cancers overall. BCC originates from the uncontrolled and abnormal growth of basal cells, which are in the lower part of the epidermis (the outermost layer of the skin). [8] The main cause of BCC is prolonged exposure to ultraviolet radiation from the sun or tanning beds. Additional risk factors include fair skin, a history of sunburns, a compromised immune system, and a family history of skin cancer. [4] BCC tends to occur on sun-exposed areas of the body, such as the face, neck, scalp, and hands.

In most cases, BCC grows slowly and seldom spreads to other parts of the body. However, if left untreated, it can still invade surrounding tissues and cause considerable damage. The key to preventing complications and disfigurement is early detection and treatment.

BCC can present with various clinical features. A typical presentation of BCC is a shiny, pearly skin nodule. However, superficial BCC may appear as a red patch similar in appearance to eczema. Infiltrative or morpheaform BCC can manifest as skin thickening or scar-like tissue, making diagnosis challenging without tactile examination and a skin biopsy. Distinguishing BCC from acne scars, actinic elastosis, and recent cryodestruction inflammation can be visually difficult. [9]

2.1.1.1 Diagnosis of BCC

To diagnose BCC, a skin biopsy is commonly conducted to obtain tissue for histopathological analysis. The most commonly used method is a shave biopsy that is performed under local anesthesia. [10] While nodular BCC can often be diagnosed based on clinical examination, other variants can be difficult to differentiate from benign lesions such as sebaceomas, intradermal nevi, fibrous papules, hypertrophic scarring, and early acne scars. Exfoliative cytology methods can be useful in confirming the diagnosis of BCC when there is a high clinical suspicion, but their usefulness is unclear in other cases. [11]

2.1.2 Squamous Cell Carcinoma (SCC)

Squamous cell carcinoma is the second most common form of skin cancer, after basal cell carcinoma. It is a type of skin cancer that begins in the squamous cells, the flat cells located in the outermost layer of the skin.

SCC commonly occurs on sun-exposed areas like the face, neck, hands, arms, and ears. However, it can also develop in less sun-exposed regions such as the genital area, inside the mouth, and on mucous membranes. Additionally, SCC may arise in areas with scars, chronic wounds, or prolonged inflammation.

The appearance of SCC can vary, but it commonly presents as a scaly, rough, or crusted lesion. It may also appear as a firm, pinkish bump, or a red, inflamed patch of skin. In some cases, SCC can ulcerate or bleed. It is important to note that SCC can sometimes resemble other skin conditions, such as eczema, psoriasis, or non-cancerous growths, making it necessary to undergo a proper examination and biopsy for an accurate diagnosis. [12]

2.1.2.1 Diagnosis of SCC

Early detection and prompt treatment are crucial. If not treated, SCC can grow deeper into the skin and potentially spread to nearby lymph nodes or other organs, leading to more serious complications. [54] With the development of diagnostic techniques, the accuracy of SCC diagnosis has improved considerably. [13] Moreover, research on targeted therapies and immunotherapies is ongoing, offering promising options for the treatment of advanced or recurrent SCC.

2.1.3 Melanoma

Melanoma is a skin cancer that arises from melanocytes, the skin cells that produce pigment. Although melanoma is less common than other skin cancers, it is very aggressive and dangerous. If not detected and treated in time, it can lead to death. [55]

Melanoma can develop in any part of the body, including areas that are not typically exposed to sunlight, such as the soles of the feet, underneath the nails, and palms of the hands. However, it is most commonly found on sun-exposed areas of the skin, such as the face, neck, arms, and legs. Melanoma can appear as a new mole or as a change in an existing mole. It may also appear as a dark spot or patch of skin that is asymmetrical, has an irregular border, is multicolored, or is larger than a pencil eraser.

2.1.3.1 Diagnosis of Melanoma

Melanoma is usually diagnosed through a skin biopsy. If melanoma is detected, then more in-depth tests are performed to determine the stage and extent of the cancer. [56]

Early detection of melanoma is essential for enhancing patient outcomes. Regular skin examinations and self-monitoring of moles and skin lesions can aid in the early identification of suspicious changes. When detected and treated at an early stage, the chances of cure are significantly higher.

In summary, understanding the etiology and clinical characteristics of different types of skin cancer is crucial for early diagnosis and effective treatment. This knowledge empowers healthcare professionals and researchers to take appropriate preventive and therapeutic measures. Regular skin checks and seeking medical advice for suspicious skin lesions are essential for early detection and timely intervention. Additionally, understanding the type and stage of skin cancer is fundamental for formulating tailored treatment plans and improving patient prognosis. Public education and awareness campaigns can play a significant role in promoting skin cancer prevention, early detection, and improved patient outcomes.

2.2 Development of Image Recognition Technology

Image recognition technology uses computers to process, analyse, and understand images in order to identify targets and objects with different patterns. It is an essential component of computer vision, a field that encompasses methods for acquiring, processing, analyzing, and comprehending images. [14] The evolution of image recognition technology can be traced through several key stages.

2.2.1 Generation Stage (1960s-2000s)

2.2.1.1 Early Stage (1960s-1970s)

The concept of image recognition technology emerged in the 1960s, when early researchers attempted to mimic the human visual system by integrating a camera with a computer to describe the contents of images. During this period, foundational algorithms such as edge detection, line labeling, and motion estimation were developed. However, these algorithms had limitations in recognizing complex images and patterns. [15]

2.2.1.2 Traditional Image Processing Era (1980s-1990s)

This stage witnessed the use of traditional image processing methods that relied on manually designed feature extraction and classification algorithms. Techniques like thresholding, histogram equalization, and Fourier transforms were employed to preprocess the images. Subsequently, features such as edges, corners, and textures were manually extracted from the preprocessed images. These features were then fed into classification algorithms to recognize objects in the images. While effective for specific image recognition tasks, these methods were not easily scalable and faced challenges with complex and diverse images. [15]
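Two of the preprocessing techniques named above, histogram equalization and thresholding, can be sketched in a few lines. This is only an illustrative sketch in plain Python: the function names, the tiny 2x2 grayscale image, and the cutoff of 128 are assumptions made for the example, not values taken from the cited literature.

```python
def equalize_histogram(image, levels=256):
    """Stretch the intensities of a grayscale image over the full range."""
    pixels = [p for row in image for p in row]
    # Count how often each intensity level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution: number of pixels at or below each level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = min(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # A constant image has no contrast to stretch.
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


def threshold(image, cutoff):
    """Binarize: 1 where the pixel is at or above the cutoff, else 0."""
    return [[1 if p >= cutoff else 0 for p in row] for row in image]


# Hypothetical low-contrast image (intensities 0-255).
low_contrast = [[10, 20], [30, 40]]
stretched = equalize_histogram(low_contrast)
mask = threshold(stretched, 128)
print(stretched)
print(mask)
```

In a traditional pipeline, a mask like this would be the starting point for hand-crafted edge, corner, or texture features.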

2.2.1.3 Machine Learning Era (2000s)

The advent of machine learning marked a significant advancement in image recognition technology. Rather than manually designing and extracting features, machine learning algorithms were utilized to automatically learn these features from the images. Techniques like Support Vector Machines (SVM) and Random Forests were employed for image classification based on the learned features. These methods proved to be more scalable and achieved superior performance on diverse and complex images compared to traditional image processing approaches. [15]

2.2.2 Deep Learning Era (2010s-Present)

The Deep Learning Era, spanning from the 2010s to the present, has brought a revolutionary transformation to the field of image recognition technology. At the forefront of this era is the extensive utilization of deep learning models, notably Convolutional Neural Networks (CNNs), which have emerged as the predominant choice for a variety of image recognition tasks.

2.2.2.1 Convolutional Neural Networks (CNNs)

CNNs are a category of deep learning models designed to learn hierarchical spatial features from images automatically. CNNs have shown exceptional performance in facial recognition, object detection, and scene understanding. [2]

2.2.2.2 Distinctive Features of CNNs

The strength of CNNs lies in their ability to learn abstract and high-level features directly from raw pixel data, obviating the need for manual feature extraction. Utilizing filters applied to smaller image portions, CNNs transform the images into a more manageable form while retaining critical features for accurate predictions. The multi-layered architecture of CNNs enables them to detect various image patterns, making them well-suited to handle the complexity and variability of real-world images. This adaptability enhances their accuracy and robustness in image recognition tasks. [16]
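The idea of applying a filter to small image portions can be made concrete with a minimal sketch of a single convolutional step. The 3x4 image and the hand-picked edge kernel below are assumptions for illustration; in a real CNN the kernel values are learned during training rather than chosen by hand.

```python
def convolve2d(image, kernel):
    """Slide a kernel over an image with no padding and stride 1.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1]]

feature_map = relu(convolve2d(image, edge_kernel))
print(feature_map)
```

The feature map is non-zero only in the column where the intensity changes, which is the sense in which a filter retains the critical features while reducing the image to a more manageable form.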

2.2.2.3 Data and Training

CNNs require substantial amounts of annotated image data for training. Large-scale datasets, like ImageNet, have played a pivotal role in boosting the performance of CNNs. ImageNet provides millions of labeled images, serving as training samples for CNNs to learn and recognize patterns and features. With continuous exposure to diverse images, these models progressively improve their ability to correctly classify new and unseen images. [17]

2.2.2.4 Ongoing Challenges

Despite the success of CNNs, several challenges persist. The first pertains to the demand for vast amounts of labeled data, as obtaining such data can be laborious and time-consuming. Additionally, training deep CNNs necessitates substantial computational resources, which can be expensive and time-intensive. Another challenge involves the lack of interpretability in these complex models, making it difficult to understand the decision-making processes within the neural network. [16]

2.2.2.5 Continued Research and Development

The continuous development of image recognition technology has significantly contributed to various fields, including healthcare, autonomous vehicles, surveillance, and more. The integration of advanced algorithms, machine learning, and deep learning techniques continues to drive advancements in image recognition, leading to more sophisticated and accurate applications across diverse industries.

2.2.3 Pre-training Models and Transfer Learning

Pre-training models and transfer learning are pivotal concepts in the domain of deep learning, particularly concerning image recognition tasks.

2.2.3.1 Pre-training Models

Pre-training entails training a deep learning model on an extensive dataset, often comprising millions of images. The model's learned weights are then saved for future use, enabling it to acquire a robust and generalizable set of features from the large dataset. These pre-trained models are frequently made publicly available and serve as a valuable starting point for various image recognition tasks. Examples of pre-trained models include VGG16, ResNet, and Inception, which have been pre-trained on the vast ImageNet dataset. [18]

2.2.3.2 Transfer Learning

Transfer learning is a machine learning (ML) technique in which a model pre-trained for one task is fine-tuned for use on a new, related task. [57]

2.2.3.3 Implementation in Image Recognition

In the context of image recognition, transfer learning typically involves two steps. First, the pre-trained model (except for the final classification layer) is employed to extract features from the new dataset. Second, a new classification layer is added to the model and trained on the new dataset. The weights of the pre-trained layers may also be adjusted slightly based on the new data, a process commonly referred to as "fine-tuning." [19]

2.2.3.4 Advantages of Transfer Learning

Transfer learning offers several advantages. It enables the training of robust models even with limited data for the new task. Furthermore, it reduces the computational resources required for training, as the model only needs to learn the final classification layer instead of the entire model from scratch. [20]
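The point that only the final classification layer needs to be learned can be illustrated with a toy sketch. Everything here is an assumption made for the example: `frozen_features` is a hypothetical stand-in for a pre-trained backbone such as VGG16 or ResNet, the new head is a logistic regression trained by stochastic gradient descent, and the 2x2 "images" and labels are invented.

```python
import math


def frozen_features(image):
    """Stand-in for a pre-trained backbone; its parameters are never updated."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=500, lr=0.5):
    """Train only the new classification layer (logistic regression)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if score >= 0.5 else 0


# Invented toy data: label 1 marks high-contrast patches, label 0 uniform ones.
images = [
    [[0.2, 0.2], [0.2, 0.2]],
    [[0.3, 0.3], [0.3, 0.3]],
    [[0.1, 0.9], [0.8, 0.2]],
    [[0.0, 1.0], [0.9, 0.1]],
]
labels = [0, 0, 1, 1]

features = [frozen_features(img) for img in images]  # backbone stays fixed
weights, bias = train_head(features, labels)         # only the head is trained
predictions = [predict(weights, bias, x) for x in features]
print(predictions)
```

Because the backbone is fixed, training touches only three numbers (two weights and a bias), which is why this style of transfer learning remains feasible with little data and modest hardware.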

2.2.3.5 Significance in Deep Learning for Image Recognition

Both pre-training models and transfer learning have played pivotal roles in the success of deep learning for image recognition. These techniques have facilitated the development of robust and accurate models, even in scenarios where data and computational resources are constrained.

The evolution of image recognition technology has transitioned from traditional image processing methods to machine learning methods, and subsequently, to the emergence of deep learning. Within this progression, techniques such as pre-training models and transfer learning have played instrumental roles. By continually advancing the theoretical foundations and application forms throughout these stages of development, image recognition technology has experienced rapid growth and extensive adoption. [21]

2.3 Application of Machine Learning in Skin Cancer Image Recognition

2.3.1 Image Classification and Diagnosis

One of the most prevalent applications of machine learning in skin cancer image recognition is the classification of skin lesions. Machine learning algorithms can be trained to accurately classify various types of skin lesions based on their visual characteristics.

To train a machine learning model, a sizable dataset of skin lesion images is necessary. These images are typically sourced from medical databases or obtained through collaborations with dermatologists. Machine learning algorithms rely on a set of distinctive features to differentiate between different types of skin lesions. These features, such as color, texture, and shape, are extracted from the skin lesion images. Subsequently, a machine learning algorithm, often a CNN, is trained on the dataset. CNNs are especially effective as they can learn intricate features from the images. Once trained, the model is evaluated on a separate dataset to assess its accuracy. Metrics like sensitivity, specificity, and accuracy are used to measure the model's performance. Once deemed accurate, the model can be deployed in clinical settings to assist dermatologists in diagnosing skin lesions. [22]
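The three metrics mentioned above follow directly from the counts of correct and incorrect predictions. The sketch below assumes binary labels (1 for malignant, 0 for benign); the function name and the example labels are invented for illustration.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    return {
        # Share of malignant lesions the model catches.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of benign lesions the model correctly clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all lesions classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Invented held-out labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity matters most when a missed malignancy is costly, while specificity limits unnecessary biopsies, so the two are usually reported together rather than summarized by accuracy alone.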

In recent years, there has been an effort to develop systems to help optimise the classification of skin cancers, and one such system is computer-aided diagnosis (CAD). Initially, these systems relied on traditional machine learning (ML) algorithms before the advent of deep learning. However, ML-based methods faced challenges in feature engineering and were limited in their capacity to diagnose a wide range of skin diseases. [23]

Deep learning algorithms, notably Convolutional Neural Networks (CNNs), have effectively overcome these challenges by autonomously learning semantic features from large-scale datasets. This advancement has resulted in significantly higher accuracy and efficiency in skin cancer classification. Consequently, deep learning-based methods have become the predominant approach for addressing skin cancer classification tasks, consistently delivering promising outcomes.

2.3.2 Lesion Segmentation and Localization

Machine learning methods are instrumental in assisting with the localization and segmentation of skin cancer lesions. By training models, it becomes possible to automatically identify the regions of interest within images and generate accurate segmentation results. This capability holds significant value in various aspects of medical practice, including lesion assessment, surgical guidance, and treatment planning.

Through machine learning-based segmentation, the algorithm can accurately outline the boundaries of skin cancer lesions in medical images. [24] This allows for precise delineation of the extent and size of the lesion, aiding dermatologists in making informed decisions regarding diagnosis and treatment. Moreover, lesion segmentation is particularly valuable in the context of treatment planning, as it enables physicians to assess the potential invasiveness of the cancer and tailor therapeutic approaches accordingly. [25]

For surgical procedures, accurate lesion localization is of paramount importance. Machine learning models can help identify the exact location of the skin cancer lesion on a patient's body, enabling surgeons to plan and perform targeted excisions. [26] This not only improves the surgical outcomes but also reduces the risk of unnecessary tissue removal, preserving healthy skin and minimizing postoperative scarring.

Furthermore, in dermatology, the assessment of treatment response is critical in monitoring the progress of skin cancer patients. Machine learning-based segmentation can aid in tracking changes in the size and shape of lesions over time, providing objective measures for evaluating the effectiveness of different treatments.

Overall, machine learning-based lesion segmentation and localization have significantly enhanced the accuracy and efficiency of skin cancer diagnosis, treatment, and monitoring. As the technology continues to evolve, these applications are expected to play an increasingly significant role in clinical practice, benefiting both patients and healthcare professionals alike.

2.3.3 Disease Staging and Risk Assessment

Machine learning plays a crucial role in disease staging and risk assessment of skin cancer patients. By analyzing the features extracted from skin cancer images and integrating clinical data, machine learning models can help determine the stage of the disease and assess the patient's risk profile. This process aids clinical decision-making and provides more precise diagnostic and treatment recommendations for healthcare professionals.

Machine learning algorithms leverage large datasets of patient information and clinical data to learn patterns and associations between image features and disease progression. [27] As a result, these models can predict the malignancy level of skin lesions and the likelihood of disease progression. By identifying high-risk cases early on, physicians can prioritize timely interventions and treatments, leading to better patient outcomes.

Additionally, machine learning-based risk assessment allows dermatologists to tailor treatment plans to individual patients based on their unique risk profiles. [28] This personalised approach improves the effectiveness of the treatment while reducing potential side effects and unnecessary interventions. It also helps optimize healthcare resource allocation by focusing on patients who are at higher risk of disease progression.

Furthermore, disease staging and risk assessment are essential for patient prognosis. Machine learning models can provide insights into the potential course of the disease, helping patients and healthcare providers make informed decisions about long-term care and follow-up.

The integration of machine learning in disease staging and risk assessment marks a significant advancement in the field of dermatology. As these models continue to evolve and improve, they hold great promise in improving the accuracy of skin cancer diagnosis, prognosis, and treatment planning. Ultimately, this technology has the potential to enhance patient care and outcomes, contributing to the advancement of personalized medicine in skin cancer management.

2.3.4 Computer-Aided Diagnosis and Decision Support

Machine learning in skin cancer image recognition also extends to computer-aided diagnosis (CAD) and decision support, providing valuable assistance to medical professionals in the diagnostic process and treatment planning.

2.3.4.1 Computer-Aided Diagnosis

Machine learning models can serve as valuable tools for dermatologists in the diagnostic workflow. By analyzing skin lesion images, the model can make predictions about the likelihood of a lesion being malignant or benign, assisting dermatologists in making more informed decisions. [29] The model's predictions can act as a preliminary assessment, complementing the dermatologist's expertise and aiding in the early detection of potential skin cancers. Moreover, CAD systems can help reduce the risk of misdiagnosis and provide a second opinion, especially in cases where dermatologists encounter challenging or rare skin lesions. This collaborative approach between machine learning models and medical professionals enhances the accuracy and efficiency of the diagnostic process.

2.3.4.2 Decision Support

Beyond diagnosis, machine learning models can offer decision support to dermatologists in formulating personalized treatment plans for patients. [30] By integrating patient data and image analyses, the model can generate comprehensive information about the disease stage, potential risks, and treatment options. Dermatologists can leverage this data-driven insight to tailor treatments based on individual patient characteristics, ensuring more effective and targeted care.

Additionally, decision support systems can keep track of treatment outcomes and patient responses over time, providing continuous feedback for refining treatment strategies and optimizing patient care.

The integration of machine learning models as decision support tools complements the expertise of dermatologists, empowering them with a deeper understanding of skin lesions and their implications. The combined efforts of human expertise and data-driven analysis can lead to more accurate diagnoses, better treatment decisions, and improved patient outcomes.

However, it is important to note that machine learning models are not meant to replace dermatologists, but rather to work collaboratively with them. The role of dermatologists in the diagnostic and treatment process remains vital, as they provide clinical judgment, interpret model outputs, and make the final decisions. [30] By harnessing the potential of machine learning as a supportive tool, the field of skin cancer image recognition is advancing towards more comprehensive and patient-centered care.

3 Discussion/Development

3.1 The accuracy of skin cancer image recognition

In this section, we delve into the accuracy of skin cancer image recognition using machine learning techniques. We will initially introduce the datasets and experimental designs employed to assess accuracy. Subsequently, we will demonstrate the performance of various machine learning models in skin cancer image recognition, encompassing their classification accuracy and diagnostic capabilities for different types of skin cancer lesions.

3.1.1 Datasets and Experimental Designs

The reliability and consistency of open datasets are pivotal in the development of algorithms, especially since contemporary solutions predominantly rely on data-driven approaches. Neural network-based diagnostic algorithms necessitate a substantial number of annotated images for effective training. Nevertheless, the availability of high-quality dermatoscopic images accompanied by dependable diagnoses is restricted and often confined to a limited number of disease categories. [31]

The PH2 dataset, introduced by Mendonça et al. in 2013, consisted of 200 dermatoscopic images comprising 160 nevi and 40 melanomas. While pathology information was available for melanomas, it was sparse for most nevi. The PH2 dataset, owing to its comprehensive metadata and public accessibility, emerged as a benchmark for studies on computer-based melanoma diagnosis. [32]

To advance research in automated diagnosis of dermoscopic images, researchers introduced the HAM10000 ("Human Against Machine with 10,000 training images") dataset. The dataset consists of 10,015 dermoscopy images from a variety of populations and imaging modalities, specifically designed for machine learning purposes and publicly available through the ISIC archive. The HAM10000 dataset serves as a valuable resource for training machine learning algorithms, enabling comparison between automated systems and human experts. It encompasses a comprehensive range of diagnostic categories related to pigmented lesions, a substantial portion of which have been pathologically confirmed, while others have been validated through follow-up, expert consensus, or confirmation via in-vivo confocal microscopy. This benchmark dataset facilitates algorithm development and evaluation in dermatology. Researchers can leverage this dataset to train and assess their machine learning models, fostering advancements and promoting collaboration within the field. [33]

Established in 2016, the International Skin Imaging Collaboration (ISIC) aims to advance machine learning solutions in skin cancer diagnosis. The ISIC dataset has grown significantly from ISIC 2016 to ISIC 2020; the 2020 release encompasses 2,056 patients across three continents. Among these patients, 20.8% have at least one melanoma and 79.2% have none. The dataset includes a total of 33,126 dermoscopic images, with an average of 16 lesions per patient. Within these images, there are 584 (1.8%) histopathologically confirmed melanomas alongside benign melanoma mimickers.

The ISIC 2020 dataset, particularly through initiatives like the SIIM-ISIC Melanoma Classification challenge, incorporates patient-level contextual information, enhancing its real-world applicability. This dataset supports a variety of scenarios and is a valuable resource for researchers and clinicians working on algorithms for melanoma diagnosis. The inclusion of patient-level contextual information and histopathologically confirmed cases significantly enhances the dataset's utility, enabling the development of more accurate and reliable diagnostic tools for melanoma classification. [34]
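The figures above also show why imbalanced data is a practical obstacle. As a short worked example, consider a hypothetical classifier that labels every image as benign; only the image and melanoma counts come from the text, and the always-benign model is an assumption introduced to make the arithmetic concrete.

```python
total_images = 33126   # dermoscopic images in the dataset
melanomas = 584        # histopathologically confirmed melanomas
benign = total_images - melanomas

# Hypothetical baseline: predict "benign" for every image.
true_positives = 0
false_positives = 0
true_negatives = benign
false_negatives = melanomas

prevalence = melanomas / total_images
accuracy = (true_positives + true_negatives) / total_images
sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(f"prevalence:  {prevalence:.4f}")
print(f"accuracy:    {accuracy:.4f}")
print(f"sensitivity: {sensitivity:.4f}")
print(f"specificity: {specificity:.4f}")
```

This baseline is right on roughly 98% of images while detecting no melanomas at all, so accuracy by itself says little on such data and sensitivity and specificity need to be reported alongside it.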

ISIC 2020 marks a substantial improvement over ISIC 2016 due to its larger dataset size, richer patient-level contextual information, inclusion of histopathologically confirmed melanomas, and the introduction of challenges like the Melanoma Classification challenge. These advancements establish ISIC 2020 as an indispensable asset for researchers and clinicians in the automated diagnosis of skin lesions, particularly in the context of melanoma.

3.1.2 Performance Comparison of Different Machine Learning Models

In this section, we will delve into the performance comparison of various machine learning models in the context of skin cancer image recognition. We will start by introducing common machine learning algorithms, including Support Vector Machines (SVM), decision trees, and random forests, providing a comprehensive overview of their characteristics and functionalities. Subsequently, we will conduct a comparative analysis of accuracy and efficiency among these algorithms within the realm of skin cancer image classification tasks.

3.1.2.1 Introduction to Common Machine Learning Algorithms

Several machine learning algorithms have been widely applied to skin cancer image recognition. Among these, Support Vector Machines (SVM), decision trees, and random forests are notable examples. [35] Support Vector Machines employ a linear or non-linear separation boundary to classify data points. Decision trees, on the other hand, construct a tree-like structure to make decisions by splitting data based on features. Random forests aggregate predictions from multiple decision trees to improve accuracy and mitigate overfitting.

While traditional algorithms have been widely used in the past, they often require careful feature engineering and may not capture complex patterns or contextual information present in skin lesion images. Deep learning approaches, especially convolutional neural networks (CNNs), have shown significant advancements in skin cancer recognition by automatically learning relevant features from the raw image data.

3.1.2.2 Accuracy and Efficiency Comparison

To comprehensively evaluate the performance of these algorithms, it is essential to compare their accuracy and efficiency in skin cancer image classification tasks. Accuracy pertains to the models' ability to correctly classify different types of skin lesions, while efficiency encompasses aspects like training time and prediction time. An in-depth analysis of these factors will provide insights into the strengths and weaknesses of each algorithm in a skin cancer diagnosis context.
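As a minimal sketch of how accuracy and efficiency might be measured together, the code below computes accuracy, sensitivity, and specificity from binary predictions and times the prediction step. The function names, the toy labels, and the `always_benign` model are hypothetical and serve only to illustrate the measurement.

```python
import time

def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = malignant, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def evaluate(predict, inputs, y_true):
    """Return accuracy-type metrics plus the mean prediction time per input."""
    start = time.perf_counter()
    y_pred = [predict(x) for x in inputs]
    elapsed = time.perf_counter() - start
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "seconds_per_image": elapsed / len(inputs),
    }

def always_benign(_image):
    """Hypothetical trivial model that never predicts malignancy."""
    return 0

# Toy example: 2 malignant lesions among 100.
inputs = list(range(100))
y_true = [1, 1] + [0] * 98
report = evaluate(always_benign, inputs, y_true)
print(report)
```

In this toy case the trivial model reaches 98% accuracy while detecting no malignant lesion at all, which is why sensitivity and specificity should be reported alongside accuracy when classes are imbalanced.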

3.1.2.3 Model Selection and Performance Trade-offs

Each machine learning algorithm exhibits distinct advantages and limitations. [39] For instance, SVMs are effective in handling high-dimensional data and, when combined with kernel functions, remain useful when the data is not linearly separable. Decision trees are interpretable and can capture complex relationships in the data. Random forests combine multiple decision trees to enhance accuracy and generalization. The choice of the most appropriate algorithm depends on the specific requirements of the application, the available data, and computational resources. Balancing accuracy, interpretability, computational complexity, and robustness will be pivotal in selecting the optimal algorithm for skin cancer image recognition tasks.

3.2 Challenges and Restraints

A major challenge is that neural network-based skin cancer detection techniques require extensive training. Learning to analyse and interpret features from dermoscopic images is a lengthy process that demands very powerful hardware resources. This requirement creates a significant barrier to the development and implementation of effective skin cancer detection systems.

Deep learning-based skin cancer detection heavily relies on powerful hardware resources with high GPU power. [40] These resources are essential for the neural network software to effectively extract the distinctive features from lesion images, which plays a crucial role in achieving improved accuracy in skin cancer detection. However, a significant challenge in training deep learning models for skin cancer detection arises from the limited availability of high computing power. This scarcity hampers the progress and implementation of deep learning techniques in this domain.

The utilization of real-world datasets for skin cancer diagnosis often encounters a significant issue of class imbalance. [41] These imbalanced datasets consist of a varying number of images for each type of skin cancer. For instance, there might be a substantial number of images available for common skin cancer types, while only a limited number of images exist for the less common types. Consequently, this imbalance poses challenges in deriving meaningful insights from the visual features of dermoscopic images and drawing generalizations that encompass all types of skin cancer.
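One common way to counter class imbalance during training is to weight each class inversely to its frequency, so that errors on rare classes cost more. The sketch below is an illustration under assumed label counts; the class names and the 90/10 split are hypothetical and not taken from any dataset discussed here.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced label set: 90 common lesions, 10 rare ones.
labels = ["nevus"] * 90 + ["melanoma"] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights, each class contributes the same total weight to a weighted loss, regardless of how many images it has.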

Standard dermoscopic datasets primarily consist of images of young individuals, while certain types of skin cancers, such as Merkel cell cancer, BCC, and SCC, typically manifest in individuals over the age of 65. [40] To ensure accurate diagnosis of skin cancer in elderly patients, it is crucial for neural networks to have access to a sufficient number of images depicting individuals aged 50 years and above. This inclusion of diverse age groups within the training data is essential for enhancing the effectiveness of neural networks in accurately detecting skin cancer in older individuals.

The quality and quantity of datasets play a crucial role in the performance and reliability of machine learning models for skin cancer recognition. In terms of quality, datasets must be accurately labeled by expert dermatologists or pathologists, because inaccurate or inconsistent labeling can lead to biased or incorrect model predictions. Additionally, the datasets should ideally include diverse skin types, lesion types, and demographics to ensure the model's generalizability across different populations. Consistency in annotation is equally important for reliable training and evaluation of machine learning models. [42] Multiple expert annotators should label the same dataset so that inter-annotator agreement can be assessed and discrepancies resolved. Establishing clear guidelines and protocols for annotation can help improve consistency. To advance the field, healthcare institutions, dermatologists, and machine learning researchers can collaborate by collecting and sharing high-quality datasets.
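Inter-annotator agreement can be quantified in several ways. The sketch below uses Cohen's kappa for two annotators, which is one standard choice and not necessarily the measure used in [42]; the two label lists are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    classes = set(freq_a) | set(freq_b)
    # Agreement expected if both raters labeled at random with their own class rates.
    expected = sum(freq_a[c] * freq_b[c] for c in classes) / n ** 2
    if expected == 1.0:  # both raters used a single identical class throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two dermatologists for the same ten lesions.
rater_a = ["mel", "mel", "nev", "nev", "nev", "nev", "mel", "nev", "nev", "nev"]
rater_b = ["mel", "nev", "nev", "nev", "nev", "nev", "mel", "nev", "mel", "nev"]
kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f}")
```

Here the raters agree on 8 of 10 lesions, but kappa is noticeably lower than 0.8 because much of that agreement would be expected by chance when one class dominates.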

Overfitting happens when a machine learning model becomes overly complex and adapts too closely to the training data, leading to poor performance when applied to new, unseen data. [58] Generalization ability refers to a model's ability to perform well on new, unseen data, beyond the training dataset. To avoid overfitting and enhance the model's ability to generalize, techniques such as cross-validation, early stopping, and regularization are employed. These methods help prevent the model from memorizing the training data too closely and encourage it to learn broader patterns that apply to new data. Additionally, increasing the size and diversity of the training data can also improve the generalization ability of the model.
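Of the techniques named above, early stopping is the simplest to illustrate. The sketch below assumes a list of per-epoch validation losses is already available; the loss values and the `patience` setting are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1

# Hypothetical validation curve: improves, then rises as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.56, 0.60]
best_epoch, stop_epoch = early_stopping_epoch(val_losses, patience=3)
print(best_epoch, stop_epoch)
```

In practice the model weights saved at `best_epoch` would be restored, so the final model is the one that generalized best to the validation set rather than the one that fit the training data most closely.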

Real-time systems are computer systems that have strict timing constraints and require timely and predictable responses to input data. [3] These systems are used in a variety of applications, including industrial control systems, transportation systems, and medical devices. In a real-time system, the response time is critical, and delays can lead to system failures or safety hazards. Real-time systems can be categorized into hard real-time and soft real-time systems. Hard real-time systems operate under stringent timing constraints where meeting deadlines is critical; failure to meet a deadline can result in system failure. Soft real-time systems have less strict timing constraints, and missing a deadline may not necessarily result in system failure but can lead to degraded system performance. Machine learning models used in real-time systems must be designed to meet these timing constraints and provide timely and accurate responses to input data. [59] This can be challenging due to the computational complexity of many machine learning algorithms. Techniques such as model compression, quantization, and pruning can be used to reduce the computational requirements of the model and improve its efficiency. Additionally, hardware acceleration can be used to speed up the inference process and reduce the response time of the system.
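To make the idea of quantization concrete, the sketch below applies symmetric 8-bit uniform quantization to a small list of weights. This is one simple scheme chosen for illustration, not necessarily the method used in the cited work, and the weight values are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]

# Hypothetical layer weights.
weights = [0.42, -1.27, 0.003, 0.88, -0.5]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored as one small integer instead of a 32-bit float, at the cost of a rounding error of at most half the scale per weight. Whether that error is acceptable for a diagnostic model has to be verified on validation data.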

3.3 Solutions of Practical Application of Machine Learning in Dermatology

While machine learning still encounters various challenges in clinical skin cancer diagnosis, technological advancements and solution optimizations are steadily bridging the gap between the ideal and the real. The following solutions are proposed to advance the practical implementation of machine learning in dermatology:

3.3.1 Data Augmentation and Incremental Learning

Data Augmentation can be employed on skin lesion images to enhance the diversity and quantity of the training dataset. Through transformations such as rotation, flipping, scaling, and noise addition, augmented data can simulate variations that occur in real-world scenarios. This aids the machine learning model in learning robust features and improving its ability to generalize to unseen data. Data augmentation also addresses issues like class imbalance, where certain types of skin lesions may be underrepresented in the dataset.
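The transformations listed above can be sketched on a tiny grayscale image stored as a nested list. The code is a dependency-free illustration under that assumption; a real pipeline would typically use an image library, and the noise level and probabilities are hypothetical.

```python
import random

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def add_noise(img, rng, sigma=5.0):
    """Add Gaussian noise and clip pixel values to the range 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]

def augment(img, rng):
    """Apply a random combination of flips, rotations, and noise."""
    out = img
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    if rng.random() < 0.5:
        out = flip_vertical(out)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        out = rotate_90(out)
    return add_noise(out, rng)

# Hypothetical 3x3 grayscale patch.
img = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
rng = random.Random(42)
augmented = [augment(img, rng) for _ in range(4)]
```

Flips and quarter turns are natural choices for dermoscopic images because a lesion has no canonical orientation, so the label is unchanged by the transformation.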

Incremental Learning proves particularly useful in the field of skin cancer diagnosis due to the continuous availability of new data over time. As new cases are diagnosed or medical knowledge evolves, it becomes vital for the model to adapt continuously. Incremental learning enables the model to update its knowledge with new data without discarding previously learned information. This approach helps the model stay up-to-date with the latest patterns and characteristics of skin cancer, resulting in improved accuracy and performance. [43]
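A minimal sketch of the idea, assuming a simple online logistic regression rather than the method of [43]: the model keeps its weights between calls, so each new batch refines what was learned earlier instead of restarting training. The class name, the two features, and all data values are hypothetical.

```python
import math

class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time with gradient steps."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch_x, batch_y, epochs=1):
        """Update the existing weights using a new batch of labeled samples."""
        for _ in range(epochs):
            for x, y in zip(batch_x, batch_y):
                err = self.predict_proba(x) - y
                self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= self.lr * err

# Hypothetical features per lesion: (asymmetry, border irregularity), scaled to 0..1.
first_x = [[0.9, 0.8], [0.1, 0.2], [0.8, 0.7], [0.2, 0.1]]
first_y = [1, 0, 1, 0]
model = OnlineLogisticRegression(n_features=2, lr=0.5)
model.partial_fit(first_x, first_y, epochs=200)

# Later, newly diagnosed cases arrive and the same model is updated in place.
new_x = [[0.7, 0.9], [0.3, 0.2]]
new_y = [1, 0]
model.partial_fit(new_x, new_y, epochs=20)
```

This toy example does not address forgetting: if new batches differ strongly from earlier data, plain gradient updates can overwrite earlier knowledge, which is the problem dedicated incremental learning methods are designed to handle.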

The combination of data augmentation and incremental learning offers several advantages to skin cancer diagnosis systems. Data augmentation enriches the diversity and quantity of training data, enhancing the model's learning efficiency. Incremental learning ensures that the model adapts to new cases and updates its knowledge, progressively enhancing accuracy. This integration ultimately leads to more reliable and accurate skin cancer diagnosis systems capable of keeping pace with evolving medical knowledge and delivering better patient care.

3.3.2 Ensemble Machine Learning Techniques

Ensemble machine learning techniques encompass methods that amalgamate multiple machine learning models to create a more potent and accurate model. [44] These techniques aim to enhance the performance of individual models by harnessing the collective power of multiple diverse models. By combining multiple models, ensemble techniques mitigate the limitations of individual models and elevate overall accuracy. They also bolster prediction robustness by diminishing the influence of outliers or noisy data points.

Three primary ensemble techniques have been widely applied to skin cancer detection: Bagging, Boosting, and Stacking.

Bagging entails training multiple models on different subsets of the training data and amalgamating their predictions through voting or averaging. This technique mitigates overfitting and elevates the model's generalization capability.

Boosting focuses on sequentially training models, with each subsequent model rectifying the errors made by the previous ones. This iterative process enhances the ensemble's overall accuracy.

Stacking necessitates training multiple models and employing their predictions as input to a meta-model, which delivers the final prediction. This technique capitalizes on the strengths of various models and can lead to improved performance.
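Of the three techniques, bagging is the easiest to sketch without external libraries. The example below trains simple threshold classifiers (decision stumps) on bootstrap samples of a single feature and combines them by majority vote. The feature values, labels, and number of models are hypothetical, and the stump is a stand-in for whatever base learner a real system would use.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Pick the threshold t that minimises errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(xs, ys, n_models, rng):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def bagging_predict(thresholds, x):
    """Combine the stumps' predictions by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical single feature (e.g. a lesion irregularity score) and labels.
xs = [0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.80, 0.90]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = bagging_fit(xs, ys, n_models=11, rng=random.Random(0))
print(bagging_predict(ensemble, 0.95), bagging_predict(ensemble, 0.05))
```

An odd number of models avoids tied votes in the binary case. Boosting and stacking differ in how the base models are trained and combined, but follow the same pattern of fitting several learners and aggregating their outputs.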

Ensemble machine learning techniques have demonstrated their effectiveness in skin cancer detection tasks. However, the success of ensemble learning hinges on the diversity of the base learners. If the base learners are excessively similar or correlated, the ensemble model may not outperform individual models. [45] Thus, ensuring diversity among the base learners is pivotal for achieving superior performance in ensemble learning.

3.3.3 Feature Selection and Deep Learning

Deep learning models have demonstrated remarkable capabilities in autonomously learning valuable features from data. However, feature selection remains a crucial step in optimizing the performance of these models. Researchers can harness deep learning's representational power while also integrating manual feature selection techniques to further enhance their models' performance.

By amalgamating the strengths of automated feature learning and manual feature selection, researchers can attain a more precise and efficient feature representation. This approach facilitates the extraction of pertinent and discriminative features from the input data, thereby enhancing the model's capacity to capture critical patterns and characteristics related to skin cancer. [46] Manual feature selection techniques can pinpoint specific features known to be informative in skin cancer detection. Domain experts can contribute valuable insights and knowledge about the disease, enabling them to select features highly relevant to the task at hand. By incorporating these manually chosen features into deep learning models, researchers can augment the model's interpretability and potentially enhance its performance.

3.3.4 Multi-modal Analysis

Dermatologists frequently encounter a wealth of information sources, encompassing medical images and non-image data in the form of metadata (e.g., clinical or demographic information). Medical decision-making entails the integration and analysis of various facets of this information. To optimize the utilization of data from multiple sources, multi-modal fusion is being introduced for the detection and classification of skin cancer. [47] By amalgamating diverse data sources, healthcare professionals can make more informed decisions regarding patient care, ultimately leading to earlier detection, improved treatment, and enhanced survival rates.

Multi-modal analysis can be facilitated by machine learning algorithms capable of analyzing and integrating data from multiple sources to provide a more precise diagnosis. For instance, an automated diagnostic system can analyze clinical images of a skin lesion and amalgamate this information with the patient's medical history to deliver a more accurate diagnosis. The system can also examine genetic information to identify any genetic predispositions to skin cancer and offer personalized recommendations to the patient.
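The text does not specify how the sources are combined. One simple option is late fusion, in which separate models produce a malignancy probability from the image and from the metadata, and the two are merged afterwards. The sketch below assumes this strategy; the function names, the weight `w_image`, and the example probabilities are hypothetical.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1 for stability."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def late_fusion(p_image, p_metadata, w_image=0.7):
    """Weighted average of the two models' log-odds, mapped back to a probability."""
    z = w_image * logit(p_image) + (1 - w_image) * logit(p_metadata)
    return sigmoid(z)

# Hypothetical outputs: the image model is unsure, the patient history raises concern.
p_fused = late_fusion(p_image=0.6, p_metadata=0.9)
print(f"fused probability: {p_fused:.3f}")
```

The fused value always lies between the two inputs, and the weight controls how much each modality is trusted. Alternatives such as feeding image features and metadata into one joint model are also possible and may perform better when the modalities interact.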

3.3.5 Ethical Considerations and Data Security

Ethical and data security issues are key non-technical considerations for machine learning in skin cancer applications. First, the data required for machine learning contain individuals' biological and private information, which is highly sensitive. Safeguarding privacy is therefore a critical challenge, as large datasets used in machine learning may include sensitive information that could be misused or exposed through cyber-theft or accidental disclosure. Second, legality and ethical compliance are prerequisites for conducting medical research and data analysis, and non-compliant data use may lead to legal issues. [4] Third, the use of ML in skin cancer detection must also consider the impact on healthcare staff employment and the nature of clinical work, as well as the effects of ML on health inequalities and access to medical care. Regulation and governance challenges need to be addressed to ensure that ML systems are developed and used responsibly, with a focus on patient safety and the protection of human rights. Furthermore, the integration of machine learning and big data analytics in skin cancer detection and treatment raises concerns about the control, reliability, and trustworthiness of ML technology. The potential for ML systems to operate as "black boxes" can undermine transparency and patient autonomy, leading to a loss of control over medical decisions. [48]

Informed consent is a cornerstone of ethical medical research and must be obtained in a manner that respects the autonomy of the patient and provides them with sufficient information about how their data will be used. [4] This includes the potential risks, benefits, and the purpose of the data collection. Privacy protection must be rigorously enforced, ensuring that personal health information is secured against unauthorized access and breaches. This is especially relevant given the large datasets required for machine learning, which may contain highly sensitive personal information.

To address these concerns, a transparent data use mechanism should be established. This mechanism would involve clear communication with patients about how their data will be used, stored, and shared. It would also include the implementation of strong data governance policies that comply with international standards, such as those issued by UNESCO and WHO, as well as national regulations like the Health Insurance Portability and Accountability Act (HIPAA) in the United States or the General Data Protection Regulation (GDPR) in the European Union. [49]

Moreover, the mechanism should allow for the ongoing monitoring of ML systems to ensure they are operating as intended and not perpetuating biases or errors that could harm patients. It should also provide a framework for accountability, where the responsibility for any issues arising from the use of ML in healthcare can be clearly defined and addressed.

3.3.6 Future Directions

The future of machine learning in skin cancer detection holds great promise, with ongoing advancements in algorithm optimization, dataset expansion, automated diagnosis, and personalized treatment expected to significantly enhance patient outcomes and revolutionize the field of dermatology.

3.3.6.1 Algorithm Optimization

Algorithm optimization is crucial for improving the accuracy, robustness, and generalization capabilities of machine learning algorithms in skin cancer detection. Key strategies include hyperparameter optimization and architectural refinement. Hyperparameters, which are set by the user rather than learned by the algorithm itself, can be tuned using techniques such as grid search, random search, or Bayesian optimization to find the combination that maximizes performance. [50] Architectural optimization involves experimenting with different network architectures, layer configurations, activation functions, or optimization algorithms to identify the most effective setup for the specific skin cancer detection task. This may involve techniques like neural architecture search or automated machine learning for automated architecture exploration and optimization.
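Grid search and random search can be sketched as follows. The `validation_score` function is a hypothetical stand-in for training a model and measuring its validation accuracy, and the hyperparameter names and ranges are assumptions chosen for illustration.

```python
import itertools
import random

def validation_score(learning_rate, dropout):
    """Hypothetical stand-in for 'train a model, return validation accuracy'."""
    return 0.9 - 50 * (learning_rate - 0.01) ** 2 - 0.5 * (dropout - 0.3) ** 2

def grid_search(grid, score_fn):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def random_search(score_fn, n_trials, rng):
    """Sample hyperparameters at random from assumed ranges and keep the best."""
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, -1),  # log-uniform in [0.001, 0.1]
            "dropout": rng.uniform(0.1, 0.5),
        }
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

grid = {"learning_rate": [0.001, 0.01, 0.1], "dropout": [0.1, 0.3, 0.5]}
best_params, best_score = grid_search(grid, validation_score)
rand_params, rand_score = random_search(validation_score, n_trials=20, rng=random.Random(0))
print(best_params, best_score)
```

Grid search cost grows multiplicatively with the number of hyperparameters, whereas random search fixes the budget in advance, which is why it is often preferred when each evaluation means training a deep network.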

3.3.6.2 Expanding the Dataset

Advancing machine learning in skin cancer detection necessitates high-quality and diverse datasets. By prioritizing such datasets, researchers and practitioners can train more resilient, precise, and real-world-applicable machine learning models for skin cancer detection. [52] Collaborations with hospitals, clinics, and dermatology practices are crucial to accessing patient databases and collecting a wide range of skin cancer images. These collaborative efforts can lead to a larger and more diversified dataset, encompassing various demographics, skin types, and stages of skin cancer. Promoting the sharing of skin cancer datasets through open data initiatives can foster collaboration, expedite research, and facilitate the development of more accurate and broadly applicable machine learning models.

3.3.6.3 Automated Diagnosis

Automated diagnostics in skin cancer can advance early detection, reduce unnecessary biopsies, and aid healthcare professionals in making precise and timely diagnoses. [51] This entails developing fully automated skin cancer diagnosis systems capable of delivering accurate and dependable diagnoses without human intervention. It also involves integrating machine learning algorithms into clinical decision support systems to assist dermatologists in making more precise and prompt diagnoses. The utilization of deep learning models, capable of learning from extensive datasets and generalizing effectively to unseen samples, can result in more accurate and robust automated diagnosis systems.

3.3.6.4 Personalized Treatment

Personalized risk assessment can identify individuals at higher risk of developing skin cancer, enabling more frequent screenings and preventive measures. Machine learning algorithms can be leveraged to create personalized risk assessment models for skin cancer (Gordon, 2022). These models analyze individual patient data, encompassing demographics, sun exposure, medical history, family history, and genetic information, to identify risk factors for skin cancer. The algorithms then provide a personalized risk score, helping healthcare professionals determine appropriate screening and prevention measures for each patient. The use of machine learning in developing personalized risk assessment models can improve early detection, reduce skin cancer incidence, and ultimately lead to better patient outcomes.
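The structure of such a risk score can be illustrated with a logistic model. In the sketch below, every coefficient, the intercept, and the feature names are hypothetical placeholders, not values fitted to data or drawn from any cited study, and the output has no clinical validity.

```python
import math

# HYPOTHETICAL coefficients for illustration only; a real model would be
# fitted to patient data and clinically validated.
COEFFICIENTS = {
    "age_decades": 0.2,
    "severe_sunburns": 0.3,
    "family_history": 0.8,
    "fair_skin": 0.5,
}
INTERCEPT = -4.0

def risk_score(patient):
    """Map patient features to a score between 0 and 1 with a logistic function."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))

# Two hypothetical patients who differ only in family history.
patient_a = {"age_decades": 5, "severe_sunburns": 2, "family_history": 0, "fair_skin": 1}
patient_b = dict(patient_a, family_history=1)
score_a, score_b = risk_score(patient_a), risk_score(patient_b)
print(f"{score_a:.3f} {score_b:.3f}")
```

A linear model of this kind keeps each factor's contribution visible, which supports the interpretability concerns raised earlier. More flexible models may capture interactions between risk factors at the cost of transparency.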

4 Conclusion

Machine learning, especially deep learning models like convolutional neural networks (CNNs), holds significant promise for improving early detection and diagnosis of skin cancer through image analysis. These algorithms have shown high accuracy in identifying skin cancer from images by learning intricate patterns and features from extensive datasets. As a result, they can classify skin lesions with accuracy levels that rival those of dermatologists.

The use of machine learning in dermatology offers several benefits. It can improve diagnostic efficiency by providing automated and objective assessments of skin lesions. This can help overcome the limitations of traditional diagnostic methods, which are time-consuming and rely on the expertise of dermatologists. Machine learning algorithms can also assist in lesion segmentation and localization, aiding in treatment planning and surgical guidance.

However, the practical application of machine learning in dermatology still faces several main limitations, such as data imbalance, limited model interpretability, and the requirement for large and diverse datasets. It is crucial for researchers and practitioners to ensure the availability of high-quality and representative datasets that can effectively train and evaluate machine learning models. Additionally, efforts should focus on enhancing the interpretability of deep learning models, as their decision-making processes can be complex and opaque. Addressing these challenges will be essential to furthering the reliable implementation of machine learning in dermatological applications.

Despite these limitations, machine learning continues to hold promise for the future of dermatology. Continued research and development in algorithm optimization, data augmentation, transfer learning, and multi-modal analysis will further improve the accuracy and applicability of machine learning models in skin cancer identification. These advancements have the potential to revolutionize the field of dermatology, leading to earlier detection, personalized treatment plans, and improved patient outcomes.

In summary, machine learning has the potential to significantly enhance the early detection and diagnosis of skin cancer. By leveraging advanced algorithms and techniques, researchers and practitioners can improve the accuracy, efficiency, and accessibility of skin cancer identification. Continued collaboration between machine learning experts, dermatologists, and healthcare institutions will be crucial in refining these techniques and translating them into practical applications that benefit patients and healthcare professionals alike.

References

[1]. U.S. Department of Health and Human Services. (2014). The Surgeon General’s call to action to prevent skin cancer. U.S. Dept of Health and Human Services, Office of the Surgeon General. Available at http://www.surgeongeneral.gov

[2]. Teuwen, J., & Moriakov, N. (2020). Chapter 20 - Convolutional neural networks. In S. K. Zhou, D. Rueckert, & G. Fichtinger (Eds.), Handbook of medical image computing and computer-assisted intervention (pp. 481-501). Academic Press. doi:10.1016/B978-0-12-816176-0.00025-9

[3]. Erciyes, K. (2019). Introduction to real-time systems. In Distributed real-time systems. Springer, Cham. doi:10.1007/978-3-030-22570-4_1

[4]. World Health Organization. (2021). Ethics and governance of artificial intelligence for health: WHO guidance. Geneva. e-ISBN: 978-92-4-002920-0. Available at https://iris.who.int/bitstream/handle/10665/341996/9789240029200-eng.pdf?sequence=1

[5]. Apalla, Z., Nashan, D., Weller, R. B., & others. (2017). Skin cancer: Epidemiology, disease burden, pathophysiology, diagnosis, and therapeutic approaches. Dermatology and Therapy (Heidelberg), 7(Suppl 1), 5–19. doi:10.1007/s13555-016-0165-y

[6]. Kourou, K., Exarchos, T. P., Exarchos, K. P., Karamouzis, M. V., & Fotiadis, D. I. (2014). Machine learning applications in cancer prognosis and prediction. Computational and Structural Biotechnology Journal, 13, 8-17. doi:10.1016/j.csbj.2014.11.005

[7]. Cheng, G., Yang, C., Yao, X., Guo, L., & Han, J. (2018). When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs. IEEE Transactions on Geoscience and Remote Sensing, 56(5), 2811-2821. doi:10.1109/TGRS.2017.2783902

[8]. Naik, P. P., & Desai, M. B. (2022). Basal cell carcinoma: A narrative review on contemporary diagnosis and management. Oncology and Therapy, 10, 317-335. doi:10.1007/s40487-022-00201-8

[9]. Cameron, M. C., Lee, E., Hibler, B. P., Barker, C. A., Mori, S., Cordova, M., Nehal, K. S., & Rossi, A. M. (2019). Basal cell carcinoma: Epidemiology, pathophysiology, clinical and histological subtypes, and disease associations. Journal of the American Academy of Dermatology, 80(2), 303-317. doi:10.1016/j.jaad.2018.03.060

[10]. Tanese, K. (2019). Diagnosis and management of basal cell carcinoma. Current Treatment Options in Oncology, 20(2), 13. doi:10.1007/s11864-019-0610-0

[11]. Ferrante di Ruffano, L., Dinnes, J., Chuchu, N., Bayliss, S. E., Takwoingi, Y., Davenport, C., Matin, R. N., O'Sullivan, C., Roskell, D., Deeks, J. J., Williams, H. C., & Cochrane Skin Cancer Diagnostic Test Accuracy Group. (2018). Exfoliative cytology for diagnosing basal cell carcinoma and other skin cancers in adults. The Cochrane Database of Systematic Reviews, 12(12), CD013187. doi:10.1002/14651858.CD013187

[12]. Yanofsky, V. R., Mercer, S. E., & Phelps, R. G. (2011). Histopathological variants of cutaneous squamous cell carcinoma: A review. Journal of Skin Cancer, 2011, 210813. doi:10.1155/2011/210813

[13]. Combalia, A., & Carrera, C. (2020). Squamous cell carcinoma: An update on diagnosis and treatment. Dermatology Practical & Conceptual, 10(3), e2020066. doi:10.5826/dpc.1003a66

[14]. Tan, G., Liu, Y., & Sridevi, S. (2023). Computer vision on image recognition technology. In B. J. Jansen, Q. Zhou, & J. Ye (Eds.), Proceedings of the 2nd International Conference on Cognitive Based Information Processing and Applications (CIPA 2022) (pp. 994-999). Springer. doi:10.1007/978-981-19-9373-2_6

[15]. Andreopoulos, A., & Tsotsos, J. K. (2013). 50 years of object recognition: Directions forward. Computer Vision and Image Understanding, 117(8), 827-891. doi:10.1016/j.cviu.2013.04.005

[16]. Vaz, J. M., & Balaji, S. (2021). Convolutional neural networks (CNNs): Concepts and applications in pharmacogenomics. Molecular Diversity, 25(3), 1569–1584. doi:10.1007/s11030-021-10225-3

[17]. Hentschel, C., Wiradarma, T. P., & Sack, H. (2016). Fine tuning CNNs with scarce training data: Adapting imagenet to art epoch classification. In 2016 IEEE International Conference on Image Processing (ICIP) (pp. 3693-3697). Phoenix, AZ, USA. doi:10.1109/ICIP.2016.7533049

[18]. Han, X., Zhang, Z., Ding, N., Gu, Y., Liu, X., Huo, Y., Qiu, J., Yao, Y., Zhang, A., Zhang, L., Han, W., Huang, M., Jin, Q., Lan, Y., Liu, Y., Liu, Z., Lu, Z., Qiu, X., Song, R., Tang, J., Yuan, J., Zhao, W. X., & Zhu, J. (2021). Pre-trained models: Past, present and future. AI Open, 2, 225-250. doi:10.1016/j.aiopen.2021.08.002

[19]. Shaha, M., & Pawar, M. (2018). Transfer learning for image classification. In 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA) (pp. 656-660). Coimbatore, India. doi:10.1109/ICECA.2018.8474802

[20]. Li, Y. (2022). Research and application of deep learning in image recognition. In 2022 IEEE 2nd International Conference on Power, Electronics and Computer Applications (ICPECA) (pp. 994-999). Shenyang, China. doi:10.1109/ICPECA53709.2022.9718847

[21]. Li, L. (2020). Application of deep learning in image recognition. Journal of Physics: Conference Series, 1693, 012128. https://doi.org/10.1088/1742-6596/1693/1/012128

[22]. Qureshi, A. S., & Roos, T. (2021). Transfer learning with ensembles of deep neural networks for skin cancer detection in imbalanced data. arXiv Preprint arXiv:2103.12068. https://doi.org/10.48550/arXiv.2103.12068

[23]. Goyal, M., Knackstedt, T., Yan, S., & Hassanpour, S. (2020). Artificial intelligence-based image classification methods for diagnosis of skin cancer: Challenges and opportunities. Computers in Biology and Medicine, 127, 104065. https://doi.org/10.1016/j.compbiomed.2020.104065

[24]. Esteva, A., Kuprel, B., Novoa, R., et al. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542, 115–118. https://doi.org/10.1038/nature21056

[25]. Liu, C. F., Hsu, J., Xu, X., et al. (2021). Deep learning-based detection and segmentation of diffusion abnormalities in acute ischemic stroke. Communications Medicine, 1, 61. https://doi.org/10.1038/s43856-021-00062-8

[26]. Goyal, M., Oakley, A., Bansal, P., Dancey, D., & Yap, M. H. (2019). Skin lesion segmentation in dermoscopic images with ensemble deep learning methods. IEEE Access, 8, 4171-4181. https://doi.org/10.1109/ACCESS.2019.2960504

[27]. Das, K., Cockerell, C. J., Patil, A., Pietkiewicz, P., Giulini, M., Grabbe, S., & Goldust, M. (2021). Machine learning and its application in skin cancer. International Journal of Environmental Research and Public Health, 18(24), 13409. https://doi.org/10.3390/ijerph182413409

[28]. Guo, P., Luo, Y., Mai, G., Zhang, M., Wang, G., Zhao, M., Gao, L., Li, F., & Zhou, F. (2014). Gene expression profile-based classification models of psoriasis. Genomics, 103(1), 48–55. https://doi.org/10.1016/j.ygeno.2013.11.001

[29]. Abhishek, K., Kawahara, J., & Hamarneh, G. (2021). Predicting the clinical management of skin lesions using deep learning. Scientific Reports, 11, 7769. https://doi.org/10.1038/s41598-021-87064-7

[30]. Chan, S., Reddy, V., Myers, B., Thibodeaux, Q., Brownstone, N., & Liao, W. (2020). Machine learning in dermatology: Current applications, opportunities, and limitations. Dermatology and Therapy, 10(3), 365–386. https://doi.org/10.1007/s13555-020-00372-0

[31]. Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248-255). Miami, FL.

[32]. Mendonça, T., Ferreira, P. M., Marques, J. S., Marcal, A. R. S., & Rozeira, J. (2013). PH2 – A dermoscopic image database for research and benchmarking. In 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 5437-5440). Osaka.

[33]. Tschandl, P., Rosendahl, C., & Kittler, H. (2018). The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data, 5, 180161. https://doi.org/10.1038/sdata.2018.161

[34]. Rotemberg, V., Kurtansky, N., Betz-Stablein, B., Wong, K., Avraham, T., & Haedersdal, M. (2021). A patient-centric dataset of images and metadata for identifying melanomas using clinical context. Scientific Data, 8, 34. https://doi.org/10.1038/s41597-021-00823-7

[35]. Shetty, B., Fernandes, R., Rodrigues, A. P., et al. (2022). Skin lesion classification of dermoscopic images using machine learning and convolutional neural network. Scientific Reports, 12, 18134. https://doi.org/10.1038/s41598-022-22644-9

[36]. Salama, W. M., & Aly, M. H. (2021). Deep learning design for benign and malignant classification of skin lesions: A new approach. Multimedia Tools and Applications, 80, 26795–26811. https://doi.org/10.1007/s11042-021-11000-0

[37]. Martin-Barragan, B., Lillo, R., & Romo, J. (2014). Interpretable support vector machines for functional data. European Journal of Operational Research, 232(1), 146–155. https://doi.org/10.1016/j.ejor.2012.08.017

[38]. Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L., & Lopez, A. (2020). A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing, 408, 189–215. https://doi.org/10.1016/j.neucom.2019.10.118

[39]. Sheth, V., Tripathi, U., & Sharma, A. (2022). A comparative analysis of machine learning algorithms for classification purpose. Procedia Computer Science, 215, 422–431. https://doi.org/10.1016/j.procs.2022.12.044

[40]. Dildar, M., Akram, S., Irfan, M., Khan, H. U., Ramzan, M., Mahmood, A. R., Alsaiari, S. A., Saeed, A. H. M., Alraddadi, M. O., & Mahnashi, M. H. (2021). Skin cancer detection: A review using deep learning techniques. International Journal of Environmental Research and Public Health, 18(10), 5479. https://doi.org/10.3390/ijerph18105479

[41]. Juan, C. K., Su, Y. H., Wu, C. Y., et al. (2023). Deep convolutional neural network with fusion strategy for skin cancer recognition: Model development and validation. Scientific Reports, 13, 17087. https://doi.org/10.1038/s41598-023-42693-y

[42]. Fenza, G., Gallo, M., Loia, V., Orciuoli, F., & Herrera-Viedma, E. (2021). Data set quality in machine learning: Consistency measure based on group decision making. Applied Soft Computing, 106, 107366. https://doi.org/10.1016/j.asoc.2021.107366

[43]. Rezk, E., Eltorki, M., & El-Dakhakhni, W. (2023). Interpretable skin cancer classification based on incremental domain knowledge learning. Journal of Healthcare Informatics Research, 7, 59–83. https://doi.org/10.1007/s41666-023-00127-4

[44]. Avanija, J., Mohan Reddy, C. C., Chandan Reddy, C. S., Harshavardhan Reddy, D., Narasimhulu, T., & Hardhik, N. V. (2023). Skin cancer detection using ensemble learning. In 2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS) (pp. 184–189). https://doi.org/10.1109/ICSCSS57650.2023.10169747

[45]. Mohammed, A., & Kora, R. (2023). A comprehensive review on ensemble deep learning: Opportunities and challenges. Journal of King Saud University - Computer and Information Sciences, 35(2), 757–774. https://doi.org/10.1016/j.jksuci.2023.01.014

[46]. Zhang, X., Wang, S., Liu, J., et al. (2018). Towards improving diagnosis of skin diseases by combining deep neural network and human knowledge. BMC Medical Informatics and Decision Making, 18(Suppl. 2), 59. https://doi.org/10.1186/s12911-018-0631-9

[47]. He, X., Wang, Y., Zhao, S., & Chen, X. (2023). Co-attention fusion network for multimodal skin cancer diagnosis. Pattern Recognition, 133, 108990. https://doi.org/10.1016/j.patcog.2022.108990

[48]. Castelvecchi, D. (2016). Can we open the black box of AI? Nature, 538, 20–23. https://doi.org/10.1038/538020a

[49]. Rosemann, A., & Zhang, X. (2022). Exploring the social, ethical, legal, and responsibility dimensions of artificial intelligence for health – a new column in intelligent medicine. Intelligent Medicine, 2(2), 103–109. https://doi.org/10.1016/j.imed.2021.12.002

[50]. Bischl, B., Binder, M., Lang, M., et al. (2021). Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 13(1), e1402.

[51]. Li, L., Zhang, Q., Ding, Y., Jiang, H., Thiers, B. H., & Wang, J. Z. (2014). Automatic diagnosis of melanoma using machine learning methods on a spectroscopic system. BMC Medical Imaging, 14(1), 36. https://doi.org/10.1186/1471-2342-14-36

[52]. Wen, D., Khan, S. M., Xu, A. J., et al. (2022). Characteristics of publicly available skin cancer image datasets: A systematic review. The Lancet Digital Health, 4(1), e64–e74. https://doi.org/10.1016/S2589-7500(21)00252-1

[53]. World Health Organization. (2020). PRESS RELEASE No. 292. Retrieved June 27, 2023, from https://www.iarc.who.int/wp-content/uploads/2020/12/pr292_E.pdf

[54]. OncoCare. (2023). Skin cancer - OncoCare Cancer Centre. Retrieved from https://oncocare.sg/zh-hans/%E7%9A%AE%E8%82%A4%E7%99%8C/

[55]. American Cancer Society. (2023). Basal and squamous cell skin cancer. Retrieved August 6, 2023, from https://www.cancer.org/cancer/types/basal-and-squamous-cell-skin-cancer.html

[56]. American Cancer Society. (2023). What is melanoma skin cancer. Retrieved August 6, 2023, from https://www.cancer.org/cancer/types/melanoma-skin-cancer/about/what-is-melanoma.html

[57]. American Cancer Society. (2023). Treating melanoma skin cancer. Retrieved August 6, 2023, from https://www.cancer.org/cancer/types/melanoma-skin-cancer/treating.html

[58]. Ruder, S. (2017). Transfer learning - Machine learning's next frontier. Retrieved July 16, 2023, from http://ruder.io/transfer-learning/

[59]. Brownlee, J. (2019). Overfitting and underfitting with machine learning algorithms. Machine Learning Mastery. Retrieved June 15, 2023, from https://machinelearningmastery.com/overfitting-and-underfitting-with-machine-learning-algorithms/

[60]. Zumel, N. (2022). Real-time data in machine learning: Challenges and solutions. DataVersity. Retrieved from https://www.dataversity.net/real-time-data-in-machine-learning-challenges-and-solutions/

[61]. Gordon, R. (2022). Seeing into the future: Personalized cancer screening with artificial intelligence. MIT CSAIL. Retrieved June 12, 2023, from https://news.mit.edu/2022/seeing-future-personalized-cancer-screening-artificial-intelligence-0121

Cite this article

Chen, S. (2024). Exploring the application of machine learning for skin cancer image identification. Advances in Engineering Innovation, 11, 1–14.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Journal: Advances in Engineering Innovation

Volume number: Vol. 11

ISSN: 2977-3903 (Print) / 2977-3911 (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the series' published version of the work (e.g., posting it to an institutional repository or publishing it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as this can lead to productive exchanges as well as earlier and greater citation of published work (see the Open access policy for details).