Diagnosis of Alzheimer’s Disease: Machine Learning and Deep Learning Approaches

Jiali Liu

doi:10.54254/2755-2721/2025.TJ23770

1. Introduction

Alzheimer’s disease (AD) is a progressive neurodegenerative disease that mainly affects people in old age and in the pre-senile period. Its characteristic pathology is atrophy of the cerebral cortex accompanied by β-amyloid deposition, neurofibrillary tangles, a substantial loss of memory-related neurons, and the formation of senile plaques. The causes of AD are not yet fully understood, so there is no feasible and effective treatment that addresses the core of the disease, while the patient population is large and continues to grow. Research on pathological samples of AD is therefore important for making progress on the disease.

Before deep learning (DL) became popular, traditional machine learning (ML) methods such as Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN) were widely applied to image recognition and classification tasks, but they relied on manual feature extraction. In contrast, DL algorithms, represented by Convolutional Neural Networks (CNNs), learn image features automatically, which reduces the workload of manual feature extraction. At present, Magnetic Resonance Imaging (MRI) image classification based on DL algorithms is an important means of diagnosing AD. Improved ML algorithms, such as XGBoost, can also be used to build prediction models for patients with early AD.

However, applying DL and ML to AD research still raises open problems: how to uncover the underlying patterns and mechanisms in massive, highly heterogeneous biological data; how best to integrate data from different modalities or different cohort studies; and how to explain the results of DL models. These problems constrain further research on AD.

This article compares and summarizes the current state of research in these areas in order to clarify the value of applying ML and DL to the study of AD.

2. Basic theory

Age is considered one of the most important demographic risk factors for AD [1]. On the one hand, this may be related to sleep disorders, since sleep time tends to decrease with age; on the other hand, ageing is accompanied by the destruction of myelin and the loss of brain cells, which increases the probability of AD [2]. Other diseases are also important factors that can trigger the onset of AD. Some studies indicate that metals such as aluminum, zinc, mercury, copper, manganese, cadmium, and magnesium are risk factors for AD [3], and research on aluminum suggests that it has the largest influence [4]. In addition, traumatic brain injury disrupts the blood-brain barrier, and the resulting leakage of plasma proteins sensitizes the immune system to brain antigens that the barrier normally isolates, thereby increasing the probability of onset of AD [5].

According to current standards, the diagnostic basis for AD can be divided into three categories: mental health screening, imaging screening, and biomarker screening. Mental health screening currently relies on instruments such as the Montreal Cognitive Assessment (MoCA), the Mini-Mental State Examination (MMSE), and the Neuropsychiatric Inventory (NPI). The MMSE is the most widely used at present, but it is affected by many factors, such as level of education and cultural background, and it is insensitive to early-stage AD. For imaging screening, the examinations required in current diagnosis and treatment plans are structural MRI and Computed Tomography (CT), with molecular imaging such as β-amyloid PET (Aβ-PET) at the core.

In research settings, functional imaging of AD patients relies on functional MRI (fMRI) and diffusion tensor imaging (DTI). Finally, for biomarker screening, the most mainstream approach is the NIA-AA 2018 research framework, and 75% of global AD clinical trials apply its ATN biomarker classification system. In addition to the screening methods described above, newer diagnostic methods are emerging, such as blood biomarker screening represented by plasma p-tau217 [6].
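The ATN classification mentioned above can be illustrated with a small sketch. It assumes that each biomarker group (A for β-amyloid, T for pathologic tau, N for neurodegeneration) has already been dichotomized as positive or negative; the cutoffs for positivity are assay-specific and are not modeled here. The function names are hypothetical, and the category labels follow the biomarker profile table of the NIA-AA 2018 research framework.

```python
def atn_profile(a: bool, t: bool, n: bool) -> str:
    """Format a biomarker profile string such as 'A+T-(N)+'."""
    sign = lambda positive: "+" if positive else "-"
    return f"A{sign(a)}T{sign(t)}(N){sign(n)}"


def atn_category(a: bool, t: bool, n: bool) -> str:
    """Map a dichotomized ATN profile to its biomarker category.

    Inputs are assumed to be determined upstream (e.g. from PET, CSF,
    or MRI measures); this function only encodes the lookup logic.
    """
    if not a:
        # Without amyloid positivity the profile is off the Alzheimer's continuum.
        if not t and not n:
            return "Normal AD biomarkers"
        return "Non-AD pathologic change"
    if t:
        # A+T+ counts as Alzheimer's disease regardless of (N).
        return "Alzheimer's disease"
    if n:
        return ("Alzheimer's and concomitant suspected "
                "non-Alzheimer's pathologic change")
    return "Alzheimer's pathologic change"


# Hypothetical example: amyloid-positive, tau-negative, neurodegeneration-negative.
print(atn_profile(True, False, False), "->", atn_category(True, False, False))
```

Keeping the dichotomization step separate from the lookup makes it easy to swap in different assay cutoffs without touching the category logic.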

3. Early diagnosis methods

3.1. Methods based on machine learning

Before the emergence and rise of deep learning, ML techniques such as SVM played a significant role in handling small-sample medical datasets, particularly in early image recognition and diagnosis of AD. Although early SVM-based methods performed well on small datasets, they exhibit weak generalization ability, poor interpretability, and a reliance on manual feature processing. Dhiya Al-Jumeily and colleagues used an SVM model and achieved accuracies of 98.9% in binary classification (distinguishing normal controls (NC) from AD) and 90.7% in multi-class classification [7].

The core idea of RF-based AD algorithms is to integrate multiple decision trees, thereby improving classification performance and making it possible to rank features by importance. Early auxiliary diagnostic methods for AD based on RF are highly robust, which reduces the risk of overfitting. RF also supports multimodal data fusion, allowing heterogeneous data types such as MRI and apolipoprotein E (APOE) to be processed together, and it can rank features by importance, for example by outputting Gini importance scores. However, issues remain in clinical practice, such as data standardization, real-time requirements, and regulatory approval. Research by Jianfeng Feng and others indicates that the four proteins GFAP, NEFL, GDF15, and LTBP2 can advance the early diagnosis of AD by 15 years. Among these, the screening model that combines random forest with GFAP achieves a sensitivity of 92% and can advance the diagnosis of AD by 5.2 years [8].

AD diagnosis methods based on logistic regression are highly interpretable and computationally efficient. They are also suitable for small-sample data, provide probabilistic output, and lend themselves to visualization. Their main issues are threefold: insufficient capture of nonlinear relationships, weak ability to fit multimodal data, and limited longitudinal predictive capability. Steffen Flessa and colleagues established a screening model combining logistic regression with plasma p-tau217, which demonstrated 96% sensitivity and reduced the misdiagnosis rate of AD by 41% in community screening settings [9].
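The probabilistic output and interpretability of logistic regression can be illustrated with a minimal sketch using only the Python standard library. The single standardized biomarker feature, the labels, the learning rate, and all function names are hypothetical and chosen for illustration; this is not the screening model reported in [9].

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Return the estimated probability of the positive class for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical, already standardized values of one plasma biomarker;
# label 1 marks the positive (AD) class, label 0 the control class.
X = [[-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
# The sign and size of each weight are directly readable: exp(w[0]) is the
# odds ratio associated with a one-unit increase in the feature.
print("weight:", w[0], "odds ratio:", math.exp(w[0]))
print("P(AD | x = 1.2):", predict_proba(w, b, [1.2]))
```

Because the model returns a probability rather than a hard label, the decision threshold can be moved to trade sensitivity against specificity, which is one reason this family of models suits screening settings.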

The core advantages of early auxiliary diagnostic methods for AD based on clustering algorithms lie in their unsupervised discovery of subtypes, suitability for small-sample datasets, and interpretable outputs. For instance, in the optimization of biomarker combinations, hierarchical clustering can automatically screen for the best combination of biomarkers. Few-shot adaptation enhanced through manifold learning further strengthens their applicability to small sample sizes. However, three major defects remain: limited stability of the results, dynamic changes in biomarkers, and conflicts with current diagnostic standards, so manual review is still required. Janani et al. developed a novel data interpretation method that identifies the best-performing features learned by deep learning models through clustering and perturbation analysis [10].

The core idea of early auxiliary diagnosis algorithms for AD based on Principal Component Analysis (PCA) lies in extracting the principal components of the data, thereby significantly enhancing the model's efficiency and interpretability. In the process of diagnosing AD, this approach offers advantages such as compressing the data by more than four orders of magnitude, eliminating multicollinearity between cognitive scales and imaging features, and providing a high level of visualization. However, it faces three major issues: the loss of nonlinear relationships, the unclear clinical significance of the components, and overfitting with small sample sizes [10].
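How PCA removes multicollinearity can be sketched in a few lines of standard-library Python. The example below uses a hypothetical table with two strongly correlated features and one nearly constant feature, and finds the first principal component by power iteration on the covariance matrix. The data and function names are assumptions for illustration; a production pipeline would typically use a numerical library instead.

```python
import math


def center(data):
    """Subtract the column mean from every feature."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]


def covariance(centered):
    """Sample covariance matrix of already centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of a covariance matrix by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


# Hypothetical data: features 0 and 1 are almost perfectly collinear,
# feature 2 barely varies.
data = [
    [2.0, 4.1, 1.0],
    [3.0, 6.0, 0.9],
    [4.0, 7.9, 1.1],
    [5.0, 10.1, 1.0],
    [6.0, 12.0, 0.9],
]

centered = center(data)
cov = covariance(centered)
pc1, eigenvalue = first_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = eigenvalue / total_variance

# Project each sample onto the first component: three features become one score.
scores = [sum(x * v for x, v in zip(row, pc1)) for row in centered]
print("PC1 direction:", pc1)
print("explained variance ratio:", explained_ratio)
```

In this toy table a single component carries nearly all of the variance, so the two collinear features collapse into one score. The same sketch also shows the interpretability issue noted above: the component is a weighted mix of the original features and has no direct clinical meaning by itself.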

3.2. Methods based on deep learning

The CNN-based early auxiliary diagnostic method for AD overcomes the limitations of traditional manual feature extraction by extracting features fully automatically. Its advantages include the ability to capture fine-grained detail and to fuse multimodal features, and it can analyze intricate pathological correlations in depth through high-dimensional nonlinear modeling. A CNN can identify functional connectivity abnormalities in the default mode network and visual cortex that traditional statistical methods cannot detect, and these abnormalities can serve as diagnostic biomarkers for early AD. Furthermore, it shows particular strength in small-sample training scenarios through strategies such as transfer learning. Janani et al. demonstrated that deep models outperform shallow models by employing a Stacked Denoising Autoencoder (SDAE) to extract features from clinical and genetic data, combined with a 3D Convolutional Neural Network (CNN) for imaging data. Their study also showed that integrating multimodal data achieves superior performance over single-modality models in terms of accuracy, precision, recall, and mean F1-score [10].
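The evaluation metrics named above can be made concrete with a short sketch. The labels and predictions below are hypothetical and do not reproduce any result from [10], and the "mean F1-score" is interpreted here, as an assumption, as the unweighted (macro) average of the per-class F1 scores.

```python
def per_class_scores(y_true, y_pred, label):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def evaluate(y_true, y_pred):
    """Return overall accuracy, per-class scores, and macro-averaged F1."""
    labels = sorted(set(y_true))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    scores = {label: per_class_scores(y_true, y_pred, label) for label in labels}
    macro_f1 = sum(s[2] for s in scores.values()) / len(labels)
    return accuracy, scores, macro_f1


# Hypothetical three-class example: normal control (NC),
# mild cognitive impairment (MCI), and AD.
y_true = ["NC", "NC", "NC", "MCI", "MCI", "MCI", "AD", "AD", "AD", "AD"]
y_pred = ["NC", "NC", "MCI", "MCI", "MCI", "AD", "AD", "AD", "AD", "NC"]

accuracy, scores, macro_f1 = evaluate(y_true, y_pred)
print("accuracy:", accuracy)
for label, (precision, recall, f1) in scores.items():
    print(label, "precision:", precision, "recall:", recall, "F1:", f1)
print("macro F1:", macro_f1)
```

Recall for the AD class is the same quantity as the sensitivity figures quoted for screening models, so reporting it per class shows whether a model with high overall accuracy still misses early cases.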

4. Current limitations and future prospects

4.1. Technology and data

When existing ML and DL technologies are integrated into early-stage AD diagnosis assistance, the main issues are data-related and algorithmic. On the data side, the challenges include insufficient sample size, the risk of multimodal model failure, and erroneous annotations in clinical diagnosis; an insufficient sample size in particular degrades model performance. On the algorithmic side, the challenges include poor adaptability in few-shot learning scenarios.

4.2. Trends

In the future, the integration of DL with early diagnosis of AD may involve combining DL with MRI results and biomarkers such as blood tests to identify early signs of AD that traditional methods cannot capture. Additionally, temporal DL networks could be used to analyze patients' long-term follow-up data to predict the progression of the disease. For example, in medical image analysis, future work could combine amyloid and tau PET scans and then use spatiotemporal feature extraction networks to quantify abnormal deposition patterns. Furthermore, dynamic risk assessment methods, such as introducing causal models to distinguish the confounding effects of AD-related variables from its derived conditions, could enhance the specificity of predictions. Future developments may also include hospital-assisted diagnostic systems, such as multimodal models that combine Aβ-PET, MRI, and blood biomarkers into a tripartite generative model, thereby reducing the time needed for diagnosis.

5. Conclusions

The use of machine learning and deep learning to assist in the early diagnosis of Alzheimer's disease has become increasingly popular. Compared to traditional methods that rely solely on manual diagnosis, these approaches have shown significant improvements in accuracy, standardization, time efficiency, and cost-effectiveness. This paper reviews the applicability of machine learning and deep learning and discusses their respective advantages and disadvantages.

Despite the promising results, challenges such as limited data availability, the interpretability of models, and the variability of patient conditions remain significant obstacles to clinical adoption. Furthermore, deep learning models often require large, high-quality datasets, which are difficult to obtain in the medical field due to privacy and ethical concerns.

Future research should focus on improving data integration methods, enhancing model transparency, and promoting interdisciplinary collaboration between medical professionals and AI researchers. By addressing these challenges, intelligent diagnostic systems can become more reliable and practical for real-world clinical use. Ultimately, this work aims to provide a theoretical foundation and reference for the continued development of AI-assisted early diagnosis technologies for Alzheimer's disease.

References

[1]. Henderson, A. S. (1988). The risk factors for Alzheimer's disease: a review and a hypothesis. Acta Psychiatrica Scandinavica, 78(3), 257-275.

[2]. Yang Xiaomin, Bao Tianhao, & Ruan Ye. (2020). Overview of risk factors for Alzheimer's disease. Sichuan Mental Health, 33(6), 560–565.

[3]. Hsu, H. W., Bondy, S. C., & Kitazawa, M. (2018). Environmental and dietary exposure to copper and its cellular mechanisms linking to Alzheimer’s disease. Toxicological Sciences, 163(2), 338-345.

[4]. Peters, S., Reid, A., Fritschi, L., De Klerk, N., & Musk, A. B. (2013). Long-term effects of aluminium dust inhalation. Occupational and Environmental Medicine, 70(12), 864-868.

[5]. Armstrong, R. A. (2019). Risk factors for Alzheimer's disease. Folia Neuropathologica, 57(2), 87-105. doi: 10.5114/fn.2019.85929.

[6]. Warmenhoven, N., Salvadó, G., Janelidze, S., Mattsson-Carlgren, N., Bali, D., Orduña Dolado, A., ... & Hansson, O. (2025). A comprehensive head-to-head comparison of key plasma phosphorylated tau 217 biomarker tests. Brain, 148(2), 416-431.

[7]. Alatrany, A. S., Khan, W., Hussain, A., Kolivand, H., & Al-Jumeily, D. (2024). An explainable machine learning approach for Alzheimer’s disease classification. Scientific Reports, 14(1), 2637.

[8]. Guo, Y., You, J., Zhang, Y., Liu, W. S., Huang, Y. Y., Zhang, Y. R., ... & Yu, J. T. (2024). Plasma proteomic profiles predict future dementia in healthy adults. Nature Aging, 4(2), 247-260.

[9]. Scharf, A., Michalowsky, B., Rädke, A., Kleinke, F., Schade, S., Platen, M., ... & Hoffmann, W. (2025). Identifying and Addressing Unmet Needs in Dementia: The Role of Care Access and Psychosocial Support. International Journal of Geriatric Psychiatry, 40(4), e70066.

[10]. Venugopalan, J., Tong, L., Hassanzadeh, H. R., & Wang, M. D. (2021). Multimodal deep learning models for early detection of Alzheimer’s disease stage. Scientific Reports, 11(1), 3254.

Cite this article

Liu, J. (2025). Diagnosis of Alzheimer’s Disease: Machine Learning and Deep Learning Approaches. Applied and Computational Engineering, 166, 1-5.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of CONF-SEML 2025 Symposium: Machine Learning Theory and Applications

ISBN: 978-1-80590-177-8 (Print) / 978-1-80590-178-5 (Online)

Editor: Hui-Rang Hou

Conference website: https://2025.confseml.org/tianjin.html

Conference date: 18 May 2025

Series: Applied and Computational Engineering

Volume number: Vol.166

ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.