
A CNN-Ensemble-Based Neural Network for Enhanced Classification of Single-Cell Bone Marrow Mononuclear Cell Types
1 Palo Alto High School
* Author to whom correspondence should be addressed.
Abstract
Single-cell RNA sequencing (scRNA-seq) offers an exceptional opportunity to uncover the mechanisms underlying complex diseases, such as cancer, at cellular resolution across diverse tissues. Accurate cell type annotation remains a considerable challenge, however, because of inherent sequencing noise and the sparsity of gene expression data. To address these limitations, we developed an ensemble learning-based convolutional neural network (CNN) model designed for the analysis of large-scale scRNA-seq data. In a case study, we applied this model to classify subpopulations of bone marrow mononuclear cells using raw RNA transcript read counts from a dataset comprising 10,032 cells, 13,822 genes, and 14 distinct cell types. We compared our model against other deep learning architectures, including MLP, LSTM, and attention-based networks, as well as their ensemble models. The CNN-based ensemble models consistently outperformed the other networks, achieving their best performance at a precision of 0.9143, an F1 score of 0.9143, and an accuracy of 0.9143, a significant improvement over the competing models. Moreover, UMAP visualization of the classification results highlights the model's ability to distinguish cell types at single-cell resolution. In conclusion, our CNN-based ensemble model classifies bone marrow mononuclear cell types with high efficacy and offers a practical approach to predictive modeling in single-cell data analysis.
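As a rough illustration of how ensemble predictions and the three reported metrics can be computed, the sketch below averages the class probabilities of several models (soft voting) and scores the result. The abstract does not state how the ensemble members are combined or how precision and F1 are averaged across the 14 classes, so soft voting, micro-averaging, the toy data, and the names `soft_vote` and `micro_scores` are all assumptions made for this example, not the author's implementation.

```python
import numpy as np

def soft_vote(prob_list):
    """Average per-model class probabilities, then take the argmax per cell."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=1)

def micro_scores(y_true, y_pred, n_classes):
    """Micro-averaged precision, recall and F1, plus accuracy (single-label case)."""
    tp = fp = fn = 0
    for c in range(n_classes):
        tp += int(np.sum((y_pred == c) & (y_true == c)))
        fp += int(np.sum((y_pred == c) & (y_true != c)))
        fn += int(np.sum((y_pred != c) & (y_true == c)))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = float(np.mean(y_true == y_pred))
    return precision, recall, f1, accuracy

# Toy stand-in for the softmax outputs of three ensemble members (assumed data).
rng = np.random.default_rng(0)
n_cells, n_classes = 200, 14
y_true = rng.integers(0, n_classes, size=n_cells)
member_probs = []
for _ in range(3):
    logits = rng.normal(size=(n_cells, n_classes))
    logits[np.arange(n_cells), y_true] += 2.0  # bias each member toward the true label
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    member_probs.append(exp / exp.sum(axis=1, keepdims=True))

y_pred = soft_vote(member_probs)
precision, recall, f1, accuracy = micro_scores(y_true, y_pred, n_classes)
print(f"precision={precision:.4f} f1={f1:.4f} accuracy={accuracy:.4f}")
```

When every cell receives exactly one label, micro-averaged precision, recall, and F1 all reduce to accuracy, so the three printed values coincide. Macro or weighted averaging would generally give different numbers for each metric.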
Keywords
Single-cell RNA-seq, Cell type identification, Deep learning, Ensemble learning
Cite this article
Shi, C. (2024). A CNN-Ensemble-Based Neural Network for Enhanced Classification of Single-Cell Bone Marrow Mononuclear Cell Types. Applied and Computational Engineering, 115, 200-206.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
About volume
Volume title: Proceedings of the 5th International Conference on Signal Processing and Machine Learning
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.