Single Cell Type Prediction from Gene Profiles - An Overview of Different Computational Methods

Research Article
Open access

Jiaxun Li 1*, Xinyu Liu 2, Kaijie Xu 3
  • 1 Department of Electrical Engineering & Computer Sciences (EECS)
  • 2 Steinhardt School of Culture, Education and Development, New York University
  • 3 College of Letters & Science (L&S), University of California
  • *corresponding author jiaxun1218@berkeley.edu
Published on 22 March 2023 | https://doi.org/10.54254/2755-2721/2/20220532
ACE Vol.2
ISSN (Print): 2755-273X
ISSN (Online): 2755-2721
ISBN (Print): 978-1-915371-19-5
ISBN (Online): 978-1-915371-20-1

Abstract

Single-cell RNA sequencing (scRNA-seq) has proven useful for the quantitative analysis of mRNA: it measures the gene expression profile of individual cells and helps identify rare cell populations. Successful scRNA-seq analysis can advance knowledge of cancer cells and tumorigenesis, improving the ability to identify biomarkers and detect an individual's disease susceptibility. This work first reduced the dimensionality of the data using naive principal component analysis. Several classification algorithms, including support vector machine, random forest, boosting, and neural networks, were then examined, with the best hyperparameters determined by grid search. Prediction accuracy was compared across methods at data dimensionalities N=83 and N=127, and the support vector machine achieved the highest testing accuracy of 53.52%. This relatively high prediction accuracy, attributed to the support vector machine's ability to regularize high-dimensional data, enables better characterization of single-cell gene expression profiles. Deeper architectures and the use of Bayesian optimization may further enable efficient analysis of larger datasets with better classification accuracy.
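
The pipeline described in the abstract (principal component analysis followed by a classifier tuned by grid search) could be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data from `make_classification`, the number of classes, the train/test split, and the values in `param_grid` are all assumptions made for this example. Only the two dimensionalities (83 and 127) and the choice of a support vector machine come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for a cells-by-genes expression matrix (assumption:
# the real dataset is not available here, so shapes and labels are made up).
X, y = make_classification(
    n_samples=400, n_features=300, n_informative=40,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# PCA for dimensionality reduction, then a support vector classifier.
pipeline = Pipeline([
    ("pca", PCA(random_state=0)),
    ("svm", SVC()),
])

# Hypothetical search grid: the two dimensionalities named in the abstract,
# plus illustrative SVM hyperparameters that are not taken from the paper.
param_grid = {
    "pca__n_components": [83, 127],
    "svm__C": [0.1, 1, 10],
    "svm__kernel": ["linear", "rbf"],
}

# Exhaustive grid search with 3-fold cross-validation on the training set.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Accuracy of the best configuration on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"test accuracy: {test_accuracy:.4f}")
```

Putting the PCA step inside the `Pipeline` means the projection is refitted on each cross-validation training fold, so the held-out fold does not leak into the dimensionality reduction. The other classifiers mentioned (random forest, boosting, neural networks) could be swapped in for the `"svm"` step with their own grids.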

Keywords:

scRNA-seq, Dimensionality Reduction, Neural Network, Support-vector Machine, Ensemble Learning

Li, J.; Liu, X.; Xu, K. (2023). Single Cell Type Prediction from Gene Profiles - An Overview of Different Computational Methods. Applied and Computational Engineering, 2, 765-773.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of the 4th International Conference on Computing and Data Science (CONF-CDS 2022)

ISBN: 978-1-915371-19-5 (Print) / 978-1-915371-20-1 (Online)
Editor: Alan Wang
Conference website: https://www.confcds.org/
Conference date: 16 July 2022
Series: Applied and Computational Engineering
Volume number: Vol.2
ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).
