Research Article
Open access
Published on 27 August 2024
Zhan, B. (2024). Application and investigation of knowledge graph in biomedical field. Applied and Computational Engineering, 88, 81-87.

Application and investigation of knowledge graph in biomedical field

Boyi Zhan *,1
  • 1 Safety Science and Engineering, Southwest Jiaotong University, Chengdu, 611730, China

* Author to whom correspondence should be addressed.

https://doi.org/10.54254/2755-2721/88/20241572

Abstract

A large body of knowledge about drugs and diseases is scattered across unstructured literature, which poses significant challenges for biomedical text mining. These challenges include handling specialized domain knowledge, integrating related knowledge, and disambiguating words that carry different meanings. Constructing a biomedical knowledge graph addresses these challenges: it can substantially reduce the expert effort required and make more efficient use of the medical literature. This review summarizes the methods used to construct biomedical knowledge graphs. It also outlines recent models and frameworks, such as BioBERT and LSTM+CRF, and highlights their contributions and applications. In addition, it identifies limitations of current biomedical knowledge graphs, such as scalability issues and the need for large annotated datasets. To address these limitations, it proposes using Apache Spark to improve processing capacity and transfer learning to improve model performance and adaptability across diverse biomedical contexts.
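As a rough illustration of the integration and disambiguation problem described above, the following minimal sketch merges extracted (head, relation, tail) mentions into one graph by mapping synonyms to a canonical identifier. It is not the method of any system surveyed in the review: the alias table, the identifier scheme (`drug:`, `disease:`, `unresolved:`), and the function names `normalize` and `build_graph` are all hypothetical, and a real pipeline would obtain mentions from a trained extraction model rather than a hand-written list.

```python
from collections import defaultdict

# Hypothetical alias table: maps surface forms seen in text to one canonical ID.
# A real system would draw on a curated vocabulary; these entries are illustrative.
ALIASES = {
    "aspirin": "drug:aspirin",
    "acetylsalicylic acid": "drug:aspirin",
    "asa": "drug:aspirin",
    "headache": "disease:headache",
    "cephalalgia": "disease:headache",
}


def normalize(mention: str) -> str:
    """Map a raw mention to its canonical ID; unknown mentions are marked unresolved."""
    key = mention.strip().lower()
    return ALIASES.get(key, f"unresolved:{key}")


def build_graph(raw_triples):
    """Merge (head, relation, tail) mentions into an adjacency map keyed by canonical head."""
    graph = defaultdict(set)
    for head, relation, tail in raw_triples:
        graph[normalize(head)].add((relation, normalize(tail)))
    return graph


# Three mentions that refer to the same drug under different names.
raw_triples = [
    ("Aspirin", "treats", "headache"),
    ("acetylsalicylic acid", "treats", "Cephalalgia"),
    ("ASA", "interacts_with", "warfarin"),
]
graph = build_graph(raw_triples)

for head, edges in graph.items():
    for relation, tail in sorted(edges):
        print(head, relation, tail)
```

Because all three mentions resolve to the same node, the two "treats" statements collapse into a single edge, while the mention with no alias entry is kept but flagged as unresolved instead of being silently dropped.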

Keywords

Big data, biomedical knowledge map, information extraction

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of the 6th International Conference on Computing and Data Science

Conference website: https://2024.confcds.org/
ISBN: 978-1-83558-603-7 (Print) / 978-1-83558-604-4 (Online)
Conference date: 12 September 2024
Editors: Alan Wang, Roman Bauer
Series: Applied and Computational Engineering
Volume number: Vol. 88
ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.