Leveraging Computational Algorithms for Effective Explicit and Tacit Knowledge Capture: A Hybrid Approach Combining Expert Interviews, Machine Learning, and Data Mining Techniques

Zuowei Li

doi:10.54254/2755-2721/2024.18200

1. Introduction

Knowledge management has become an important part of business strategy, as businesses realize how important it is to store and share knowledge in order to stay competitive. Both explicit knowledge, which can be written down easily, and tacit knowledge, which is harder to articulate, are crucial. Knowledge has traditionally been captured through interviews and questionnaires, yet these techniques usually cannot handle large quantities of high-dimensional data. Machine learning and data mining algorithms have opened new possibilities for collecting knowledge effectively and systematically. Capturing both tacit and explicit knowledge with these technologies remains challenging, however, because algorithmic systems do not always pick up the subtle lessons of human expertise.

This work focuses on combining human input with algorithms to increase the precision and effectiveness of knowledge capture. Machine learning and data mining provide deep analytics, but they do not give a holistic picture of the context-dependent, sophisticated knowledge that experts possess. We therefore build and test an open-source hybrid framework that brings together human know-how and computational algorithms in a more comprehensive knowledge-capture system. This paper aims to: first, determine whether the hybrid approach of expert interviews combined with machine learning and data mining tools can be used to capture explicit and tacit knowledge; second, identify which algorithms make knowledge acquisition easier; and third, determine what the two approaches can contribute to organizational knowledge management. This research adds value to both academia and industry by introducing a systematic model for knowledge capture that is scalable and flexible.

2. Literature Review

2.1. Knowledge Capture Techniques

Knowledge capture studies have long relied on interviews, surveys, and observation as the main ways to capture knowledge in both tacit and explicit form. Interviews with professionals are one of the most powerful means of gathering knowledge, but they are time-consuming and limited by subjectivity. Over the past few years, the field has moved towards automation, with machine learning and data mining as the mainstay. Machine learning can detect patterns and build predictive models that make knowledge capture more scalable, and data mining is useful for finding patterns and correlations in large-scale data sets. But neither approach is sufficient to capture the full scope of knowledge, because explicit and tacit knowledge require different extraction processes [1]. Combining these techniques could provide deeper, more reliable representations of knowledge.

2.2. Machine Learning and Data Mining in Knowledge Management

Machine learning and data mining are widely used in knowledge management because they help to detect patterns, correlations, and insights across large volumes of structured and unstructured data. Within knowledge management, machine learning can capture knowledge by analysing historical data and deriving predictions from it, so that decisions follow the patterns observed in the past. For instance, classification algorithms, including decision trees and SVMs, can categorize explicit knowledge by detecting relationships and patterns in clearly defined data points. Regression algorithms are also efficient at making predictions from historical data over time, which makes them useful for capturing explicit knowledge. Such algorithms enable organizations to learn more about their data, make predictions for the future, and produce insights that aid explicit knowledge storage and access [2].

Unsupervised learning approaches, by contrast, focus on finding patterns in unlabelled or unstructured data and are well suited to tacit knowledge analysis. Clustering algorithms (such as K-means or hierarchical clustering) group data points by their properties, uncovering hidden patterns or commonalities that may not be obvious. For example, clustering can identify customer segments based on behaviour, preferences, and other implicit characteristics, bringing structure to otherwise unstructured data. Another unsupervised learning approach is association analysis, which is particularly useful for finding correlations between variables in large data sets, such as products that are bought together, words that frequently occur in the same text, or related knowledge items. These methods are crucial to discerning tacit knowledge because they reveal hidden associations and contexts that would otherwise remain opaque.
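
As a minimal illustration of the clustering idea described above, the following Python sketch implements plain K-means on a handful of two-dimensional points. The data (monthly visits and average order value for six hypothetical customers), the function names, and the choice of k = 2 are assumptions made for this example and are not taken from the study.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [squared_distance(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def kmeans(points, k, iterations=50, seed=0):
    """Plain K-means: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                # Move the centroid to the mean of its members.
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical customers: (visits per month, average order value).
customers = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # the first three and the last three customers share a label
```

In practice the number of clusters and the feature scaling would need to be chosen with care, since K-means is sensitive to both.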

But adding such algorithms to a knowledge management system poses special challenges. A fundamental concern is data heterogeneity, as knowledge sources often involve both structured data (for example, transaction records) and unstructured data (such as documents or transcripts of interviews). Careful data preprocessing is therefore essential to ensure that the data fed to machine learning models is consistent, current, and readable. This preprocessing step can include data cleaning, normalization, feature extraction, and transformation, so that the model handles the data correctly. Moreover, model choice and hyperparameter tuning are both key to making these algorithms work efficiently across different knowledge management scenarios, since different data structures and business requirements may call for different setups [3].
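
The preprocessing steps named above (cleaning, normalization, feature extraction) can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names, the sample sentence, and the choice of min-max scaling and word counts are assumptions for the example, not the pipeline used in the study.

```python
import re


def clean_text(raw):
    """Cleaning: lower-case, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", raw.lower())
    return re.sub(r"\s+", " ", text).strip()


def min_max_normalize(values):
    """Normalization: rescale numbers to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bag_of_words(text):
    """Feature extraction: count how often each token occurs."""
    counts = {}
    for token in text.split():
        counts[token] = counts.get(token, 0) + 1
    return counts


# Unstructured input, e.g. a fragment of an interview transcript (hypothetical).
cleaned = clean_text("  The pump FAILED -- twice!  ")
print(cleaned)                # the pump failed twice
print(bag_of_words(cleaned))

# Structured input, e.g. a numeric column from a transaction table (hypothetical).
print(min_max_normalize([10, 20, 40]))
```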

2.3. Hybrid Approaches to Knowledge Acquisition

Combining human expertise with computational techniques has become a promising trend. Hybrid approaches recognize that human perspectives, such as those gathered in expert interviews, add context, while algorithms provide scalability and consistency in data mining. Earlier research has reported that hybrid approaches lead to greater decision accuracy by leveraging both qualitative and quantitative data [4]. Expert systems with feedback loops between human intuition and algorithmic computation, for example, have been shown to be more accurate in complex environments. In this article, we extend this line of work by testing the hybrid approach within a framework and exploring its effectiveness for knowledge capture.

3. Methodology

3.1. Data Collection

The two primary data sources of this study were expert interviews and institutional databases. We interviewed 30 experts from a range of fields to gather tacit knowledge that would otherwise be hard to capture. The interviews used both structured and semi-structured formats, leaving time for detailed responses. At the same time, large amounts of explicit knowledge were extracted from organisational databases, including written reports, operating manuals, and records. Integrating these two data sources involved pre-processing through data cleaning, normalization, and tagging to make the data suitable for algorithmic analysis [5].

3.2. Machine Learning Models

For the analysis of explicit knowledge, supervised learning algorithms, such as decision trees and support vector machines (SVM), were applied to classify and predict patterns within structured datasets. In particular, the SVM model used the objective of minimizing the loss function while maximizing the margin between classes. The optimization function for SVM can be formulated as follows:

\( \min_{w,b} \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n} \max\left(0,\, 1 - y_{i}(w \cdot x_{i} + b)\right) \) (1)

where \( w \) is the weight vector, \( b \) is the bias term, \( {y_{i}} \) is the class label of the \( i \)-th instance, \( {x_{i}} \) is its feature vector, \( n \) is the number of training instances, and \( C \) is a regularization parameter that balances margin maximization against error minimization [6]. For tacit knowledge, unsupervised learning methods, specifically clustering and topic modeling, were applied to identify latent themes and underlying relationships within interview transcripts. Each model was trained on 80% of the data, with the remaining 20% reserved for testing, and was evaluated using metrics such as accuracy, precision, and recall to assess classification and clustering quality. Additionally, feature engineering techniques were employed to enhance model interpretability, allowing for a more nuanced analysis of tacit knowledge dimensions [7].
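
To make Equation (1) concrete, the short Python sketch below evaluates the soft-margin objective for a fixed weight vector and bias. It only computes the value of the objective and does not perform the minimization. The three data points, the candidate parameters, and the function name are assumptions chosen for illustration, and the labels are assumed to take values in {-1, +1}, as is standard for this formulation.

```python
def svm_objective(w, b, X, y, C):
    """Value of Equation (1): 0.5 * ||w||^2 + C * sum of hinge losses."""
    margin_term = 0.5 * sum(wj * wj for wj in w)
    hinge_total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b  # w . x_i + b
        hinge_total += max(0.0, 1.0 - yi * score)          # hinge loss of instance i
    return margin_term + C * hinge_total


# Hypothetical toy data with labels in {-1, +1}.
X = [(2.0, 0.0), (0.0, 2.0), (0.5, 0.5)]
y = [1, -1, 1]

# A candidate (not optimized) solution.
w, b, C = (1.0, -1.0), 0.0, 1.0

# Margin term: 0.5 * (1 + 1) = 1.0.
# Hinge losses: 0 for the first two points, 1 for the third (its score is 0).
print(svm_objective(w, b, X, y, C))  # 2.0
```

A larger \( C \) weights the hinge-loss term more heavily, so violations of the margin cost more relative to the width of the margin.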

3.3. Data Mining Techniques

Data mining algorithms such as association rule mining and sequence analysis were employed to detect patterns in the explicit knowledge datasets. These methods brought to light connections that would otherwise have been missed, making the captured knowledge richer. Association rule mining discovered connections between knowledge items, and sequence analysis revealed the order in which knowledge patterns occur. Data mining findings were compared with data from expert interviews to confirm their validity and applicability to organisational knowledge [8].
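
The core quantities behind association rule mining, support and confidence, can be illustrated with a small Python sketch. The four tagged documents, the item names, the thresholds, and the restriction to rules between pairs of items are assumptions made for the example; they are not the datasets or settings used in the study.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return rules lhs -> rhs between single items as (lhs, rhs, support, confidence)."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to be of interest
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found


# Hypothetical knowledge items tagged on four documents.
documents = [
    {"incident_report", "root_cause"},
    {"incident_report", "root_cause", "checklist"},
    {"incident_report", "root_cause"},
    {"checklist"},
]

for lhs, rhs, s, c in pair_rules(documents):
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Here the two items that co-occur in three of the four documents yield a rule in each direction, while pairs involving the checklist fall below the support threshold.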

4. Results and Discussion

4.1. Explicit Knowledge Capture Results

The supervised learning models showed robust performance in capturing explicit knowledge, with notable accuracy levels across different algorithms. As shown in Table 1, the decision tree model achieved an accuracy of 92%, along with a precision of 88% and recall of 90%, highlighting its strong performance in pattern recognition within structured datasets. Similarly, the support vector machine (SVM) model reached an accuracy of 89%, with a precision of 85% and recall of 87%, indicating its effectiveness in classifying explicit knowledge categories [9]. These results underline key trends and categories relevant to organizational processes, which were validated against expert feedback. The high consistency in findings across models supports the feasibility and reliability of using machine learning for explicit knowledge capture in organizational contexts.

Table 1. Performance Metrics for Explicit Knowledge Capture Models

Model	Accuracy	Precision	Recall	F1 Score
Decision Tree	92%	88%	90%	89%
Support Vector Machine (SVM)	89%	85%	87%	86%
Random Forest	91%	87%	89%	88%
Logistic Regression	88%	83%	85%	84%
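
The F1 scores in Table 1 follow from the listed precision and recall values, since F1 is their harmonic mean. The Python sketch below recomputes them from the table; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as listed in Table 1.
table_1 = {
    "Decision Tree": (88, 90, 89),
    "Support Vector Machine (SVM)": (85, 87, 86),
    "Random Forest": (87, 89, 88),
    "Logistic Regression": (83, 85, 84),
}

for model, (precision, recall, reported) in table_1.items():
    computed = f1_score(precision, recall)
    print(f"{model}: computed F1 = {computed:.1f}%, reported = {reported}%")
```

Rounded to whole percentages, the computed values agree with the reported F1 column for all four models.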

4.2. Tacit Knowledge Capture Results

Clustering and topic modelling of the interview data uncovered hidden themes related to organizational culture, decision making, personal knowledge, and other key components. As depicted in Figure 1, clustering identified five main clusters covering distinct knowledge domains, including problem-solving and customer relationship management. Within each cluster, topic modelling added a further layer of detail, identifying sub-themes that highlight specific insights, such as organisational culture and problem-solving expertise. The "Problem-Solving" cluster, for example, had the most sub-themes (12), a sign that the data included a wide variety of problem-solving techniques. This analysis shows that machine learning models can classify tacit knowledge effectively when combined with expert input, and can therefore help make sense of large amounts of unstructured data.

Figure 1. Distribution of Sub-Themes Across Tacit Knowledge Clusters

4.3. Comparative Analysis of Hybrid Approach

Combining expert interviews with computational algorithms proved more effective than either method alone in terms of knowledge depth and precision. As shown in Table 2, the hybrid method classified knowledge with 94% accuracy, 15 percentage points higher than expert interviews alone (79%) and 9 points higher than standalone machine learning models (85%). Data interpretability was reported to improve by about 20%, with the hybrid approach scoring 87% against 65% for expert interviews and 72% for machine learning [10]. Such results suggest that combining computational and human-centred techniques improves both the breadth of knowledge extraction and the ability of organizations to derive useful information and build stronger decision-making models.

Table 2. Comparative Analysis of Hybrid and Individual Knowledge Capture Methods

Method	Accuracy (%)	Depth (%)	Interpretability (%)
Expert Interviews Only	79	70	65
Machine Learning Only	85	78	72
Hybrid Approach (Expert Interviews + Machine Learning)	94	90	87
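
The differences between the rows of Table 2 can be read off directly as percentage-point gains. The following Python sketch computes them from the table values; the dictionary layout and function name are illustrative assumptions.

```python
# Values copied from Table 2 (all in percent).
table_2 = {
    "Expert Interviews Only": {"accuracy": 79, "depth": 70, "interpretability": 65},
    "Machine Learning Only": {"accuracy": 85, "depth": 78, "interpretability": 72},
    "Hybrid Approach": {"accuracy": 94, "depth": 90, "interpretability": 87},
}


def gains_over(baseline, reference="Hybrid Approach", table=table_2):
    """Percentage-point difference between the reference row and a baseline row."""
    return {
        metric: table[reference][metric] - table[baseline][metric]
        for metric in table[reference]
    }


for baseline in ("Expert Interviews Only", "Machine Learning Only"):
    print(baseline, gains_over(baseline))
```

Measured this way, the hybrid approach leads expert interviews by 15, 20, and 22 points on accuracy, depth, and interpretability, and leads standalone machine learning by 9, 12, and 15 points.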

4.4. Practical Implications of the Hybrid Approach

These findings suggest that there is much to be gained from using a hybrid approach to capture knowledge in business contexts where both explicit and tacit knowledge are important. By combining expert interviews with machine learning and data mining, the approach overcomes some of the limitations of traditional knowledge management platforms. In particular, the hybrid model allows larger and more elaborate data sets to be processed while retaining some of the subtle insights that are characteristic of human knowledge. First, the approach changes organisational decision making. Its hybrid format allows the extraction of contextual, tacit knowledge that is often necessary to inform strategic decisions but that cannot be measured or assembled through automated processing alone. For example, the in-depth insights into employees' views, customers, and complex processes that this methodology provides can yield smarter, more adaptive, and more innovative decision-making models. Additionally, the improved interpretability and precision found in this study suggest that knowledge stores built using hybrid techniques are more accessible and practical for all organizational stakeholders, from executives to field workers [11].

Second, the hybrid strategy has important implications for employee training and development. By capturing and storing both formal and informal knowledge well, companies can build more relevant and more comprehensive knowledge bases that help new employees quickly find their way through sophisticated workflows and cultures. This model can also enable continuous learning throughout the organization by enriching knowledge bases with new information from both human professionals and data. The organization thus develops a more flexible and robust workforce that can respond readily to a changing industry landscape.

Lastly, for industries such as healthcare, finance, and engineering, where regulatory compliance requires greater precision and traceability, the hybrid model provides a secure way to capture both operational and professional knowledge. Structured data analysis coupled with expert context helps companies not only meet regulatory requirements but also manage knowledge more strategically. In practice, the hybrid approach supports both compliance and innovation, which makes it applicable to multiple industries.

5. Conclusion

This study shows how a combination of expert interviews, machine learning, and data mining can help organizations capture knowledge effectively. The results show gains in accuracy and interpretability over either approach alone, suggesting that the hybrid method is a more holistic and adaptive means of knowledge acquisition. Combining human knowledge with computational methods makes it possible for companies to capture a wider spectrum of explicit and tacit knowledge, which supports better decision-making and strategic growth. The hybrid approach is also well suited to industries that depend on accurate data, such as healthcare, banking, and engineering. By addressing the practical challenges of traditional knowledge management solutions, it enhances organizational agility, compliance, and productivity. In addition, the expanded knowledge base created by this strategy facilitates employee learning, education, and innovation. Further work should examine how best to combine human and computational inputs to knowledge extraction and should test the scalability of the technique across a wide variety of industries and data environments.

References

[1]. Al-Huda, Zaid, et al. "A hybrid deep learning pavement crack semantic segmentation." Engineering Applications of Artificial Intelligence 122 (2023): 106142.

[2]. Farbiz, Farzam, et al. "Knowledge-embedded machine learning and its applications in smart manufacturing." Journal of Intelligent Manufacturing 34.7 (2023): 2889-2906.

[3]. Terbuch, Anika, et al. "Detecting anomalous multivariate time-series via hybrid machine learning." IEEE Transactions on Instrumentation and Measurement 72 (2023): 1-11.

[4]. Rajpoot, Vikram, Akhilesh Tiwari, and Anand Singh Jalal. "Automatic early detection of rice leaf diseases using hybrid deep learning and machine learning methods." Multimedia Tools and Applications 82.23 (2023): 36091-36117.

[5]. Abbaszadeh Shahri, Abbas, Shan Chunling, and Stefan Larsson. "A hybrid ensemble-based automated deep learning approach to generate 3D geo-models and uncertainty analysis." Engineering with Computers 40.3 (2024): 1501-1516.

[6]. Sharmin, Selina, et al. "A hybrid dependable deep feature extraction and ensemble-based machine learning approach for breast cancer detection." IEEE Access (2023).

[7]. Li, Jiating, et al. "Improved chlorophyll and water content estimations at leaf level with a hybrid radiative transfer and machine learning model." Computers and Electronics in Agriculture 206 (2023): 107669.

[8]. Kanti, Praveen Kumar, et al. "The effect of pH on stability and thermal performance of graphene oxide and copper oxide hybrid nanofluids for heat transfer applications: application of novel machine learning technique." Journal of Energy Chemistry 82 (2023): 359-374.

[9]. Mohanty, Cheena, et al. "Using deep learning architectures for detection and classification of diabetic retinopathy." Sensors 23.12 (2023): 5726.

[10]. Qazi, Emad Ul Haq, Muhammad Hamza Faheem, and Tanveer Zia. "HDLNIDS: hybrid deep-learning-based network intrusion detection system." Applied Sciences 13.8 (2023): 4921.

[11]. Lilhore, Umesh Kumar, et al. "HIDM: Hybrid intrusion detection model for industry 4.0 Networks using an optimized CNN-LSTM with transfer learning." Sensors 23.18 (2023): 7856.

Cite this article

Li, Z. (2024). Leveraging Computational Algorithms for Effective Explicit and Tacit Knowledge Capture: A Hybrid Approach Combining Expert Interviews, Machine Learning, and Data Mining Techniques. Applied and Computational Engineering, 114, 40-45.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of the 2nd International Conference on Machine Learning and Automation

ISBN: 978-1-83558-781-2 (Print) / 978-1-83558-782-9 (Online)

Editor: Mustafa ISTANBULLU

Conference website: https://2024.confmla.org/

Conference date: 21 November 2024

Series: Applied and Computational Engineering

Volume number: Vol.114

ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.