Discovering hidden knowledge through auditing clinical diagnostic knowledge bases

J Biomed Inform. 2018 Aug:84:75-81. doi: 10.1016/j.jbi.2018.06.014. Epub 2018 Jun 22.

Abstract

Objective: To evaluate the potential of data mining auditing techniques to identify hidden concepts in diagnostic knowledge bases (KBs). Improving KB completeness enhances applications such as differential diagnosis and patient case simulation.

Materials and methods: The authors used three unsupervised methods, Pearson's correlation (PC), Kendall's correlation (KC), and a heuristic algorithm (HA), to identify existing and discover new finding-finding interrelationships ("properties") in the INTERNIST-1/QMR KB. They also estimated the KB maintenance efficiency gains (reduction in effort) offered by each approach.
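
As a rough illustration of the correlation-based screening (PC and KC), the sketch below scores pairs of findings by how similarly their frequencies vary across diseases and flags strongly correlated pairs as candidate "properties" for manual review. The toy frequency table, the 0.8 threshold, and the function names are assumptions made for illustration only; they are not taken from the INTERNIST-1/QMR KB or from the study, and the heuristic algorithm (HA) is not reproduced here because the abstract does not describe it.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: for each finding, a frequency score in each of six
# diseases (0 = finding not listed for that disease). Not real KB content.
FREQ = {
    "fever":    [4, 0, 3, 5, 0, 2],
    "chills":   [3, 0, 2, 4, 0, 1],
    "jaundice": [0, 5, 0, 1, 4, 0],
}


def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / all pairs, ties count as 0."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


def candidate_pairs(freq, threshold=0.8):
    """Return finding pairs whose |PC| or |KC| reaches the (assumed) threshold."""
    flagged = []
    for a, b in combinations(sorted(freq), 2):
        r = pearson(freq[a], freq[b])
        tau = kendall_tau_a(freq[a], freq[b])
        if max(abs(r), abs(tau)) >= threshold:
            flagged.append((a, b, round(r, 3), round(tau, 3)))
    return flagged


if __name__ == "__main__":
    for a, b, r, tau in candidate_pairs(FREQ):
        print(f"{a} ~ {b}: PC={r}, KC={tau}")
```

In this toy table only the fever/chills pair is flagged. In practice a flagged pair would still need expert review before being recorded as a property.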

Results: The methods discovered new properties at rates with 95% confidence intervals (CI) of [0.1%, 5.4%] (PC), [2.8%, 12.5%] (KC), and [5.6%, 18.8%] (HA). Estimated manual effort reduction for HA-assisted determination of new properties was approximately 50-fold.
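
For readers unfamiliar with interval estimates of a discovery rate, the sketch below computes a 95% confidence interval for a proportion using the Wilson score method. The abstract does not state which interval method or which counts the authors used, so both the choice of Wilson and the example counts are assumptions for illustration and will not reproduce the reported intervals.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against tiny floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts: 7 of 60 reviewed candidate pairs judged to be
    # genuinely new properties. These numbers are not from the study.
    low, high = wilson_interval(7, 60)
    print(f"rate = {7 / 60:.1%}, 95% CI = [{low:.1%}, {high:.1%}]")
```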

Conclusion: Data mining can efficiently supplement manual efforts to ensure the completeness of finding-finding interdependencies in diagnostic knowledge bases. The authors' findings should be applicable to other diagnostic systems that record finding frequencies within diseases (e.g., DXplain, ISABEL).

Keywords: Data auditing; Internal medicine; Knowledge bases.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Data Mining / methods*
  • Diagnosis, Computer-Assisted / methods*
  • Diagnosis, Differential
  • Expert Systems
  • Humans
  • Knowledge Bases*
  • Machine Learning
  • Medical Informatics / methods*
  • Models, Statistical
  • ROC Curve