On optimal settings of classification tree ensembles for medical decision support

Health Informatics J. 2013 Mar;19(1):3-15. doi: 10.1177/1460458212446096.

Abstract

Pattern recognition and machine learning methods provide an attractive approach to building decision support systems. Classification trees are frequently used for such tasks owing to their intuitive structure and effectiveness. It has been shown that, for complex medical data, combining a number of base classifiers improves overall accuracy. Classification tree ensembles have a number of free parameters that can significantly affect their performance. In recent years such ensembles have often been used by practitioners without a mathematical background (e.g. physicians), who may be unaware of how to obtain the optimal settings. It is therefore difficult for them to choose satisfactory settings, and in most cases the default parameters are not necessarily the most efficient. The aim of this article is to ascertain which types of combined tree classifiers give the best performance for medical decision support and which parameters should be chosen for them. A set of rules for end-users on how to tune their ensembles is proposed.
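
As a rough illustration of what tuning a free parameter of a tree ensemble can look like, the sketch below compares several ensemble sizes on a held-out validation set. It is not the authors' procedure and uses no data or results from the article. The bagged one-level trees (decision stumps), the synthetic two-feature data, the candidate sizes, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-level classification tree by exhaustive search.

    Returns (feature, threshold, left_label, right_label) with the fewest
    training errors.
    """
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # The right side is empty when t is the largest value.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_bagging(X, y, n_trees, rng):
    """Train n_trees stumps, each on a bootstrap sample of the training set."""
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Combine the base classifiers by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


def accuracy(stumps, X, y):
    hits = sum(predict_bagging(stumps, row) == lab for row, lab in zip(X, y))
    return hits / len(y)


def make_data(n, rng):
    """Synthetic data (an assumption): class 1 when the two features sum above 1."""
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [int(a + b > 1.0) for a, b in X]
    return X, y


def tune_ensemble_size(candidates=(1, 5, 25), seed=0):
    """Score each candidate ensemble size on a validation set; return the best."""
    rng = random.Random(seed)
    X_train, y_train = make_data(120, rng)
    X_val, y_val = make_data(80, rng)
    scores = {}
    for n_trees in candidates:
        model = fit_bagging(X_train, y_train, n_trees, random.Random(seed))
        scores[n_trees] = accuracy(model, X_val, y_val)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = tune_ensemble_size()
    for n_trees, acc in scores.items():
        print(f"{n_trees:>3} trees: validation accuracy {acc:.3f}")
    print("selected ensemble size:", best)
```

The same loop extends to other free parameters, such as tree depth or the size of each bootstrap sample, by iterating over combinations of candidate values. On real clinical data, cross-validation would usually be preferable to a single train/validation split.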

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Benchmarking / methods*
  • Decision Support Systems, Clinical / standards*
  • Decision Trees*
  • Efficiency, Organizational*
  • Humans
  • Knowledge Bases
  • Pattern Recognition, Visual
  • User-Computer Interface