Reducing systematic review workload through certainty-based screening

J Biomed Inform. 2014 Oct:51:242-53. doi: 10.1016/j.jbi.2014.06.005. Epub 2014 Jun 19.

Abstract

In systematic reviews, the growing number of published studies imposes a significant screening workload on reviewers. Active learning is a promising approach to reducing this workload by automating some of the screening decisions, but it has been evaluated in only a limited number of disciplines. The suitability of active learning for complex topics in disciplines such as social science has not been studied, and the selection of useful criteria and enhancements to address the data imbalance problem in systematic reviews remains an open problem. We applied active learning with two criteria (certainty and uncertainty) and several enhancements in both clinical medicine and social science (specifically, public health), and compared the results across the two areas. The results show that the certainty criterion is useful for finding relevant documents, and that weighting positive instances is a promising way to overcome the data imbalance problem in both data sets. Latent Dirichlet allocation (LDA) is also shown to be promising when little manually assigned information is available. Active learning is effective for complex topics, although its efficiency is limited by the difficulty of text classification. The most promising criterion and weighting method are the same regardless of the review topic, and unsupervised techniques such as LDA may boost the performance of active learning without manual annotation.
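
As a rough illustration of the two selection criteria and of positive-instance weighting, the sketch below assumes that a classifier has already assigned each unscreened document an estimated probability of relevance. It also assumes that "certainty" means choosing the documents scored as most likely relevant and that "uncertainty" means choosing those closest to the decision boundary. The function names, the 0.5 boundary, and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List


def select_batch(scores: List[float], criterion: str, k: int) -> List[int]:
    """Return the indices of the k unscreened documents to show reviewers next.

    scores[i] is a hypothetical classifier's estimated probability that
    document i is relevant (an assumption; the abstract names no classifier).
    """
    if criterion == "certainty":
        # Assumed reading: documents most confidently predicted relevant first.
        def key(i: int) -> float:
            return -scores[i]
    elif criterion == "uncertainty":
        # Assumed reading: documents closest to the 0.5 boundary first.
        def key(i: int) -> float:
            return abs(scores[i] - 0.5)
    else:
        raise ValueError(f"unknown criterion: {criterion!r}")
    return sorted(range(len(scores)), key=key)[:k]


def instance_weights(labels: List[int], positive_weight: float) -> List[float]:
    """Give relevant (label 1) documents a larger training weight.

    This is one simple way to counter class imbalance; the weight value
    itself is a free parameter chosen by the user.
    """
    return [positive_weight if y == 1 else 1.0 for y in labels]


# Illustrative scores for five unscreened documents (made-up numbers).
scores = [0.91, 0.48, 0.07, 0.55, 0.80]
certain = select_batch(scores, "certainty", 2)
uncertain = select_batch(scores, "uncertainty", 2)
weights = instance_weights([1, 0, 0], positive_weight=5.0)
```

In a full active learning loop, the selected documents would be screened manually, their labels added to the training set with the weights above, and the classifier retrained before the next batch is selected.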

Keywords: Active learning; Certainty; Systematic reviews; Text mining.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing* / methods
  • Algorithms*
  • Artificial Intelligence*
  • Databases, Bibliographic* / classification
  • Manuscripts as Topic
  • Natural Language Processing*
  • Peer Review, Research / methods
  • Semantics
  • Systematic Reviews as Topic*
  • Workload*