Relative accuracy evaluation

PLoS One. 2014 Aug 18;9(8):e103853. doi: 10.1371/journal.pone.0103853. eCollection 2014.

Abstract

Data quality plays an important role in business analysis and decision making, and accuracy is an important aspect of data quality. One necessary task in data quality management is therefore to evaluate the accuracy of the data. Because the accuracy of a whole data set may be low while that of a useful part is high, it is also necessary to evaluate the accuracy of query results, called relative accuracy. However, as far as we know, neither a measure nor effective methods for such accuracy evaluation have been proposed. Motivated by this, we propose a systematic method for relative accuracy evaluation. We design a relative accuracy evaluation framework for relational databases, based on a new metric that measures accuracy using statistics. We apply the methods to evaluate the precision and recall of basic queries, which indicate the relative accuracy of the result. We also propose methods to handle data updates and to improve accuracy evaluation using functional dependencies. Extensive experimental results show the effectiveness and efficiency of the proposed framework and algorithms.
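
To make the notions of precision and recall of a query result concrete, the following minimal sketch compares the tuples a query returns over possibly inaccurate data with the tuples it would return over accurate data. It is only an illustration of the two ratios, not the statistical estimation method of the paper; the function name `precision_recall`, the example tuples, and the convention of returning 1.0 for an empty set are assumptions made here.

```python
def precision_recall(returned, correct):
    """Precision and recall of a query result.

    returned: tuples the query produced over the (possibly inaccurate) data
    correct:  tuples the query would produce over accurate data
    Both arguments are hypothetical inputs for this illustration.
    """
    returned = set(returned)
    correct = set(correct)
    hits = returned & correct  # tuples that are both returned and correct

    # Assumed convention: an empty denominator yields 1.0.
    precision = len(hits) / len(returned) if returned else 1.0
    recall = len(hits) / len(correct) if correct else 1.0
    return precision, recall


# Hypothetical result of a selection query over (name, city) tuples.
returned = [("alice", "NY"), ("bob", "LA"), ("carol", "SF")]
correct = [("alice", "NY"), ("bob", "LA"), ("dave", "NY"), ("erin", "LA")]

precision, recall = precision_recall(returned, correct)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the three returned tuples are correct, and two of the four correct tuples are returned, so precision is 2/3 and recall is 1/2. In practice the accurate result is not available, which is why an evaluation framework has to estimate these quantities rather than compute them directly.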

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Commerce
  • Data Interpretation, Statistical*
  • Humans
  • Models, Statistical
  • Research Design

Grants and funding

This paper was partially supported by NGFR 973 grant 2012CB316200 and NGFR 863 grant 2012AA011004. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.