Exploring the Impact of Deleting (or Retaining) a Biased Item: A Procedure Based on Classification Accuracy

Assessment. 2024 Dec 10:10731911241298081. doi: 10.1177/10731911241298081. Online ahead of print.

Abstract

Psychological test scores are commonly used in high-stakes settings to classify individuals. While measurement invariance across groups is necessary for valid and meaningful inferences about group differences, full measurement invariance rarely holds in practice. The classification accuracy analysis framework aims to quantify the degree and practical impact of noninvariance. However, how best to navigate the next steps remains unclear, and methods devised to account for noninvariance at the group level may be insufficient when the goal is classification. Furthermore, deleting a biased item may improve fairness but negatively affect performance, and replacing the test can be costly. We propose item-level effect size indices that allow test users to make more informed decisions by quantifying the impact of deleting (or retaining) an item on test performance and fairness, provide an illustrative example, and introduce unbiasr, an R package implementing the proposed methods.
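
To make the underlying idea concrete, below is a minimal simulation sketch in base R of a classification accuracy comparison: two groups share the same latent trait distribution, one item has a lower intercept in the focal group, and selection indices (proportion selected, success ratio, sensitivity, specificity) are compared with the biased item retained versus deleted. All parameter values, cut points, and the simulation setup are illustrative assumptions; this is not the authors' proposed effect size indices nor the unbiasr API.

```r
set.seed(1)
n <- 5000                                  # examinees per group
lambda <- rep(0.7, 6)                      # common loadings for 6 items
nu <- rep(0, 6)                            # reference-group intercepts
nu_bias <- nu; nu_bias[6] <- -0.5          # item 6: lower intercept in the focal group

# Simulate item scores under a one-factor model and form total scores
sim_scores <- function(n, lambda, nu, sd_e = 0.6) {
  eta <- rnorm(n)                          # latent trait, same distribution in both groups
  y <- sapply(seq_along(lambda), function(j)
    nu[j] + lambda[j] * eta + rnorm(n, sd = sd_e))
  list(eta = eta, total = rowSums(y), total_drop = rowSums(y[, -6]))
}

ref <- sim_scores(n, lambda, nu)           # reference group: all items invariant
foc <- sim_scores(n, lambda, nu_bias)      # focal group: item 6 is biased

# Classification accuracy indices for one group:
# "truth" = latent trait above its cut; "selected" = observed score above its cut
accuracy <- function(eta, obs, cut_eta, cut_obs) {
  truth <- eta > cut_eta
  sel   <- obs > cut_obs
  c(PS = mean(sel),                        # proportion selected
    SR = mean(truth[sel]),                 # success ratio (positive predictive value)
    SE = mean(sel[truth]),                 # sensitivity
    SP = mean(!sel[!truth]))               # specificity
}

cut_eta  <- qnorm(0.8)                                            # select top 20% on the trait
cut_full <- quantile(c(ref$total, foc$total), 0.8)                # observed cut, full test
cut_drop <- quantile(c(ref$total_drop, foc$total_drop), 0.8)      # observed cut, item 6 deleted

# Group differences in the indices under "full" reflect item bias;
# they should largely disappear under "drop"
round(rbind(
  ref_full = accuracy(ref$eta, ref$total,      cut_eta, cut_full),
  foc_full = accuracy(foc$eta, foc$total,      cut_eta, cut_full),
  ref_drop = accuracy(ref$eta, ref$total_drop, cut_eta, cut_drop),
  foc_drop = accuracy(foc$eta, foc$total_drop, cut_eta, cut_drop)), 3)
```

In this sketch, any between-group gap in the indices for the full test is attributable to the biased intercept alone, since the latent trait distributions are identical; the comparison of "full" versus "drop" rows mimics the kind of question the paper's item-level indices address, i.e., what is gained or lost in accuracy and fairness by deleting a biased item.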

Keywords: R package; classification accuracy; fairness; item bias; measurement invariance.