Identifying error-making patterns in assessment of mammographic BI-RADS descriptors among radiology residents using statistical pattern recognition

Maciej A Mazurowski; Huiman X Barnhart; Jay A Baker; Georgia D Tourassi

doi:10.1016/j.acra.2012.01.012

Acad Radiol. 2012 Jul;19(7):865-71. doi: 10.1016/j.acra.2012.01.012. Epub 2012 Mar 27.

Authors

Maciej A Mazurowski¹, Huiman X Barnhart, Jay A Baker, Georgia D Tourassi

Affiliation

¹ Department of Radiology, Box 3808, Duke University Medical Center, Durham, NC 27710, USA. maciej.mazurowski@duke.edu

PMID: 22459643
DOI: 10.1016/j.acra.2012.01.012

Abstract

Rationale and objective: The objective of this study is to test the hypothesis that there are patterns in erroneous assessment of BI-RADS features among radiology trainees when interpreting mammographic masses and that these patterns can be captured in individualized statistical user models. Identifying these patterns could be useful in personalizing and adapting educational material to address the individual weaknesses of each trainee during his or her mammography education.

Materials and methods: Reading data from 33 mammographic cases containing masses were used. The cases were individually described by 10 radiology residents using four BI-RADS features: mass shape, mass margin, mass density, and parenchyma density. For each resident, an individual model was automatically constructed to predict the likelihood (HIGH or LOW) that the resident would erroneously assign each BI-RADS descriptor. Error was defined as deviation of the resident's assessment from the expert assessments. We evaluated the predictive performance of the models using leave-one-out cross-validation.
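
As an illustration of the leave-one-out evaluation described above, the following is a minimal sketch for one resident and one BI-RADS feature. The abstract does not specify which pattern recognition model was used, so the predictor here is a deliberately simple stand-in (an assumption): a held-out case is labeled HIGH when the resident's error rate on training cases with the same expert descriptor exceeds that resident's overall training error rate. The function name, the data layout, and the example readings are all hypothetical and are not taken from the study.

```python
def loo_error_likelihood(cases):
    """Leave-one-out prediction of error likelihood for one resident and one feature.

    cases: list of (expert_label, resident_label) pairs, one pair per case.
    Returns one 'HIGH' or 'LOW' prediction per case. Each prediction is made
    from the remaining cases only, so the held-out case never informs itself.
    """
    predictions = []
    for i, (expert_label, _) in enumerate(cases):
        # Hold out case i and train on everything else.
        training = cases[:i] + cases[i + 1:]

        # Overall error rate of this resident on the training cases.
        all_errors = [expert != resident for expert, resident in training]
        overall_rate = sum(all_errors) / len(all_errors)

        # Error rate restricted to training cases with the same expert descriptor.
        same_errors = [
            expert != resident
            for expert, resident in training
            if expert == expert_label
        ]
        # Fall back to the overall rate if this descriptor is unseen in training.
        rate = sum(same_errors) / len(same_errors) if same_errors else overall_rate

        predictions.append("HIGH" if rate > overall_rate else "LOW")
    return predictions


# Hypothetical mass-shape readings (expert, resident); not data from the study.
demo_cases = [
    ("oval", "oval"),
    ("oval", "oval"),
    ("oval", "oval"),
    ("irregular", "round"),
    ("irregular", "lobular"),
    ("irregular", "irregular"),
]
print(loo_error_likelihood(demo_cases))
```

In this made-up example the resident only ever errs on masses the expert called irregular, so the held-out irregular cases are flagged HIGH and the oval cases LOW. A real user model would likely use richer inputs than the expert descriptor alone.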

Results: The user models were able to predict which assessments had a higher likelihood of error. For three of the four BI-RADS features, the ratio of actual errors to the number of situations in which those errors could occur was significantly higher (P < .05) when the user model assigned a HIGH likelihood of error than when it assigned a LOW likelihood. With all four features combined, the difference between the HIGH and LOW likelihood-of-error groups was statistically significant (P < .0001).
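
The comparison of error proportions between the HIGH and LOW groups can be sketched as follows. The abstract does not name the statistical test that was used, so this example assumes a standard two-sided two-proportion z-test with a pooled standard error. The function name and the counts in the usage line are hypothetical and are not results from the study.

```python
import math


def two_proportion_z_test(err_high, n_high, err_low, n_low):
    """Two-sided z-test for a difference between two error proportions.

    err_high / n_high: errors and opportunities when the model predicted HIGH.
    err_low / n_low:   errors and opportunities when the model predicted LOW.
    Returns (z, p_value). Assumes the pooled proportion is strictly between
    0 and 1; otherwise the standard error is zero and the test is undefined.
    """
    p_high = err_high / n_high
    p_low = err_low / n_low

    # Pooled error proportion under the null hypothesis of no difference.
    pooled = (err_high + err_low) / (n_high + n_low)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))

    z = (p_high - p_low) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for illustration only.
z, p = two_proportion_z_test(err_high=30, n_high=100, err_low=10, n_low=100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With small groups, an exact test such as Fisher's would be a more cautious choice than this normal approximation.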

Conclusion: Error making in BI-RADS descriptor assessment appears to follow patterns that can be captured with statistical pattern recognition-based user models.

MeSH terms

Breast Neoplasms / diagnostic imaging*
Diagnostic Errors*
Female
Humans
Internship and Residency*
Mammography*
Models, Statistical
Radiology / education*