The reliability of medical group performance measurement in a single insurer's pay for performance program

Hector P Rodriguez; Lisa Perry; Douglas A Conrad; Charles Maynard; Diane P Martin; David E Grembowski

doi:10.1097/MLR.0b013e31822dcddb

Med Care. 2012 Feb;50(2):117-23. doi: 10.1097/MLR.0b013e31822dcddb.

Authors

Hector P Rodriguez¹, Lisa Perry, Douglas A Conrad, Charles Maynard, Diane P Martin, David E Grembowski

Affiliation

¹ Department of Health Services, School of Public Health, University of California, Los Angeles, Los Angeles, CA 90095-1772, USA. hrod@ucla.edu

PMID: 21993058

Abstract

Background: Most public reporting and pay for performance (P4P) programs in the United States continue to be organized and implemented by single insurers. Adequate medical group-level reliability on clinical care process measures is possible in multistakeholder initiatives because patient samples can be pooled across payers. However, the extent to which reliable measurement is achievable in single insurer P4P initiatives remains unclear.

Methods: This study used 7 years (2001 to 2007) of patient-level clinical care process data from an insurer in Washington State involving 20 medical groups. Eight clinical care process measures were analyzed. We compared the medical group-level reliability and resulting sample size requirements for each of the 8 measures using unadjusted and adjusted binary mixed models. The relation between baseline intraclass correlation coefficients (ICCs) and change in medical group performance over time was examined for each clinical care process measure.
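The link between an ICC and a per-group sample size requirement can be illustrated with a minimal sketch. It is not the authors' code, and it rests on assumptions the abstract does not state: the Spearman-Brown formula for the reliability of a group mean, a reliability threshold of 0.70, and a logistic random-intercept model in which the latent-scale ICC uses a residual variance of π²/3. All function names and numeric inputs are hypothetical.

```python
import math


def latent_icc(group_variance: float) -> float:
    """ICC on the latent scale for a logistic random-intercept model.

    Assumption: the patient-level residual variance is pi^2 / 3, the
    variance of the standard logistic distribution.
    """
    if group_variance < 0.0:
        raise ValueError("group_variance must be non-negative")
    return group_variance / (group_variance + math.pi ** 2 / 3.0)


def group_reliability(icc: float, n_patients: float) -> float:
    """Spearman-Brown reliability of a group-level estimate from n patients."""
    return n_patients * icc / (1.0 + (n_patients - 1.0) * icc)


def required_sample_size(icc: float, target: float = 0.70) -> int:
    """Smallest number of patients per group that reaches the target reliability.

    The 0.70 default is an illustrative assumption, not a value taken
    from the abstract.
    """
    if not 0.0 < icc < 1.0:
        raise ValueError("icc must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = target * (1.0 - icc) / (icc * (1.0 - target))
    return math.ceil(n)


if __name__ == "__main__":
    icc = 0.05  # hypothetical ICC for one measure
    n_needed = required_sample_size(icc)
    print(n_needed, round(group_reliability(icc, n_needed), 3))
```

Under these assumptions, a smaller ICC implies a larger required patient sample per group, which is why measures with few eligible patients per group are the most likely to fall short.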

Results: Only 45% of all medical group measurements (group-years for all observations) had sufficient sample sizes to achieve reliable estimates of group performance. Measures with the largest deficiencies in patient samples per group included appropriate asthma treatment and low-density lipoprotein screening for patients with coronary artery disease. There was an inconsistent relationship between the size of baseline ICCs and medical group performance improvement over time.

Conclusions: Unreliable performance measurement is an important consequence of the prevailing organization and implementation of public reporting and P4P programs in the US. Multi-payer collaborations may be an important vehicle for ensuring reliable medical group performance measurement and comparisons on clinical care process measures.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Asthma / therapy
Coronary Artery Disease / blood
Glycated Hemoglobin / analysis
Humans
Insurance Carriers / standards
Lipoproteins, LDL / blood
Quality Indicators, Health Care / standards*
Reimbursement, Incentive / organization & administration
Reimbursement, Incentive / standards*
Reproducibility of Results
Sample Size
Time Factors
Washington

Substances

Glycated Hemoglobin A
Lipoproteins, LDL
hemoglobin A1c protein, human