Purpose: Score reproducibility is an important measurement property of fit-for-purpose patient-reported outcome (PRO) measures. It is commonly assessed via test-retest reliability, and best evaluated with a stable participant sample, which can be challenging to identify in diseases with highly variable symptoms. To provide empirical evidence comparing the retrospective (patient global impression of change [PGIC]) and current state (patient global impression of severity [PGIS]) approaches to identifying a stable subgroup for test-retest analyses, 3 PRO Consortium working groups collected data using both items as anchor measures.
Methods: The PGIS was completed on Day 1 and Day 8 + 3 in the depression and non-small cell lung cancer (NSCLC) studies. In the asthma study, it was completed daily, and responses on Day 3 and Day 10 were compared. The PGIC was completed on the final day of each study. For each anchor, scores at the two timepoints were compared using an intraclass correlation coefficient (ICC) among participants classified as "no change" on that anchor.
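The stable-subgroup comparison described in the Methods can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the ICC form used here is ICC(2,1) (two-way random effects, absolute agreement, single measurement), the PGIC "no change" response is coded as 0, and all data values and function names are hypothetical.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: array-like of shape (n_participants, n_timepoints).
    The ICC form is an assumption; the abstract does not specify one.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    # Sums of squares for participants (rows), timepoints (columns), and error.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_error = np.sum((x - grand_mean) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def stable_by_pgis(pgis_t1, pgis_t2):
    """Current-state anchor: stable if the PGIS response is identical at both timepoints."""
    return np.asarray(pgis_t1) == np.asarray(pgis_t2)


def stable_by_pgic(pgic, no_change_code=0):
    """Retrospective anchor: stable if the PGIC response equals the 'no change' code.

    no_change_code=0 is a hypothetical coding, not taken from the studies.
    """
    return np.asarray(pgic) == no_change_code


# Hypothetical data for six participants (not from the studies).
pro_scores = np.array([[10, 11], [14, 13], [20, 19], [8, 15], [16, 16], [12, 12]])
pgis_t1 = [1, 2, 3, 1, 2, 1]
pgis_t2 = [1, 2, 3, 2, 2, 1]
pgic = [0, 0, 1, 0, 0, -1]

icc_pgis = icc_2_1(pro_scores[stable_by_pgis(pgis_t1, pgis_t2)])
icc_pgic = icc_2_1(pro_scores[stable_by_pgic(pgic)])
print(f"ICC, PGIS-stable subgroup: {icc_pgis:.2f}")
print(f"ICC, PGIC-stable subgroup: {icc_pgic:.2f}")
```

The two masks generally select different participants, so the same PRO scores can yield different reliability estimates depending on the anchor, which is the comparison the studies report.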
Results: ICCs in the PGIS "no change" group were higher than in the PGIC "no change" group for depression (0.84 vs. 0.74), nighttime asthma (0.95 vs. 0.53), and daytime asthma (0.86 vs. 0.68). ICCs were similar for NSCLC (PGIS: 0.87; PGIC: 0.85).
Conclusion: When considering anchor measures to identify a stable subgroup for test-retest reliability analyses, current state anchors perform better than retrospective anchors. Researchers should carefully consider the type of anchor selected and the time period it covers, ensure that anchor content is consistent with the target measure concept, and consider including both current and retrospective anchor measures.
Keywords: Clinical outcome assessment; PGIC; PGIS; Patient-reported outcome measure; Test–retest reliability.
© 2022. The Author(s).