Background: Lung cancer screening guidelines recommend using individualized risk models to refer ever-smokers for screening. However, different models select different screening populations. The performance of each model in selecting ever-smokers for screening is unknown.
Objective: To compare the U.S. screening populations selected by 9 lung cancer risk models (the Bach model; the Spitz model; the Liverpool Lung Project [LLP] model; the LLP Incidence Risk Model [LLPi]; the Hoggart model; the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial Model 2012 [PLCOM2012]; the Pittsburgh Predictor; the Lung Cancer Risk Assessment Tool [LCRAT]; and the Lung Cancer Death Risk Assessment Tool [LCDRAT]) and to examine their predictive performance in 2 cohorts.
Design: Population-based prospective studies.
Setting: United States.
Participants: Models selected U.S. screening populations by using data from the 2010 to 2012 National Health Interview Survey. Model performance was evaluated by using data from 337 388 ever-smokers in the National Institutes of Health-AARP Diet and Health Study and 72 338 ever-smokers in the CPS-II (Cancer Prevention Study II) Nutrition Survey cohort.
Measurements: Model calibration (ratio of model-predicted to observed cases [expected-observed ratio]) and discrimination (area under the curve [AUC]).
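As an informal illustration of these two measures (not part of the study's methods), the following sketch shows how an expected-observed ratio and an AUC could be computed from per-person model-predicted 5-year risks and observed lung cancer case status. The variable names, example data, and use of scikit-learn are assumptions for demonstration only.

```python
# Illustrative sketch, assuming arrays of predicted 5-year risks and
# observed case indicators (1 = lung cancer case); scikit-learn supplies the AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

def expected_observed_ratio(predicted_risk, observed_case):
    """Calibration: ratio of model-predicted cases to observed cases."""
    expected_cases = np.sum(predicted_risk)   # sum of individual predicted risks
    observed_cases = np.sum(observed_case)    # count of observed cases
    return expected_cases / observed_cases

def discrimination_auc(predicted_risk, observed_case):
    """Discrimination: area under the ROC curve."""
    return roc_auc_score(observed_case, predicted_risk)

# Hypothetical example data (not from either validation cohort).
predicted_risk = np.array([0.031, 0.008, 0.052, 0.015, 0.027])
observed_case = np.array([0, 0, 1, 0, 1])

print(expected_observed_ratio(predicted_risk, observed_case))  # <1 suggests underestimation, >1 overestimation
print(discrimination_auc(predicted_risk, observed_case))
```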
Results: At a 5-year risk threshold of 2.0%, the models chose U.S. screening populations ranging from 7.6 million to 26 million ever-smokers. These disagreements occurred because, in both validation cohorts, 4 models (the Bach model, PLCOM2012, LCRAT, and LCDRAT) were well-calibrated (expected-observed ratio range, 0.92 to 1.12) and had higher AUCs (range, 0.75 to 0.79) than 5 models that generally overestimated risk (expected-observed ratio range, 0.83 to 3.69) and had lower AUCs (range, 0.62 to 0.75). The 4 best-performing models also had the highest sensitivity at a fixed specificity (and vice versa) and similar discrimination at a fixed risk threshold. These models showed better agreement on the size of the screening population (7.6 million to 10.9 million) and achieved consensus on 73% of persons chosen.
Limitation: No consensus on risk thresholds for screening.
Conclusion: The 9 lung cancer risk models chose widely differing U.S. screening populations. However, 4 models (the Bach model, PLCOM2012, LCRAT, and LCDRAT) most accurately predicted risk and performed best in selecting ever-smokers for screening.
Primary funding source: Intramural Research Program of the National Institutes of Health/National Cancer Institute.