With the increasing use of immune checkpoint inhibitors (ICIs), there is an urgent need for biomarkers that stratify responders and non-responders based on programmed death-ligand 1 (PD-L1) expression and that predict patient-specific outcomes such as progression-free survival (PFS). This study aimed to determine the feasibility of building imaging-based predictive biomarkers for PD-L1 and PFS by systematically evaluating combinations of machine learning algorithms and feature selection methods. We conducted a retrospective, multicenter study of 385 patients with advanced non-small cell lung cancer (NSCLC) amenable to ICIs at two academic centers. Radiomic features extracted from pretreatment CT scans were used to build predictive models for PD-L1 and PFS (short-term vs. long-term survivors). We first applied LASSO, followed by five feature selection methods and seven machine learning approaches, to build the predictors. Several combinations of feature selection methods and machine learning algorithms achieved similar performance. Logistic regression with ReliefF feature selection (AUC = 0.64 and 0.59 in the discovery and validation cohorts) and SVM with ANOVA F-test feature selection (AUC = 0.64 and 0.63 in the discovery and validation cohorts) were the best-performing models for predicting PD-L1 and PFS, respectively. This study demonstrates how suitable feature selection approaches and machine learning algorithms can be applied to predict clinical endpoints from radiomic features, and it identifies a subset of algorithms that should be considered in future investigations for building robust and clinically relevant predictive models.
© 2023. The Author(s).
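As an illustration of two building blocks named in the abstract, the sketch below ranks features by a one-way ANOVA F-statistic on a discovery split and then reports the AUC of each selected feature on a held-out split. It is a minimal, standard-library-only sketch and not the study's pipeline: the data are synthetic, the function names (`anova_f`, `auc`, `select_top_k`), the split sizes, and `k = 2` are all assumptions made for illustration, and the LASSO and classifier steps (e.g. SVM) are deliberately omitted. The univariate AUC here assumes that higher feature values indicate the positive class.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic for a single feature across label groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    means = {y: mean(g) for y, g in groups.items()}
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (means[y] - grand) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def select_top_k(X, y, k):
    """Indices of the k features with the largest F-statistic."""
    n_features = len(X[0])
    f_scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: f_scores[j], reverse=True)[:k]


# Synthetic stand-in for a radiomic feature matrix (hypothetical data):
# 200 samples, 10 features, only feature 3 carries signal.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[rng.gauss(2.0 * label if j == 3 else 0.0, 1.0) for j in range(10)]
     for label in y]

# Hypothetical discovery/validation split
X_disc, y_disc = X[:140], y[:140]
X_val, y_val = X[140:], y[140:]

# Select features on the discovery split only, then score on the held-out split
selected = select_top_k(X_disc, y_disc, k=2)
for j in selected:
    print(f"feature {j}: validation AUC = {auc([row[j] for row in X_val], y_val):.2f}")
```

Selecting features on the discovery split alone, as done here, keeps the validation AUC free of selection leakage; in a fuller pipeline the selected features would feed a classifier rather than being scored one at a time.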