Using information criteria to select smoothing parameters when analyzing survival data with time-varying coefficient hazard models

Stat Methods Med Res. 2023 Sep;32(9):1664-1679. doi: 10.1177/09622802231181471. Epub 2023 Jul 5.

Abstract

Analyzing the large-scale survival data from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program may help guide the management of cancer. Detecting and characterizing the time-varying effects of factors collected at the time of diagnosis could reveal important and useful patterns. However, fitting a time-varying effect model by maximizing the partial likelihood with such large-scale survival data is not feasible with most existing software. Moreover, estimating time-varying coefficients using spline-based approaches requires a moderate number of knots, which may lead to unstable estimation and over-fitting. To resolve these issues, adding a penalty term greatly aids estimation. Selecting the penalty smoothing parameter is difficult in this time-varying setting: traditional approaches such as the Akaike information criterion do not work, while cross-validation methods carry a heavy computational burden and yield unstable selections. We propose modified information criteria to determine the smoothing parameter and a parallelized Newton-based algorithm for estimation. We conduct simulations to evaluate the performance of the proposed method. We find that penalization with the smoothing parameter chosen by a modified information criterion is effective at reducing the mean squared error of the estimated time-varying coefficients. Compared with a number of alternatives, the variance estimates derived from Bayesian considerations yield confidence intervals with the best coverage rates. We apply the method to SEER head-and-neck, colon, prostate, and pancreatic cancer data and detect the time-varying nature of various risk factors.
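
For orientation, the quantities named in the abstract can be written in a generic form. The notation below is assumed for illustration and is not taken from the paper: $\lambda_0$ is a baseline hazard, $B_k$ are spline basis functions, $\theta$ collects the spline coefficients, $S$ is a roughness penalty matrix, $\mu$ is the smoothing parameter, and $c_n$ is a generic complexity weight. The exact form of the authors' modified criteria is not stated in the abstract, so this is only a sketch of the usual structure of such a method.

```latex
% Time-varying coefficient hazard model with spline-expanded effects
% (generic form; all symbols are assumptions for illustration)
\lambda(t \mid X_i) = \lambda_0(t)\,
  \exp\Big\{ \sum_{p=1}^{P} X_{ip}\,\beta_p(t) \Big\},
\qquad
\beta_p(t) = \sum_{k=1}^{K} \theta_{pk}\,B_k(t).

% Penalized log partial likelihood, with \ell(\theta) the log partial likelihood
\ell_\mu(\theta) = \ell(\theta) - \frac{\mu}{2}\,\theta^{\top} S\,\theta .

% One Newton step for maximizing \ell_\mu, with H(\theta) = -\nabla^2 \ell(\theta)
\theta^{(m+1)} = \theta^{(m)}
  + \big\{ H(\theta^{(m)}) + \mu S \big\}^{-1}
    \big\{ \nabla \ell(\theta^{(m)}) - \mu S\,\theta^{(m)} \big\}.

% Generic information criterion for choosing \mu via effective degrees of freedom
\mathrm{IC}(\mu) = -2\,\ell(\hat\theta_\mu) + c_n\,\mathrm{df}(\mu),
\qquad
\mathrm{df}(\mu) = \operatorname{tr}\Big[ \big\{ H(\hat\theta_\mu) + \mu S \big\}^{-1}
  H(\hat\theta_\mu) \Big].
```

In this generic form, $c_n = 2$ would give an AIC-type criterion, which the abstract reports as inadequate in this setting; the proposed modified criteria presumably differ in how the complexity term is defined or weighted.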

Keywords: Cancer study; information criteria; penalization; survival analysis; time-varying effects.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Humans
  • Male
  • Models, Statistical*
  • Pancreatic Neoplasms*
  • Proportional Hazards Models
  • Risk Factors