A comparison of different population-level summary measures for randomised trials with time-to-event outcomes, with a focus on non-inferiority trials

Clin Trials. 2023 Dec;20(6):594-602. doi: 10.1177/17407745231181907. Epub 2023 Jun 20.

Abstract

Background: The population-level summary measure is a key component of the estimand for clinical trials with time-to-event outcomes. This is particularly the case for non-inferiority trials, because different summary measures imply different null hypotheses. Most trials are designed using the hazard ratio as the summary measure, but recent studies have suggested that the difference in restricted mean survival time might be more powerful, at least in certain situations. In a recent letter, we conjectured that differences between summary measures can be explained using the concept of the non-inferiority frontier. We also conjectured that, for a fair simulation comparison, the same analysis methods, making the same assumptions, should be used to estimate the different summary measures. The aim of this article is to make such a comparison between three commonly used summary measures: hazard ratio, difference in restricted mean survival time and difference in survival at a fixed time point. In addition, we aim to investigate the impact of using an analysis method that assumes proportional hazards on the operating characteristics of a trial designed with any of the three summary measures.
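
For reference, the three summary measures can be written using their standard definitions. The notation is an assumption made here for illustration and does not come from the abstract: h_0 and h_1 are the hazard functions in the control and experimental arms, S_0 and S_1 the corresponding survival functions, and tau the fixed time point or restriction time.

```latex
\begin{align*}
\mathrm{HR} &= \frac{h_1(t)}{h_0(t)} \quad \text{(constant in } t \text{ under proportional hazards)} \\
\mathrm{DS}(\tau) &= S_1(\tau) - S_0(\tau) \\
\mathrm{DRMST}(\tau) &= \int_0^{\tau} S_1(t)\,dt - \int_0^{\tau} S_0(t)\,dt \\
S_1(t) &= S_0(t)^{\mathrm{HR}} \quad \text{(under proportional hazards)}
\end{align*}
```

The last relation shows that, under proportional hazards, a given hazard ratio corresponds to different values of DS and DRMST depending on the control survival curve and on tau. This is one way to see why a margin set on one scale does not define the same null hypothesis as a margin set on another.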

Methods: We conduct a simulation study in the proportional hazards setting. We estimate difference in restricted mean survival time and difference in survival non-parametrically, without assuming proportional hazards. We also estimate all three measures parametrically, using flexible survival regression, under the proportional hazards assumption.
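
As a rough illustration of the non-parametric estimation step, the sketch below simulates one two-arm trial under proportional hazards and estimates the difference in survival and the difference in restricted mean survival time from Kaplan-Meier curves. It is a minimal sketch and not the authors' simulation code: the exponential event times, the administrative censoring time, the sample size, and all function and parameter names (`kaplan_meier`, `survival_at`, `rmst`, `simulate_arm`) are assumptions chosen for illustration.

```python
import numpy as np


def kaplan_meier(time, event):
    """Return distinct event times and the Kaplan-Meier estimate just after each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    # Number still at risk, and number of events, at each distinct event time.
    at_risk = np.array([(time >= t).sum() for t in event_times])
    deaths = np.array([((time == t) & event).sum() for t in event_times])
    surv = np.cumprod(1.0 - deaths / at_risk)
    return event_times, surv


def survival_at(time, event, tau):
    """Kaplan-Meier estimate of S(tau)."""
    t, s = kaplan_meier(time, event)
    idx = np.searchsorted(t, tau, side="right")  # count of event times <= tau
    return 1.0 if idx == 0 else float(s[idx - 1])


def rmst(time, event, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    t, s = kaplan_meier(time, event)
    keep = t <= tau
    grid = np.concatenate(([0.0], t[keep], [tau]))
    heights = np.concatenate(([1.0], s[keep]))  # curve height on each interval
    return float(np.sum(np.diff(grid) * heights))


def simulate_arm(rng, n, hazard, censor_time):
    """Exponential event times with administrative censoring (an assumption)."""
    latent = rng.exponential(1.0 / hazard, size=n)
    return np.minimum(latent, censor_time), latent <= censor_time


# Hypothetical scenario: control hazard 0.10, hazard ratio 1.0, tau = 5.
rng = np.random.default_rng(2023)
tau, hazard_ratio = 5.0, 1.0
t0, e0 = simulate_arm(rng, 500, 0.10, censor_time=6.0)
t1, e1 = simulate_arm(rng, 500, 0.10 * hazard_ratio, censor_time=6.0)

diff_surv = survival_at(t1, e1, tau) - survival_at(t0, e0, tau)
diff_rmst = rmst(t1, e1, tau) - rmst(t0, e0, tau)
print(f"DS({tau}) = {diff_surv:.3f}, DRMST({tau}) = {diff_rmst:.3f}")
```

A full simulation study would repeat this over many replicates and compare a confidence limit for each measure with its non-inferiority margin. The flexible parametric proportional hazards analysis described above is not shown here.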

Results: When the hazard ratio, estimated assuming proportional hazards, is compared with the other summary measures, estimated without assuming proportional hazards, relative performance varies substantially depending on the specific scenario. For a fixed summary measure, assuming proportional hazards always leads to substantial power gains compared with using non-parametric methods. When the modelling approach is fixed to flexible parametric regression assuming proportional hazards, difference in restricted mean survival time is most often the most powerful summary measure among those considered.

Conclusion: When the hazards are likely to be approximately proportional, reflecting this in the analysis can lead to large gains in power for difference in restricted mean survival time and difference in survival. The choice of summary measure for a non-inferiority trial with time-to-event outcomes should be made on clinical grounds; when any of the three summary measures discussed here is equally justifiable, difference in restricted mean survival time is most often associated with the most powerful test, on the condition that it is estimated under proportional hazards.

Keywords: Non-inferiority; difference in survival; estimands; hazard ratio; population-level summary measures; restricted mean survival time.

Publication types

  • Equivalence Trial
  • Randomized Controlled Trial
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Humans
  • Proportional Hazards Models
  • Research Design*
  • Sample Size
  • Survival Analysis
  • Time Factors