Many clinical trials involve partially clustered data, where some observations belong to a cluster and others can be considered independent. For example, neonatal trials may include infants from single or multiple births. Sample size and analysis methods for these trials have received limited attention. A simulation study was conducted to (1) assess whether existing power formulas based on generalized estimating equations (GEEs) provide an adequate approximation to the power achieved by mixed effects models, and (2) compare the performance of mixed models versus GEEs in estimating the effect of treatment on a continuous outcome. We considered clusters that exist prior to randomization with a maximum cluster size of 2 and three methods of randomizing the clustered observations. Datasets were simulated with uninformative cluster size and with the sample size required to achieve 80% power according to GEE-based formulas assuming an independence or exchangeable working correlation structure. The empirical power of the mixed model approach was close to the nominal level when the sample size was calculated using the exchangeable GEE formula, but often exceeded the nominal level when the sample size was based on the independence GEE formula. The independence GEE always converged and performed well in all scenarios. Performance of the exchangeable GEE and mixed model was also acceptable under cluster randomization, though under-coverage and inflated type I error rates could occur with other methods of randomization. Analysis of partially clustered trials using GEEs with an independence working correlation structure may be preferred to avoid the limitations of mixed models and exchangeable GEEs.
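As an illustration of the kind of simulation described above, the following is a minimal sketch, not the authors' code. It assumes cluster randomization, singletons and pairs only, unit total variance, and hypothetical values for the sample sizes, effect size, and intracluster correlation. It relies on the fact that, with only an intercept and a binary treatment indicator, the independence GEE point estimate equals the difference in arm means, and it computes the cluster-robust (sandwich) standard error from cluster sums without any small-sample correction. All function names are illustrative.

```python
import math
import random
import statistics


def simulate_trial(n_single, n_pairs, effect, icc, rng):
    """Simulate one partially clustered trial under cluster randomization.

    Returns a list of (cluster_id, arm, outcome) rows. Outcomes have total
    variance 1, split into a shared cluster effect (variance icc) and an
    individual error (variance 1 - icc). All settings are assumptions.
    """
    sd_between = math.sqrt(icc)
    sd_within = math.sqrt(1.0 - icc)
    rows, cluster_id = [], 0
    for size, count in ((1, n_single), (2, n_pairs)):
        # Balanced allocation within each cluster size, in random order.
        arms = [i % 2 for i in range(count)]
        rng.shuffle(arms)
        for arm in arms:
            shared = rng.gauss(0.0, sd_between)
            for _ in range(size):
                y = effect * arm + shared + rng.gauss(0.0, sd_within)
                rows.append((cluster_id, arm, y))
            cluster_id += 1
    return rows


def independence_gee(rows):
    """Difference in arm means with a cluster-robust standard error."""
    n1 = sum(1 for _, arm, _ in rows if arm == 1)
    n0 = len(rows) - n1
    mean1 = sum(y for _, arm, y in rows if arm == 1) / n1
    mean0 = sum(y for _, arm, y in rows if arm == 0) / n0
    # Sum weighted residuals within each cluster, then square and add.
    cluster_sums = {}
    for cluster_id, arm, y in rows:
        weight = 1.0 / n1 if arm == 1 else -1.0 / n0
        residual = y - (mean1 if arm == 1 else mean0)
        cluster_sums[cluster_id] = cluster_sums.get(cluster_id, 0.0) + weight * residual
    variance = sum(s * s for s in cluster_sums.values())
    return mean1 - mean0, math.sqrt(variance)


def empirical_power(n_sims, n_single, n_pairs, effect, icc, seed=1):
    """Proportion of simulated trials rejecting at the two-sided 5% level."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(0.975)
    rejections = 0
    for _ in range(n_sims):
        estimate, se = independence_gee(
            simulate_trial(n_single, n_pairs, effect, icc, rng)
        )
        if abs(estimate / se) > z_crit:
            rejections += 1
    return rejections / n_sims
```

Calling `empirical_power(1000, 100, 50, 0.5, 0.5)` estimates power for one hypothetical scenario; setting `effect=0.0` gives the empirical type I error rate instead. Other randomization methods and the mixed model and exchangeable GEE analyses compared in the study would need additional code.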
Keywords: clinical trials; generalized estimating equations; mixed effects models; partial clustering; power; simulation study.
© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.