When a ranking of institutions such as medical centers or universities is based on a numerical measure of performance provided with a standard error, confidence intervals (CIs) should be calculated to assess the uncertainty of these ranks. We present a novel method based on Tukey's honest significant difference test to construct simultaneous CIs for the true ranks. When all the true performances are equal, the probability of coverage of our method attains the nominal level. In case the true performance measures have no exact ties, our method is conservative. For this situation, we propose a rescaling method to the nominal level that results in shorter CIs while keeping control of the simultaneous coverage. We also show that a similar rescaling can be applied to correct a recently proposed Monte-Carlo based method, which is anticonservative. After rescaling, the two methods perform very similarly. However, the rescaling of the Monte-Carlo based method is computationally much more demanding and becomes infeasible when the number of institutions is larger than 30-50. We discuss another recently proposed method similar to ours based on simultaneous CIs for the true performance. We show that our method provides uniformly shorter CIs for the same confidence level. We illustrate the superiority of our new methods with a data analysis for travel time to work in the United States and on rankings of 64 hospitals in the Netherlands.
Keywords: Monte-Carlo; Tukey's HSD; league tables; multiple comparisons; rankability.
© 2020 The International Biometric Society.