Hospital-specific outcome measures based on routine data are useful for stimulating interest in quality of care and for suggesting avenues for more in-depth analyses. They might also identify serious, once-in-a-lifetime failures of health care. However, such analyses are not definitive. They are a way of screening large amounts of routine data and, like all screening tools, they can generate false positives and false negatives. This is because differences in outcome measures across hospitals can be due to differences in the types of patients seen (casemix), differences in data quality, and the play of chance, rather than to differences in the quality of care. End-users of such analyses should be aware of these technical difficulties; otherwise, skilled health workers in high-quality hospitals might be subjected to unwarranted criticism.
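
To make the "play of chance" concrete, the following is a minimal sketch rather than a method taken from this report. It assumes a purely hypothetical setting: 100 hospitals of identical quality, each treating 200 patients with the same true death rate of 10%, and a screening rule that flags any hospital whose death count falls in the upper 2.5% tail of the binomial distribution. All of these numbers, and the function names, are illustrative assumptions. Casemix and data quality are deliberately ignored, so any hospital flagged here is a false positive by construction.

```python
from math import comb


def prob_at_least(k, n, p):
    """Exact binomial probability of k or more deaths among n patients."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def smallest_flag_threshold(n, p, alpha):
    """Smallest death count whose upper-tail probability is at most alpha."""
    for k in range(n + 1):
        if prob_at_least(k, n, p) <= alpha:
            return k
    return n + 1  # no attainable count is extreme enough to be flagged


# Hypothetical inputs (assumptions for illustration, not data from the source).
N_HOSPITALS = 100   # hospitals being screened
PATIENTS = 200      # patients per hospital
TRUE_RATE = 0.10    # identical true death rate in every hospital
ALPHA = 0.025       # upper-tail probability used as the flagging rule

threshold = smallest_flag_threshold(PATIENTS, TRUE_RATE, ALPHA)

# Chance that one hospital of average quality is flagged anyway.
p_flag = prob_at_least(threshold, PATIENTS, TRUE_RATE)

# Expected number of hospitals flagged purely by chance, and the chance
# that at least one is flagged, assuming hospitals are independent.
expected_false_positives = N_HOSPITALS * p_flag
p_any_flagged = 1 - (1 - p_flag) ** N_HOSPITALS

print(f"Flag a hospital at {threshold} or more deaths out of {PATIENTS}")
print(f"Chance a single average hospital is flagged: {p_flag:.4f}")
print(f"Expected false positives among {N_HOSPITALS}: {expected_false_positives:.2f}")
print(f"Chance at least one hospital is flagged: {p_any_flagged:.3f}")
```

Under these assumptions every hospital delivers the same quality of care, yet the screening rule is still expected to flag some of them. Tightening the rule reduces such false positives but makes it more likely that a genuinely poor performer is missed, which is the usual trade-off for any screening tool.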