When comparing reconstruction algorithms, differences in figures of merit that are too small to be of any practical relevance may still be statistically significant. We formalize the notion of "relevance" and propose an evaluation methodology in which statistical significance is retained for relevant improvements but not for irrelevant ones.