It is well known that medical coding practice is inconsistent and that differences in usage may exist even at the institutional level. In this paper we introduce a novel method for investigating code usage patterns in clinical documentation corpora. By applying the Compression-based Dissimilarity Measure to quantify the similarity between encounter notes, we find that certain notes can be associated with a number of different classifications and that a given classification code can be documented in fundamentally different ways. As a result, some notes need to be understood in the context of their classification code, a finding which has implications for data mining and information extraction tasks. In addition, the method opens up a number of interesting application areas, including highlighting code use anomalies, measuring how coding practice changes over time, comparing code usage across institutions, and, perhaps most importantly, providing valuable feedback to developers of classification coding systems.
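To make the measure concrete, the following is a minimal sketch of the Compression-based Dissimilarity Measure in its usual form, CDM(x, y) = C(xy) / (C(x) + C(y)), where C(.) is the compressed size of its argument. The choice of zlib as the compressor, the function names, and the three example notes are assumptions made for illustration only; they are not taken from the study's corpus or implementation.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text.

    zlib is an assumed stand-in; any off-the-shelf compressor could be used.
    """
    return len(zlib.compress(text.encode("utf-8"), 9))


def cdm(x: str, y: str) -> float:
    """Compression-based Dissimilarity Measure: C(xy) / (C(x) + C(y)).

    Values near 0.5 suggest highly related texts, since the concatenation
    compresses almost as well as one text alone; values near 1 suggest
    texts that share little compressible structure.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


# Invented example notes (hypothetical, not real clinical data).
note_a = (
    "Patient presents with persistent cough and mild fever for five days. "
    "Chest auscultation reveals crackles in the lower right lobe. "
    "Prescribed antibiotics and advised rest and fluids."
)
note_b = (
    "Patient presents with persistent cough and low fever for four days. "
    "Chest auscultation reveals crackles in the lower left lobe. "
    "Prescribed antibiotics and advised rest."
)
note_c = (
    "Follow-up visit after knee surgery. Wound healing well, sutures removed. "
    "Range of motion improving with physiotherapy twice weekly."
)

if __name__ == "__main__":
    print("a vs b:", round(cdm(note_a, note_b), 3))
    print("a vs c:", round(cdm(note_a, note_c), 3))
```

Under these assumptions, the two respiratory notes should score as less dissimilar to each other than either does to the orthopaedic note. Comparing such scores against the classification codes assigned to each note is one way the usage patterns described above could be examined.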