The goal of this study was to demonstrate that information theory can be used to prioritize mammographic features in order to stratify breast cancer risk efficiently. We compared two approaches: Single-dimensional Mutual Information (SMI), which ranks each feature by its mutual information with the outcome without considering dependencies among features, and Multidimensional Mutual Information (MMI), which ranks features while accounting for those dependencies. To evaluate these approaches, we calculated the area under the ROC curve for Bayesian networks trained and tested on the features ranked by each approach. Both approaches were able to stratify mammograms by risk, but MMI required fewer features (ten vs. thirteen). MMI-based rankings may therefore have greater clinical utility: a smaller feature set allows radiologists to focus on the findings with the highest yield and may, in the future, help improve mammography workflow.
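The difference between the two ranking strategies can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes discrete features, plug-in (frequency-based) probability estimates, and a greedy reading of MMI in which each step adds the feature that most increases the joint mutual information of the selected set with the outcome. The function names, feature names, and toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def rank_smi(features, outcome):
    """SMI-style ranking: score each feature on its own, ignoring the others."""
    return sorted(
        features,
        key=lambda name: mutual_information(features[name], outcome),
        reverse=True,
    )


def rank_mmi(features, outcome):
    """MMI-style ranking (assumed greedy variant): repeatedly add the feature
    that maximizes the joint mutual information of the selected set with the outcome."""
    joint = [()] * len(outcome)  # joint value of the selected features, per case
    remaining, ranked = list(features), []
    while remaining:
        def joint_with(name):
            return [j + (v,) for j, v in zip(joint, features[name])]

        best = max(remaining, key=lambda name: mutual_information(joint_with(name), outcome))
        joint = joint_with(best)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Hypothetical toy data: 8 cases, binary outcome (1 = malignant).
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "mass_margin":   [0, 0, 0, 1, 1, 1, 1, 1],  # strongest single feature
    "mass_shape":    [0, 0, 1, 1, 1, 1, 1, 1],  # largely redundant with mass_margin
    "calcification": [1, 1, 1, 0, 1, 1, 1, 1],  # weak alone, complementary to mass_margin
}

smi_order = rank_smi(features, outcome)
mmi_order = rank_mmi(features, outcome)
print(smi_order)  # ['mass_margin', 'mass_shape', 'calcification']
print(mmi_order)  # ['mass_margin', 'calcification', 'mass_shape']
```

In this toy example the redundant feature ranks second under the SMI-style ordering but last under the MMI-style ordering, because it adds no information once the first feature is known. This is the mechanism by which a dependency-aware ranking can reach the same stratification with fewer features.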