Metastases have been widely thought to arise from rare, selected, mutation-bearing cells in the primary tumor. Recently, however, it has been proposed that breast tumors are imprinted ab initio with metastatic ability. There is thus a debate over whether 'phenotypic' disease progression is really associated with 'molecular' progression. We profiled 26 matched primary breast tumors and lymph node metastases and identified 270 probesets that could discriminate between the two categories. We then used an independent cohort of breast tumors (81 samples) and unmatched distant metastases (32 samples) to validate this list and refine it to 126 probesets. A representative subset of these genes was analyzed by in situ hybridization on a third independent cohort (57 primary breast tumors and matched lymph node metastases). This not only confirmed the expression profile data, but also allowed us to establish the cellular origin of the signals. About one-third of the analyzed representative genes (4 of 11) were expressed by the epithelial component. These four epithelial genes alone were able to discriminate primary breast tumors from their metastases. Finally, engineered alterations in the expression of two of the epithelial genes (SERPINB5 and LTF) modified cell motility in vitro, consistent with a possible causal role in metastasis. Our results show that breast cancer metastases are molecularly distinct from their primary tumors.