In mass spectrometry (MS) experiments, thousands of peaks are detected in the space of mass-to-charge ratio and chromatographic retention time, each associated with an abundance measurement. However, a large proportion of these peaks consists of experimental noise, and low-abundance compounds are typically masked by noise peaks, compromising the quality of the data. In this paper, we propose a new measure of similarity between a pair of MS experiments, called truncated rank correlation (TRC). To provide a robust metric of similarity in noisy high-dimensional data, TRC calculates correlation using only the truncated top ranks (top m-ranks). A comprehensive numerical study suggests that TRC outperforms traditional sample correlation and Kendall's τ. We apply TRC to measuring the test-retest reliability of two MS experiments: a biological replicate analysis of the metabolome in HEK293 cells and metabolomic profiling of benign prostate hyperplasia (BPH) patients. An R package, trc, implementing the proposed TRC and related functions is available at https://sites.google.com/site/dhyeonyu/software.
Keywords: Kendall’s τ; mass spectrometry data; test-retest reliability; truncated rank.