Sensitive and specific peak detection for SELDI-TOF mass spectrometry using a wavelet/neural-network based approach

PLoS One. 2012;7(11):e48103. doi: 10.1371/journal.pone.0048103. Epub 2012 Nov 12.

Abstract

The SELDI-TOF mass spectrometer's compact size and automated, high-throughput design have been attractive to clinical researchers, and the platform has seen steady use in biomarker studies. Although new algorithms and preprocessing pipelines have been developed to address reproducibility issues, visual inspection of SELDI spectra preprocessed by even the best algorithms still shows miscalled peaks and systematic sources of error. This suggests that problems with SELDI preprocessing persist. In this work, we study the preprocessing of SELDI spectra in detail and introduce improvements. While many algorithms, including the vendor-supplied software, can identify peak clusters of specific mass (or m/z) in groups of spectra with high specificity and a low false discovery rate (FDR), they tend to underperform when estimating the exact prevalence and intensity of peaks in those clusters. Thus group differences that at first appear very strong are shown, after careful and laborious hand inspection of the spectra, to be less than significant. Here we introduce a wavelet/neural-network-based algorithm that mimics what a team of expert human users would call as peaks in each of the several hundred spectra of a typical SELDI clinical study. The wavelet denoising part of the algorithm optimally smooths the signal in each spectrum according to an improved suite of previously reported signal processing algorithms (the LibSELDI toolbox, under development). The neural network part of the algorithm combines those results with the raw signal and a training dataset of expertly called peaks to call peaks in a test set of spectra with approximately 95% accuracy. The new method was applied to data collected from a study of cervical mucus for the early detection of cervical cancer in HPV-infected women. The method shows promise in addressing the ongoing SELDI reproducibility issues.
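The two-stage idea described in the abstract (wavelet denoising, then a learned peak caller that sees both the smoothed and the raw signal) can be illustrated with a minimal sketch. This is not the LibSELDI implementation, whose details the abstract does not give. It assumes a plain Haar transform with the standard universal soft threshold as a stand-in for the denoising stage, and it only builds the (raw, smoothed) feature pairs that a classifier trained on expert-called peaks might consume. All function names are hypothetical.

```python
import math
import random


def haar_forward(signal):
    """Full Haar decomposition. len(signal) must be a power of two (>= 2)."""
    approx = list(signal)
    details = []  # details[0] holds the finest-scale coefficients
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest scale to the finest."""
    approx = list(approx)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(approx, level):
            rebuilt.append((a + d) / math.sqrt(2))
            rebuilt.append((a - d) / math.sqrt(2))
        approx = rebuilt
    return approx


def soft(x, t):
    """Soft thresholding: shrink x toward zero by t."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def denoise(signal):
    """Assumed stand-in for the denoising stage: universal soft threshold."""
    approx, details = haar_forward(signal)
    # Noise level estimated from the median absolute finest-scale coefficient.
    finest = sorted(abs(d) for d in details[0])
    sigma = finest[len(finest) // 2] / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(signal)))
    shrunk = [[soft(d, t) for d in level] for level in details]
    return haar_inverse(approx, shrunk)


def candidate_features(raw, smooth):
    """Return (index, raw intensity, smoothed intensity) at local maxima of
    the smoothed signal. These are only candidates: in the approach described
    above, a network trained on expert-called peaks would accept or reject
    each one."""
    feats = []
    for i in range(1, len(smooth) - 1):
        if smooth[i] > smooth[i - 1] and smooth[i] >= smooth[i + 1]:
            feats.append((i, raw[i], smooth[i]))
    return feats


# Synthetic spectrum (illustrative only): one Gaussian peak plus noise.
random.seed(0)
clean = [10.0 * math.exp(-((i - 64) ** 2) / 32.0) for i in range(128)]
noisy = [c + random.gauss(0.0, 1.0) for c in clean]
features = candidate_features(noisy, denoise(noisy))
```

Each feature tuple would then be paired with an expert label (peak or not a peak) to train the classifier. The choice of Haar wavelets and of the universal threshold is an assumption made for brevity, and a Haar reconstruction is piecewise constant, so the candidate list will usually contain spurious plateaus that the classifier is expected to reject.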

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Algorithms
  • Biomarkers / metabolism
  • Carcinoma in Situ / diagnosis
  • Carcinoma in Situ / epidemiology
  • Cervix Mucus / chemistry
  • Female
  • Humans
  • Middle Aged
  • Neural Networks, Computer*
  • Prevalence
  • Quality Control
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods*
  • Uterine Cervical Neoplasms / diagnosis
  • Uterine Cervical Neoplasms / epidemiology
  • Young Adult

Substances

  • Biomarkers