A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications

Brief Bioinform. 2019 Jan 18;20(1):178-189. doi: 10.1093/bib/bbx101.

Abstract

Rank aggregation (RA), the process of combining multiple ranked lists into a single ranking, has played an important role in integrating information from individual genomic studies that address the same biological question. Previous research has focused on aggregating full lists. However, partial and/or top ranked lists are prevalent because of the great heterogeneity of genomic studies and the limited resources for follow-up investigation. Some ad hoc adjustments have been suggested for handling such lists, but how RA methods perform on them after the adjustments has never been fully evaluated. In this article, a systematic framework is proposed to define the different situations that may occur based on the nature of the individual ranked lists. A comprehensive simulation study is conducted to examine the performance characteristics of a collection of existing RA methods suitable for genomic applications, under various settings simulated to mimic practical situations. A non-small cell lung cancer data example is provided for further comparison. Based on our numerical results, general guidelines are provided about which methods perform best or worst, and under what conditions. We also discuss key factors that substantially affect the performance of the different methods.
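To make the idea of aggregating top ranked lists concrete, the following is a minimal sketch of a Borda-style mean-rank rule. It is not taken from the article and is not claimed to be one of the methods it evaluates. The function name `borda_aggregate`, the gene labels, and the tie-handling rule are all illustrative assumptions: items missing from a list are assumed to fall below its reported items and share the average of the unassigned ranks, which is one possible ad hoc adjustment and suits top ranked lists better than partial lists built from different gene sets.

```python
from collections import defaultdict


def borda_aggregate(lists, universe):
    """Aggregate ranked lists by mean rank (a Borda-style rule).

    Assumption: any item of `universe` missing from a list is treated as
    tied below all reported items and receives the average of the ranks
    that list left unassigned.
    """
    n = len(universe)
    totals = defaultdict(float)
    for ranked in lists:
        k = len(ranked)
        # Rank 1 is the top of the list.
        position = {item: r for r, item in enumerate(ranked, start=1)}
        # Average of the unassigned ranks k+1, ..., n.
        tied_rank = (k + 1 + n) / 2
        for item in universe:
            totals[item] += position.get(item, tied_rank)
    mean_rank = {item: totals[item] / len(lists) for item in universe}
    # Smaller mean rank is better; ties are broken alphabetically.
    return sorted(universe, key=lambda item: (mean_rank[item], item))


# Hypothetical gene labels: two top ranked lists and one full list.
universe = ["A", "B", "C", "D", "E"]
studies = [["A", "B", "C"], ["B", "A"], ["A", "C", "B", "D", "E"]]
result = borda_aggregate(studies, universe)
print(result)
```

The sketch takes the full set of candidate items as an explicit argument, because the rank assigned to unreported items depends on how many items each study could have ranked.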

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Carcinoma, Non-Small-Cell Lung / genetics
  • Computational Biology / methods*
  • Computer Simulation
  • Data Interpretation, Statistical
  • Databases, Genetic / statistics & numerical data
  • Genomics / statistics & numerical data*
  • Humans
  • Lung Neoplasms / genetics
  • Markov Chains
  • Models, Statistical
  • Software