Performance of in silico tools for the evaluation of p16INK4a (CDKN2A) variants in CAGI

Hum Mutat. 2017 Sep;38(9):1042-1050. doi: 10.1002/humu.23235. Epub 2017 May 16.

Abstract

Correct phenotypic interpretation of variants of unknown significance in cancer-associated genes is a diagnostic challenge as genetic screening gains popularity in the next-generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype-phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants of the p16INK4a tumor suppressor, a cyclin-dependent kinase inhibitor encoded by the CDKN2A gene. Twenty-two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test-set size, our findings show a variety of results, with some methods performing significantly better than others. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.
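
The abstract does not say how the individual assessment measures were merged into the overall ranking. The following is a minimal Python sketch of one common option, averaging each predictor's per-measure rank; it is not the authors' R implementation. The measure names (`pearson`, `auc`, `neg_rmse`), the predictor labels, and all numbers are hypothetical placeholders, and the sketch assumes every measure is oriented so that higher is better.

```python
from statistics import mean


def rank_descending(values):
    """Return 1-based ranks (1 = highest value); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend the window over a run of tied values.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def overall_ranking(scores):
    """Combine several measures by average rank (an assumed combination rule).

    scores: {predictor: {measure: value}}, with higher values meaning better.
    Returns (predictor, average_rank) pairs sorted from best to worst.
    """
    predictors = list(scores)
    measures = list(next(iter(scores.values())))
    per_measure = {
        m: rank_descending([scores[p][m] for p in predictors]) for m in measures
    }
    avg_rank = {
        p: mean(per_measure[m][i] for m in measures)
        for i, p in enumerate(predictors)
    }
    return sorted(avg_rank.items(), key=lambda item: item[1])


# Made-up illustrative numbers, not results from the p16INK4a challenge.
toy_scores = {
    "predictor_A": {"pearson": 0.60, "auc": 0.80, "neg_rmse": -0.20},
    "predictor_B": {"pearson": 0.40, "auc": 0.85, "neg_rmse": -0.30},
    "predictor_C": {"pearson": 0.10, "auc": 0.55, "neg_rmse": -0.45},
}
ranking = overall_ranking(toy_scores)
```

Averaging ranks rather than raw scores keeps measures on different scales from dominating one another, which is one reason a combined ranking can be more robust than any single measure, particularly on a small test set.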

Keywords: CAGI experiment; bioinformatics tools; cancer; pathogenicity predictors; variant interpretation.

MeSH terms

  • Cell Line, Tumor
  • Cell Proliferation
  • Computational Biology / methods*
  • Computer Simulation
  • Cyclin-Dependent Kinase Inhibitor p16
  • Cyclin-Dependent Kinase Inhibitor p18 / chemistry
  • Cyclin-Dependent Kinase Inhibitor p18 / genetics*
  • Databases, Genetic
  • Genetic Predisposition to Disease
  • Genetic Variation*
  • Humans
  • Machine Learning
  • Protein Stability

Substances

  • CDKN2A protein, human
  • Cyclin-Dependent Kinase Inhibitor p16
  • Cyclin-Dependent Kinase Inhibitor p18