Ensemble variant interpretation methods to predict enzyme activity and assign pathogenicity in the CAGI4 NAGLU (Human N-acetyl-glucosaminidase) and UBE2I (Human SUMO-ligase) challenges

Hum Mutat. 2017 Sep;38(9):1109-1122. doi: 10.1002/humu.23267. Epub 2017 Jun 27.

Abstract

CAGI (Critical Assessment of Genome Interpretation) conducts community experiments to determine the state of the art in relating genotype to phenotype. Here, we report results obtained using newly developed ensemble methods to address two CAGI4 challenges: enzyme activity for population missense variants found in NAGLU (Human N-acetyl-glucosaminidase) and for random missense mutations in Human UBE2I (SUMO E2 ligase), assayed in a high-throughput competitive yeast complementation procedure. The ensemble methods are effective, ranking second for SUMO-ligase and third for NAGLU according to the CAGI independent assessors. However, in common with other methods used in CAGI, they show large discrepancies between predicted and experimental activities for a subset of variants. Analysis of the structural context provides some insight into these discrepancies. Post-challenge analysis shows that the ensemble methods are also effective at assigning pathogenicity for the NAGLU variants. In the clinic, providing an estimate of the reliability of pathogenic assignments is key. We have also used the NAGLU dataset to show that ensemble methods have considerable potential for this task, and are already reliable enough for use with a subset of mutations.
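
The abstract does not describe how the ensemble is built or how reliability is estimated, so the following is only a generic illustration of the idea and not the authors' method. It assumes each underlying predictor returns a relative enzyme activity between 0 and 1, averages those scores, and treats the spread between predictors as a rough reliability proxy. The function name, the 0.5 activity cutoff, and the 0.15 spread cutoff are all hypothetical.

```python
from statistics import mean, pstdev


def ensemble_predict(scores, activity_cutoff=0.5, max_spread=0.15):
    """Combine per-method activity predictions for one variant.

    scores: dict mapping a (hypothetical) method name to a predicted
            relative enzyme activity in [0, 1].
    The cutoffs are illustrative assumptions, not values from the paper.
    """
    if not scores:
        raise ValueError("at least one method score is required")

    values = list(scores.values())
    activity = mean(values)   # ensemble estimate: simple average
    spread = pstdev(values)   # disagreement between methods

    return {
        "activity": activity,
        # Low predicted activity is read as likely pathogenic.
        "pathogenic": activity < activity_cutoff,
        "spread": spread,
        # Call is flagged reliable only when the methods agree closely.
        "reliable": spread <= max_spread,
    }


if __name__ == "__main__":
    # Made-up scores for two hypothetical variants.
    print(ensemble_predict({"m1": 0.10, "m2": 0.20, "m3": 0.15}))
    print(ensemble_predict({"m1": 0.10, "m2": 0.90, "m3": 0.80}))
```

In this sketch, a variant on which the methods agree is reported with a reliable flag, while one with widely scattered scores is not, which mirrors the idea that assignments may be usable for only a subset of mutations.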

Keywords: CAGI; NAGLU; SUMO-ligase; ensemble methods; missense mutations; monogenic disease.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Acetylglucosaminidase / genetics*
  • Computational Biology / methods*
  • Databases, Genetic
  • Humans
  • Machine Learning
  • Mutation, Missense*
  • Phenotype
  • ROC Curve
  • Reproducibility of Results
  • Ubiquitin-Conjugating Enzymes / genetics*

Substances

  • Ubiquitin-Conjugating Enzymes
  • alpha-N-acetyl-D-glucosaminidase
  • Acetylglucosaminidase
  • ubiquitin-conjugating enzyme UBC9