To understand mechanistically how therapeutics shape the protein fold, and thereby inform precision management of disease, we developed variation-capture (VarC) mapping. VarC uses Gaussian process regression (GPR)-based machine learning to triangulate sparse sequence variation found in the population, defining the combined pairwise-residue interactions that contribute to dynamic protein function in the individual in response to therapeutics. Using VarC mapping, we reveal the pairwise-residue covariant relationships across the entire protein fold of the cystic fibrosis (CF) transmembrane conductance regulator (CFTR) to define the molecular mechanisms of clinically approved CF chemical modulators. We discover an energetically destabilized covariant core containing a di-acidic YKDAD endoplasmic reticulum (ER) exit code that is only weakly corrected by current therapeutics. Our results illustrate that VarC provides a generalizable tool for triangulating information from genetic variation in the population to mechanistically discover therapeutic strategies that guide precision management of the individual.
Keywords: Gaussian-process-regression-based machine learning; cystic fibrosis modulators; genetic variation; membrane trafficking; precision medicine; protein folding and structure; protein misfolding disease; protein thermodynamics; structural biology; therapeutics.
Copyright © 2022 Elsevier Ltd. All rights reserved.