Background: Interpretation of variants of uncertain significance (VUS) remains a challenge in the care of patients with inherited cardiovascular diseases (CVDs); 56% of variants within CVD risk genes are VUS. Machine learning algorithms trained on large data resources can stratify VUS into higher versus lower probability of contributing to a CVD phenotype.
Methods: We used ClinVar pathogenic/likely pathogenic and benign/likely benign variants from 47 CVD genes to build a predictive model of variant pathogenicity utilizing measures of evolutionary constraint, deleteriousness, splicogenicity, local pathogenicity, cardiac-specific expression, and population allele frequency. Performance was validated using variants for which the ClinVar pathogenicity assignment changed. Functional validation was assessed using prior studies in >900 identified VUS. The model utility was demonstrated using the Catheterization Genetics (CATHGEN) cohort.
Results: We identified a top-ranked model that accurately prioritized variants for which ClinVar clinical significance had changed (n=663; precision-recall area under the curve, 0.97) and performed well compared with conventional in silico methods. This model (CVD pathogenicity predictor [CVD-PP]) also had high accuracy in prioritizing VUS with functional effects in vivo (precision-recall area under the curve, 0.58). In CATHGEN, there was a greater burden of VUS with higher CVD-PP scores in individuals with dilated cardiomyopathy compared with controls (P=8.2×10⁻¹⁵). Of individuals in CATHGEN who harbored highly ranked CVD-PP VUS meeting clinical pathogenicity criteria, 27.6% had clinical evidence of disease. Variant prioritization using this model increased genetic diagnosis in CATHGEN participants with a known clinical diagnosis of hypertrophic cardiomyopathy from 7.8% to 27.2%.
Conclusions: We present a cardiac-specific model for prioritizing variants underlying CVD syndromes with high performance in discriminating the pathogenicity of VUS in CVD genes. Variant review and phenotyping of individuals carrying VUS of pathogenic interest support the clinical utility of this model. This model could also have utility in filtering variants as part of large-scale genomic sequencing studies.
Keywords: arrhythmias, cardiac; cardiovascular diseases; human genetics; machine learning; pathology, molecular.