Background: It is unclear whether data-driven machine learning models, which are trained on large epidemiological cohorts, may improve prediction of comorbidities in people living with human immunodeficiency virus (HIV).
Methods: In this proof-of-concept study, we included people living with HIV in the prospective Swiss HIV Cohort Study with a first estimated glomerular filtration rate (eGFR) >60 mL/minute/1.73 m2 after 1 January 2002. Our primary outcome was chronic kidney disease (CKD)-defined as confirmed decrease in eGFR ≤60 mL/minute/1.73 m2 over 3 months apart. We split the cohort data into a training set (80%), validation set (10%), and test set (10%), stratified for CKD status and follow-up length.
Results: Of 12 761 eligible individuals (median baseline eGFR, 103 mL/minute/1.73 m2), 1192 (9%) developed a CKD after a median of 8 years. We used 64 static and 502 time-changing variables: Across prediction horizons and algorithms and in contrast to expert-based standard models, most machine learning models achieved state-of-the-art predictive performances with areas under the receiver operating characteristic curve and precision recall curve ranging from 0.926 to 0.996 and from 0.631 to 0.956, respectively.
Conclusions: In people living with HIV, we observed state-of-the-art performances in forecasting individual CKD onsets with different machine learning algorithms.
Keywords: HIV; chronic kidney disease; digital epidemiology; machine learning; prediction.
© The Author(s) 2020. Published by Oxford University Press for the Infectious Diseases Society of America.