End-stage renal disease (ESRD) is a highly heterogeneous disease, with significant differences in prevalence, mortality, complications, and treatment modalities across age, sex, race, and ethnicity. A data-driven phenotypic classification strategy that identifies patient subtypes and exposes their clinical traits can improve understanding of the disease's characteristics. This study used topic models and process mining techniques to subtype ESRD patients on hemodialysis, based on real-world longitudinal electronic health record data. The mined subtypes are interpretable and clinically significant, and they reflect differences in disease progression and clinical outcomes.
Keywords: disease subtype; end-stage renal disease; phenotype mining; process mining; topic model.