Objective: This study aimed to identify patients with persistent somatic symptoms (PSS) in primary care at an early stage by exploring approaches based on routine care data.
Design/setting: A cohort study for predictive modelling, based on routine primary care data from 76 general practices in the Netherlands.
Participants: A total of 94 440 adult patients were included, based on at least 7 years of general practice enrolment, more than one symptom/disease registration and more than 10 consultations.
Methods: Cases were selected based on the first PSS registration in 2017-2018. Candidate predictors were selected from the 2-5 years prior to the PSS registration and categorised into data-driven approaches (symptoms/diseases, medications, referrals, sequential patterns and changing lab results) and theory-driven approaches (constructed factors based on literature and terminology in free text). From these, 12 candidate predictor categories were formed and used to develop prediction models with cross-validated least absolute shrinkage and selection operator (LASSO) regression on 80% of the dataset. The derived models were internally validated on the remaining 20% of the dataset.
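The modelling workflow described in the Methods (penalised regression tuned by cross-validation on an 80% development set, then checked on a 20% hold-out set) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, all function and variable names are hypothetical, the penalty grid is arbitrary, and held-out log loss is assumed as the cross-validation criterion because the abstract does not state which criterion was used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink towards zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict(x, w, b):
    """Predicted probability of being a case for one patient."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=200):
    """L1-penalised logistic regression fitted by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = (predict(xi, w, b) - yi) / n
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= step * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def log_loss(X, y, w, b):
    """Mean negative log-likelihood; lower is better."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        prob = min(max(predict(xi, w, b), eps), 1.0 - eps)
        total -= yi * math.log(prob) + (1 - yi) * math.log(1.0 - prob)
    return total / len(y)


def choose_lambda(X, y, lambdas, k=4):
    """Return the penalty with the lowest mean held-out log loss over k folds."""
    folds = [list(range(i, len(X), k)) for i in range(k)]
    best_lam, best_loss = None, float("inf")
    for lam in lambdas:
        fold_losses = []
        for fold in folds:
            held = set(fold)
            X_tr = [X[i] for i in range(len(X)) if i not in held]
            y_tr = [y[i] for i in range(len(X)) if i not in held]
            w, b = fit_lasso_logistic(X_tr, y_tr, lam)
            fold_losses.append(
                log_loss([X[i] for i in fold], [y[i] for i in fold], w, b))
        mean_loss = sum(fold_losses) / k
        if mean_loss < best_loss:
            best_lam, best_loss = lam, mean_loss
    return best_lam


# Synthetic stand-in for candidate predictors (NOT study data):
# only the first two of six predictors carry signal.
random.seed(0)
n, p = 200, 6
true_w = [1.5, -1.0, 0.0, 0.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [1 if random.random() < predict(xi, true_w, 0.0) else 0 for xi in X]

# 80% development set, 20% internal validation set.
split = int(0.8 * n)
X_dev, y_dev = X[:split], y[:split]
X_val, y_val = X[split:], y[split:]

lambdas = [0.01, 0.05, 0.2]  # arbitrary illustrative grid
best_lam = choose_lambda(X_dev, y_dev, lambdas)
w, b = fit_lasso_logistic(X_dev, y_dev, best_lam)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
val_loss = log_loss(X_val, y_val, w, b)
print("chosen penalty:", best_lam)
print("selected predictor indices:", selected)
print("validation log loss:", round(val_loss, 3))
```

The L1 penalty sets some coefficients exactly to zero, which is why LASSO serves as a predictor selection step as well as a fitting step. In practice an established library implementation would normally be preferred over a hand-written optimiser such as this one.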
Results: All models had comparable predictive performance (area under the receiver operating characteristic curve 0.70 to 0.72). Predictors were related to genital complaints, specific symptoms (eg, digestive, fatigue and mood), healthcare utilisation and number of complaints. The most fruitful predictor categories were literature-based factors and medications. Predictors often had overlapping constructs, such as digestive symptoms (symptom/disease codes) and drugs for constipation (medication codes), indicating that registration is inconsistent between general practitioners (GPs).
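For readers less familiar with the reported metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, where 0.5 corresponds to chance and 1.0 to perfect discrimination. The sketch below computes it directly from that pairwise definition. The function name `roc_auc` and the toy labels and scores are hypothetical and are not study data.

```python
def roc_auc(y_true, scores):
    """AUC as the share of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, t in zip(scores, y_true) if t == 1]
    non_cases = [s for s, t in zip(scores, y_true) if t == 0]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one case and one non-case")
    correct = 0.0
    for s_case in cases:
        for s_non in non_cases:
            if s_case > s_non:
                correct += 1.0
            elif s_case == s_non:
                correct += 0.5
    return correct / (len(cases) * len(non_cases))


# Toy example: 2 non-cases and 2 cases, so 4 pairs; 3 are ranked correctly.
labels = [0, 0, 1, 1]
risks = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, risks))  # 0.75
```

On this reading, an AUC of 0.70 to 0.72 means that in roughly 7 out of 10 case/non-case pairs the model assigns the higher risk to the case.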
Conclusions: The findings indicate low to moderate diagnostic accuracy for early identification of PSS based on routine primary care data. Nonetheless, simple clinical decision rules based on structured symptom/disease or medication codes may be an efficient way to support GPs in identifying patients at risk of PSS. Fully data-based prediction currently appears to be hampered by inconsistent and missing registrations. Future research on predictive modelling of PSS using routine care data should focus on data enrichment or free-text mining to overcome inconsistent registrations and improve predictive accuracy.
Keywords: MENTAL HEALTH; PRIMARY CARE; STATISTICS & RESEARCH METHODS.
© Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.