Novel statistical approaches to identify risk factors for soil-transmitted helminth infection in Timor-Leste

Int J Parasitol. 2021 Aug;51(9):729-739. doi: 10.1016/j.ijpara.2021.01.005. Epub 2021 Mar 31.

Abstract

Soil-transmitted helminths (STHs) are parasitic intestinal worms that infect almost a fifth of the global population. Sustainable control of STHs requires understanding the complex interaction of factors contributing to transmission. Identifying risk factors has mainly relied on logistic regression models where the underlying assumption of independence between variables is not always satisfied. Previously demonstrated risk factors including water, sanitation and hygiene (WASH) access and behaviours, and socioeconomic status are intrinsically linked. Similarly, environmental factors including climate, soil and land attributes are often strongly correlated. Alternative methods such as recursive partitioning and Bayesian networks can handle correlated variables, but there are no published studies comparing these methods with logistic regression in the context of STH risk factor analysis. Baseline cross-sectional data from school-aged children in the (S)WASH-D for Worms study were used to compare risk factors identified from modelling the same data using three different statistical techniques. Outcomes of interest were infection with Ascaris spp. and any hookworm species (Necator americanus, Ancylostoma duodenale, and Ancylostoma ceylanicum). Mixed-effects logistic regression identified the fewest risk factors. Recursive partitioning identified the most WASH and demographic risk factors, while Bayesian networks identified the most environmental risk factors. Recursive partitioning produced classification trees that visualised potentially at-risk population sub-groups. Bayesian networks helped visualise relationships between variables and enabled interactive modelling of outcomes based on different scenarios for the predictor variables of interest. Model performance was similar across all techniques. Risk factors identified across all techniques were vegetation for Ascaris spp., and cleaning oneself with water after defecating for hookworm. 
This study adds to the limited body of evidence exploring alternative data modelling approaches in identifying risk factors for STH infections. Our findings suggest these approaches can provide novel insights for more robust interpretation.
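
The idea behind recursive partitioning, as described above, can be illustrated with a minimal sketch. This is not the study's implementation or data: the records are synthetic, the variable names (`washes_with_water`, `wears_shoes`, `infected`) are hypothetical, and the Gini impurity criterion is an assumed splitting rule chosen for simplicity. The sketch only shows how repeatedly splitting on the most informative binary predictor yields sub-groups with different infection prevalence.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, features, target):
    """Return (feature, weighted impurity) for the binary feature whose split
    most reduces Gini impurity, or (None, parent impurity) if none improves it."""
    best_feature = None
    best_score = gini([r[target] for r in rows])
    for f in features:
        left = [r[target] for r in rows if r[f] == 0]
        right = [r[target] for r in rows if r[f] == 1]
        if not left or not right:
            continue  # a split must put at least one record on each side
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_feature, best_score = f, score
    return best_feature, best_score


def build_tree(rows, features, target, max_depth=2):
    """Recursively partition rows; each leaf reports sub-group size and prevalence."""
    feature, _ = best_split(rows, features, target) if max_depth > 0 else (None, None)
    if feature is None:
        positives = sum(r[target] for r in rows)
        return {"n": len(rows), "prevalence": positives / len(rows)}
    remaining = [f for f in features if f != feature]
    return {
        "split_on": feature,
        0: build_tree([r for r in rows if r[feature] == 0], remaining, target, max_depth - 1),
        1: build_tree([r for r in rows if r[feature] == 1], remaining, target, max_depth - 1),
    }


def make_group(washes, shoes, n, infected):
    """Synthetic records (hypothetical): n children, the first `infected` are positive."""
    return [
        {"washes_with_water": washes, "wears_shoes": shoes, "infected": 1 if i < infected else 0}
        for i in range(n)
    ]


# Entirely made-up counts, chosen only to make the splits easy to follow.
rows = (
    make_group(1, 1, 10, 1)
    + make_group(1, 0, 10, 2)
    + make_group(0, 1, 10, 5)
    + make_group(0, 0, 10, 8)
)

tree = build_tree(rows, ["washes_with_water", "wears_shoes"], "infected")
print(tree)
```

In this toy data the first split falls on `washes_with_water`, and each branch is then split on `wears_shoes`, so every leaf corresponds to one sub-group with its own size and prevalence. Real analyses would typically rely on an established library with pruning and validation, and would handle continuous and multi-level predictors.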

Keywords: Bayesian networks; Logistic regression; Recursive partitioning; Risk factors; Sanitation and hygiene; Soil-transmitted helminths; Water.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bayes Theorem
  • Cross-Sectional Studies
  • Feces
  • Helminthiasis* / epidemiology
  • Necator americanus
  • Prevalence
  • Risk Factors
  • Sanitation
  • Soil*
  • Timor-Leste / epidemiology

Substances

  • Soil