Machine learning techniques for the prediction of indoor gamma-ray dose rates - Strengths, weaknesses and implications for epidemiology

J Environ Radioact. 2024 Dec 27:282:107595. doi: 10.1016/j.jenvrad.2024.107595. Online ahead of print.

Abstract

We investigate methods for improving the estimation of indoor gamma-ray dose rates at locations where no measurements have been made. The new predictions draw on a wider range of modelling techniques and a larger variety of explanatory variables than our previous examinations of this subject. Specifically, we now employ three types of machine learning model in addition to the geostatistical, nearest-neighbour and other models used earlier. A large number of parameters, mostly describing the characteristics of dwellings in the area in question, have been added to the set of explanatory variables. The machine learning methods give significantly better predictions than the earlier models. The machine learning models are noisy, and the relative importance of particular explanatory variables is somewhat unstable, although general and consistent tendencies support the importance of certain classes of variable. However, the range of predicted indoor gamma-ray dose rates is much smaller than that of the measurements. It is probable that epidemiological studies using such predictions will have lower statistical power than those based on direct measurements.
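
The narrowing of the predicted range relative to the measurements can be illustrated with a minimal sketch. It is not the authors' method: an ordinary least-squares line stands in for the machine learning models, the data are synthetic and in arbitrary units, and every name and parameter value (`fit_line`, `simulate`, the single explanatory variable, the noise level) is a hypothetical choice made for illustration. The point it shows is general: when a model explains only part of the variation, its predictions are less spread out than the values it was fitted to.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def simulate(n=2000, seed=0):
    """Return synthetic (measured, predicted) values in arbitrary units."""
    rng = random.Random(seed)
    # Hypothetical explanatory variable, e.g. a standardised dwelling score.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Synthetic "measurements": a predictable signal plus variation
    # that the explanatory variable cannot account for.
    measured = [100.0 + 10.0 * xi + rng.gauss(0.0, 20.0) for xi in x]
    a, b = fit_line(x, measured)
    predicted = [a + b * xi for xi in x]
    return measured, predicted


measured, predicted = simulate()
sd_measured = statistics.pstdev(measured)
sd_predicted = statistics.pstdev(predicted)
print(f"SD of measurements: {sd_measured:.1f}")
print(f"SD of predictions:  {sd_predicted:.1f}")
```

In this sketch the predictions keep the same mean as the measurements but have a smaller standard deviation, so a study that used them as exposure estimates would have less exposure contrast to work with.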

Keywords: Gamma radiation; Geostatistics; Machine learning; Natural background; Neural networks; Random forests.