Semiparametric Estimation of the Distribution of Episodically Consumed Foods Measured With Error

J Am Stat Assoc. 2022;117(537):469-481. doi: 10.1080/01621459.2020.1787840. Epub 2020 Aug 19.

Abstract

Dietary data collected from 24-hour dietary recalls are observed with significant measurement error. Much of the nonparametric curve estimation literature has been devoted to designing estimators that remain consistent when the data are contaminated by noise, and these methods have traditionally been applied to such dietary data. However, some foods, such as alcohol or fruits, are consumed only episodically and may not be consumed on the day the 24-hour recall is administered. The resulting excess zeros make existing nonparametric estimators break down, so new techniques are needed for such data. We develop two new consistent semiparametric estimators of the distribution of episodically consumed food data, making parametric assumptions only on some less important parts of the model. We establish their theoretical properties and illustrate the good performance of our fully data-driven methods on simulated and real data. Supplementary materials for this article are available online.
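To make the excess-zeros problem concrete, the following is a minimal toy simulation, not the model or estimators of the article. The function name `simulate_recalls`, the Beta distribution for each person's daily consumption probability, and the log-normal amounts and multiplicative errors are all illustrative assumptions. It shows only that recalls for an episodically consumed food contain a large share of exact zeros, which a classical model of the form W = X + U with continuous error U would produce with probability zero.

```python
import random


def simulate_recalls(n_people=2000, n_recalls=2, seed=0):
    """Toy 24-hour recall data for an episodically consumed food.

    All distributional choices are hypothetical and chosen only for
    illustration; they are not taken from the article.
    """
    rng = random.Random(seed)
    recalls = []
    for _ in range(n_people):
        # Assumed per-person probability of consuming the food on a given day.
        p_consume = rng.betavariate(2.0, 3.0)
        # Assumed usual amount eaten on a consumption day.
        usual_amount = rng.lognormvariate(0.0, 0.5)
        person = []
        for _ in range(n_recalls):
            if rng.random() < p_consume:
                # Consumption day: amount reported with multiplicative error.
                person.append(usual_amount * rng.lognormvariate(0.0, 0.4))
            else:
                # Non-consumption day: the recall records an exact zero.
                person.append(0.0)
        recalls.append(person)
    return recalls


recalls = simulate_recalls()
flat = [y for person in recalls for y in person]
# Share of recalls that are exactly zero (the "excess zeros").
zero_fraction = sum(y == 0.0 for y in flat) / len(flat)
print(f"fraction of exact zeros: {zero_fraction:.3f}")
```

Under these assumed settings, a majority of the simulated recalls are exact zeros, so the observed data have a point mass at zero rather than a smooth density.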

Keywords: Asymptotic theory; Deconvolution; Excess zeros; Measurement error; Nonparametric deconvolution.