Background: Most methods for treating left-censored data assume the analyte is present but not quantified. If the analyte is in fact absent from some samples, the unobserved data represent a mixed exposure distribution with an unknown proportion clustered at zero, and estimates from these methods may be biased.
Objective: We used semi-continuous models to identify time and industry trends in 52,457 OSHA inspection lead sample results.
Method: The first component of the semi-continuous model predicted the probability of detecting concentrations ≥ 0.007 mg/m³ (highest estimated detection limit, 62% of measurements). The second component predicted the median concentration of measurements ≥ 0.007 mg/m³. Both components included a random effect for industry and fixed effects for year, industry group, analytical method, and other variables. We used the two components together to predict median industry- and time-specific lead concentrations.
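The two-component structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a logit link for the detection component, a lognormal distribution for detected concentrations (ignoring truncation at the threshold), and one possible rule for combining the two components into an overall median. All function names and parameters are hypothetical.

```python
import math
from statistics import NormalDist

THRESHOLD_MG_M3 = 0.007  # detection threshold quoted in the text


def detection_probability(linear_predictor):
    """Component 1 (assumed logit link): probability of a result >= threshold."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def detected_median(log_scale_predictor):
    """Component 2 (assumed lognormal): median of detected concentrations."""
    return math.exp(log_scale_predictor)


def mixture_quantile(p_detect, log_mean, log_sd, q=0.5):
    """Quantile q of a two-part mixture (an assumed combination rule).

    A mass of (1 - p_detect) lies below the threshold; the remaining
    p_detect follows a lognormal with the given log-scale mean and SD.
    Returns None when the requested quantile falls in the non-detected
    mass, because the model assumes no distribution there.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    p_below = 1.0 - p_detect
    if q <= p_below:
        return None
    # Rescale q to a quantile within the detected component.
    q_within = (q - p_below) / p_detect
    z = NormalDist().inv_cdf(q_within)
    return math.exp(log_mean + log_sd * z)
```

Under these assumptions, the overall median can be quantified only when the detection probability exceeds 0.5. For example, `mixture_quantile(0.38, math.log(0.02), 1.0)` returns `None`, where the inputs are illustrative values rather than estimates from the study.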
Results: The probabilities of detectable concentrations and the median detected concentrations decreased over time; both were also lower for measurements analyzed for multiple metals (vs. one) and for those analyzed by inductively coupled plasma (vs. atomic absorption spectroscopy). The covariance between the two components was 0.30 (standard error = 0.06), confirming that they were correlated.
Significance: We identified determinants of exposure in data in which over 60% of measurements were left-censored, while accounting for the correlation between the two model components and without assuming a distribution for the censored data.
Keywords: left-censored data; occupational lead exposure; statistical modeling.
© 2021. This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply.