Modeling lake conductivity in the contiguous United States using spatial indexing for big spatial data

Michael Dumelle; Jay M Ver Hoef; Amalia Handler; Ryan A Hill; Matt Higham; Anthony R Olsen

doi:10.1016/j.spasta.2023.100808

Spat Stat. 2024 Mar:59:100808. doi: 10.1016/j.spasta.2023.100808.

Authors

Michael Dumelle¹, Jay M Ver Hoef², Amalia Handler¹, Ryan A Hill¹, Matt Higham³, Anthony R Olsen¹

Affiliations

¹ United States Environmental Protection Agency, 200 SW 35th St, Corvallis, OR, USA.
² Marine Mammal Laboratory, Alaska Fisheries Science Center, NOAA Fisheries, Seattle, WA, USA.
³ St. Lawrence University Department of Mathematics, Computer Science, and Statistics, Canton, NY, USA.

PMID: 39758934
PMCID: PMC11694821 (available on 2025-03-01)
DOI: 10.1016/j.spasta.2023.100808

Abstract

Conductivity is an important indicator of the health of aquatic ecosystems. We model large amounts of lake conductivity data collected as part of the United States Environmental Protection Agency's National Lakes Assessment using spatial indexing, a flexible and efficient approach to fitting spatial statistical models to big data sets. Spatial indexing is capable of accommodating various spatial covariance structures as well as features like random effects, geometric anisotropy, partition factors, and non-Euclidean topologies. We use spatial indexing to compare lake conductivity models and show that calcium oxide rock content, crop production, human development, precipitation, and temperature are strongly related to lake conductivity. We use this model to predict lake conductivity at hundreds of thousands of lakes distributed throughout the contiguous United States. We find that lake conductivity models fit using spatial indexing are nearly identical to lake conductivity models fit using traditional methods but are nearly 50 times faster (sample size 3,311). Spatial indexing is readily available in the spmodel R package.
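
The abstract does not spell out how spatial indexing gains its speed. As a rough illustration only, the Python sketch below assumes one common form of the idea: each observation is assigned an integer index (group), groups are treated as independent, and the Gaussian log-likelihood becomes a sum of small per-group terms, so no full n-by-n covariance matrix is ever factorized. The function names, the exponential covariance, and the parameter names (sigma2, range_, nugget) are assumptions made for this sketch; it is not the spmodel implementation, and it uses a plain likelihood rather than restricted maximum likelihood.

```python
import math


def exp_cov(coords, sigma2, range_, nugget):
    """Exponential covariance matrix with a nugget added on the diagonal."""
    n = len(coords)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(coords[i], coords[j])
            cov[i][j] = sigma2 * math.exp(-d / range_)
            if i == j:
                cov[i][j] += nugget
    return cov


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def gaussian_loglik(resid, cov):
    """Log-density of a zero-mean multivariate normal evaluated at resid."""
    n = len(resid)
    low = cholesky(cov)
    # Forward substitution: solve low @ z = resid.
    z = []
    for i in range(n):
        z.append((resid[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)


def indexed_loglik(coords, resid, index, sigma2, range_, nugget):
    """Sum of per-group log-likelihoods (groups assumed independent)."""
    total = 0.0
    for g in sorted(set(index)):
        rows = [i for i, idx in enumerate(index) if idx == g]
        sub_coords = [coords[i] for i in rows]
        sub_resid = [resid[i] for i in rows]
        total += gaussian_loglik(sub_resid, exp_cov(sub_coords, sigma2, range_, nugget))
    return total
```

Under these assumptions, a call such as indexed_loglik(coords, resid, index, sigma2=1.0, range_=2.0, nugget=0.1) factorizes only one small matrix per group. With groups of roughly equal size m, the cost falls from order n^3 to order n * m^2, which is the kind of saving that makes a large speedup plausible when group sizes are kept small.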

Keywords: Model selection; Prediction (Kriging); Restricted maximum likelihood estimation; Salinization; Spatial correlation; United States National Aquatic Resource Surveys.

Grants and funding

EPA999999/ImEPA/Intramural EPA/United States