Monitoring the quantity and quality of karst springs is essential for groundwater resource management. However, robust forecasting of karst spring discharge and pollutant concentration is challenging because karst aquifers are highly complex and heterogeneous. Few researchers have addressed the long-term prediction of hourly spring quantity and quality, which is crucial for emergency management. Here, we develop an ensemble model that uses long short-term memory (LSTM) and iTransformer models as base models, with a random forest as the meta-model that combines them. Experiments were conducted on hourly spring discharge and pollutant concentration predictions at the Xianrendong Spring, Guizhou, China, using a dataset comprising 2106 h of precipitation from four stations, spring discharge, and petroleum substance concentrations. The results indicate that the LSTM model captures short-term dependencies but struggles with long-term variations, while the iTransformer quickly captures complex patterns but tends to overfit. By combining the strengths of LSTM and iTransformer, the ensemble model balances stability and sensitivity, reduces the bias and variance of the individual models, and improves overall prediction accuracy. The ensemble model consistently outperforms both LSTM and iTransformer across all time steps (24, 36, 48, and 60 h) and at longer lead times (6-10 h). Robust prediction at long lead times allows the ensemble model to support effective mitigation of the hazard caused by petroleum substance leakage.
Keywords: Discharge; Ensemble; Karst spring; Long short-term memory (LSTM); Pollutant concentration; Transformer.
Copyright © 2025. Published by Elsevier Ltd.
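
The stacking idea described in the abstract, in which two base models feed a random forest meta-model, can be illustrated with a minimal sketch. It is not the authors' implementation. The series is synthetic rather than the Xianrendong data, and the two "base model" prediction series are hypothetical noisy stand-ins rather than trained LSTM or iTransformer networks. The meta-model is a deliberately tiny pure-Python random forest (bootstrapped shallow regression trees with one random feature per split), and all function names are illustrative.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, depth, rng):
    """Fit a shallow regression tree, trying one random feature per split."""
    if depth == 0 or len(set(targets)) == 1:
        return float(mean(targets))                 # leaf: average target
    feat = rng.randrange(len(rows[0]))
    best = None
    # Thresholds skip the smallest value, so both sides are never empty.
    for thr in sorted({r[feat] for r in rows})[1:]:
        left = [t for r, t in zip(rows, targets) if r[feat] < thr]
        right = [t for r, t in zip(rows, targets) if r[feat] >= thr]
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, thr)
    if best is None:                                # feature is constant here
        return float(mean(targets))
    thr = best[1]
    lo = [(r, t) for r, t in zip(rows, targets) if r[feat] < thr]
    hi = [(r, t) for r, t in zip(rows, targets) if r[feat] >= thr]
    return (feat, thr,
            build_tree([r for r, _ in lo], [t for _, t in lo], depth - 1, rng),
            build_tree([r for r, _ in hi], [t for _, t in hi], depth - 1, rng))


def predict_tree(node, row):
    while isinstance(node, tuple):                  # descend until a leaf
        feat, thr, left, right = node
        node = left if row[feat] < thr else right
    return node


def fit_forest(rows, targets, n_trees=15, depth=3, seed=0):
    """Average of trees, each fitted to a bootstrap resample of the data."""
    tree_rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [tree_rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], depth, tree_rng))
    return forest


def predict_forest(forest, row):
    return mean([predict_tree(tree, row) for tree in forest])


def rmse(pred, obs):
    return math.sqrt(mean([(p - o) ** 2 for p, o in zip(pred, obs)]))


rng = random.Random(42)
# Synthetic hourly "observations" (assumption: NOT the study's data).
truth = [10 + 3 * math.sin(h / 12) for h in range(200)]
# Hypothetical stand-ins for the two base models' predictions.
base_a = [y - 0.8 + rng.gauss(0, 0.1) for y in truth]   # biased, low variance
base_b = [y + rng.gauss(0, 0.6) for y in truth]         # unbiased, noisy

# Meta-features: one row per hour holding both base predictions.
features = [list(pair) for pair in zip(base_a, base_b)]
split = 150
forest = fit_forest(features[:split], truth[:split])
ensemble_pred = [predict_forest(forest, row) for row in features[split:]]

print("base A RMSE:  ", round(rmse(base_a[split:], truth[split:]), 3))
print("base B RMSE:  ", round(rmse(base_b[split:], truth[split:]), 3))
print("ensemble RMSE:", round(rmse(ensemble_pred, truth[split:]), 3))
```

In practice, the meta-model would normally be trained on base-model predictions for data the base models did not see during their own training, so that it learns how to weight them without inheriting their overfitting. A library implementation of the random forest would replace the toy one used here.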