Integrating Remote Sensing and Soil Features for Enhanced Machine Learning-Based Corn Yield Prediction in the Southern US

Sensors (Basel). 2025 Jan 18;25(2):543. doi: 10.3390/s25020543.

Abstract

Efficient and reliable corn (Zea mays L.) yield prediction is important for varietal selection by plant breeders and management decision-making by growers. Unlike prior studies that focus mainly on county-level or controlled laboratory-scale areas, this study targets a production-scale area, better representing real-world agricultural conditions and offering more practical relevance for farmers. The objectives of our study were to determine the best combination of vegetation indices and abiotic factors for predicting corn yield in a rain-fed, production-scale area, to identify the most suitable corn growth stage for yield estimation using machine learning, and to identify the most effective machine learning model for corn yield estimation. Our study used high-resolution (6 cm) aerial multispectral imagery. Sixty-two different predictors, including soil properties (sand, silt, and clay percentages), slope, spectral bands (red, green, blue, red-edge, NIR), vegetation indices (GNDRE, NDRE, TGI), color-space indices, and wavelengths, were derived from the multispectral data collected at seven growth stages of corn (V4, V5, V6, V7, V9, V12, and V14/VT). Four regression and machine learning algorithms were evaluated for yield prediction: linear regression, random forest, extreme gradient boosting, and gradient boosting regressor. A total of 6865 yield values were used for model training and 1716 for validation. Results show that, using the random forest method, the V14/VT stage gave the best yield predictions (RMSE of 0.52 Mg/ha for a mean yield of 10.19 Mg/ha), and yield estimation at the V6 stage was still feasible. We concluded that integrating abiotic factors, such as slope and soil properties, significantly improved model accuracy. Among vegetation indices, TGI, HUE, and GNDRE performed better than the other indices evaluated. Results from this study can help farmers or crop consultants plan ahead for future logistics through enhanced early-season yield predictions and support farm profitability and sustainability.
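
To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' code. It computes NDRE, TGI, and hue for a single set of band values, and puts the reported figures in context (validation share of the data and RMSE relative to mean yield). Several points are assumptions: NDRE uses its common definition, TGI uses the widely cited simplified form (G - 0.39 R - 0.61 B), and hue is taken from a standard RGB-to-HSV conversion; the abstract does not state the exact formulations used. GNDRE is omitted because the abstract does not define it. The reflectance values in `pixel` are hypothetical, not data from the study.

```python
import colorsys
import math


def ndre(nir, red_edge):
    """Normalized Difference Red Edge index: (NIR - RE) / (NIR + RE)."""
    return (nir - red_edge) / (nir + red_edge)


def tgi(red, green, blue):
    """Simplified Triangular Greenness Index (assumed form): G - 0.39*R - 0.61*B."""
    return green - 0.39 * red - 0.61 * blue


def hue_degrees(red, green, blue):
    """Hue angle in degrees from RGB values scaled to [0, 1] (HSV color space)."""
    h, _, _ = colorsys.rgb_to_hsv(red, green, blue)
    return 360.0 * h


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical reflectance values for one plot (illustration only).
pixel = {"red": 0.05, "green": 0.10, "blue": 0.04, "red_edge": 0.20, "nir": 0.45}

features = {
    "NDRE": ndre(pixel["nir"], pixel["red_edge"]),
    "TGI": tgi(pixel["red"], pixel["green"], pixel["blue"]),
    "HUE": hue_degrees(pixel["red"], pixel["green"], pixel["blue"]),
}

# Figures reported in the abstract.
n_train, n_valid = 6865, 1716
reported_rmse, mean_yield = 0.52, 10.19  # Mg/ha

valid_fraction = n_valid / (n_train + n_valid)  # share of samples held out
relative_rmse = reported_rmse / mean_yield      # RMSE as a fraction of mean yield

print(features)
print(f"validation share: {valid_fraction:.1%}, relative RMSE: {relative_rmse:.1%}")
```

Under these figures, the validation set is roughly one fifth of the 8581 yield values, and the reported RMSE at V14/VT is about 5% of the mean yield.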

Keywords: corn; ensemble methods; machine learning; maize; vegetation indices; yield prediction.

MeSH terms

  • Agriculture / methods
  • Algorithms
  • Machine Learning*
  • Remote Sensing Technology* / methods
  • Soil* / chemistry
  • Zea mays* / growth & development
  • Zea mays* / physiology

Substances

  • Soil