Boosting any learning algorithm with Statistically Enhanced Learning

Sci Rep. 2025 Jan 10;15(1):1605. doi: 10.1038/s41598-024-84702-8.

Abstract

Feature engineering is of critical importance in data science. While any data scientist knows that rigorous data preparation is needed to obtain well-performing models, little literature formalizes its benefits. In this work, we present Statistically Enhanced Learning (SEL), a framework that formalizes existing feature engineering and extraction tasks in Machine Learning (ML). In contrast to existing approaches, the predictors in SEL are not directly observed but are obtained as statistical estimators. Our goal is to study SEL by establishing a formalized framework and illustrating its improved performance through simulations and applications to practical use cases.
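
The idea that a predictor can be a statistical estimate rather than a directly observed value may be easier to see in a small example. The following is a minimal sketch under stated assumptions, not the authors' implementation: the simulated data, the choice of the sample mean as estimator, and the helper names `estimated_feature` and `fit_simple_ols` are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def estimated_feature(samples):
    """Sample mean of a unit's raw observations, used here as the predictor.

    Assumption: the sample mean stands in for whatever estimator one would
    actually choose; the abstract does not prescribe a specific one.
    """
    return statistics.fmean(samples)


def fit_simple_ols(x, y):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed so the simulation is reproducible

# Hypothetical setup: each unit has a latent quantity that is never observed.
latent = [rng.gauss(0.0, 1.0) for _ in range(200)]

# What is observed instead: 30 noisy raw measurements per unit.
history = [[mu + rng.gauss(0.0, 1.0) for _ in range(30)] for mu in latent]

# The target depends on the latent quantity, not on any single raw measurement.
y = [2.0 * mu + rng.gauss(0.0, 0.5) for mu in latent]

# The predictor is obtained as a statistical estimate from the raw data.
x_hat = [estimated_feature(h) for h in history]

slope, intercept = fit_simple_ols(x_hat, y)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

In this toy setting the estimated covariate summarizes many raw measurements into a single predictor, and the fitted slope should land near the true coefficient of 2, slightly attenuated by the estimation noise in `x_hat`.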

Keywords: Feature extraction; Machine learning; Statistics.