On the Use of Minimum Penalties in Statistical Learning

J Comput Graph Stat. 2024;33(1):138-151. doi: 10.1080/10618600.2023.2210174. Epub 2023 Jun 20.

Abstract

Modern multivariate machine learning and statistical methodologies estimate parameters of interest while leveraging prior knowledge of the association between outcome variables. The methods that do allow for estimation of relationships typically do so through an error covariance matrix in multivariate regression, an approach that does not generalize to other types of models. In this article we propose the MinPen framework to simultaneously estimate the regression coefficients associated with the multivariate regression model and the relationships between outcome variables using common assumptions. The MinPen framework utilizes a novel penalty based on the minimum function to simultaneously detect and exploit relationships between responses. An iterative algorithm is proposed as a solution to the non-convex optimization problem. Theoretical results such as high-dimensional convergence rates, model selection consistency, and a framework for post-selection inference are provided. We extend the proposed MinPen framework to other exponential family loss functions, with a specific focus on multiple binomial responses. Tuning parameter selection is also addressed. Finally, simulations and two data examples are presented to show the finite-sample properties of this framework. Supplemental materials providing proofs, additional simulations, code, and data sets are available online.

Keywords: Graph Constrained Models; High Dimensional Convergence; Non-Convex Optimization; Post-Selection Inference; Selection Consistency.