Efficient multiple change point detection for high-dimensional generalized linear models

Can J Stat. 2023 Jun;51(2):596-629. doi: 10.1002/cjs.11721. Epub 2022 Sep 16.

Abstract

Change point detection for high-dimensional data is an important yet challenging problem in many applications. In this paper, we consider multiple change point detection in the context of high-dimensional generalized linear models, allowing the covariate dimension p to grow exponentially with the sample size n. The model is general and flexible in that it covers various specific models as special cases. It automatically accounts for the underlying data-generating mechanism and requires no prior knowledge of the number of change points. Based on dynamic programming and binary segmentation techniques, we propose two algorithms to detect multiple change points, allowing the number of change points to grow with n. To further improve computational efficiency, we also propose a faster algorithm designed for the case of a single change point. We present theoretical properties of the proposed algorithms, including estimation consistency for the number and locations of change points, as well as consistency and asymptotic distributions for the underlying regression coefficients. Finally, extensive simulation studies and an application to the Alzheimer's Disease Neuroimaging Initiative data demonstrate the competitive performance of the proposed methods.

Keywords: Binary segmentation; Dynamic programming; Generalized linear models; High dimensions.
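
Illustrative sketch (not taken from the paper): the abstract names binary segmentation as one of the two building blocks for multiple change point detection. The Python code below shows the generic technique on a deliberately simplified problem, assuming a univariate sequence whose mean shifts (an intercept-only Gaussian model) with a residual-sum-of-squares loss. The paper's procedures instead work with high-dimensional generalized linear model losses; the function names, the `penalty` threshold, and the `min_len` parameter here are hypothetical choices made only for illustration.

```python
from typing import List, Tuple


def _prefix_sums(y: List[float]) -> Tuple[List[float], List[float]]:
    """Cumulative sums of y and y**2, so any segment cost is O(1)."""
    s, sq = [0.0], [0.0]
    for v in y:
        s.append(s[-1] + v)
        sq.append(sq[-1] + v * v)
    return s, sq


def _cost(s: List[float], sq: List[float], start: int, end: int) -> float:
    """Residual sum of squares of y[start:end] around its own mean."""
    n = end - start
    total = s[end] - s[start]
    return (sq[end] - sq[start]) - total * total / n


def binary_segmentation(y: List[float], penalty: float, min_len: int = 2) -> List[int]:
    """Generic binary segmentation for mean shifts (simplified stand-in).

    A segment is split at the point giving the largest loss reduction,
    but only if that reduction exceeds `penalty` (an assumed threshold).
    Each resulting half is then searched again in the same way.
    Returns sorted change point indices t, meaning y[t] starts a new regime.
    """
    s, sq = _prefix_sums(y)
    change_points: List[int] = []
    stack = [(0, len(y))]  # half-open segments [start, end) still to search
    while stack:
        start, end = stack.pop()
        if end - start < 2 * min_len:
            continue  # too short to hold two segments of length min_len
        whole = _cost(s, sq, start, end)
        best_gain, best_split = 0.0, None
        for t in range(start + min_len, end - min_len + 1):
            gain = whole - _cost(s, sq, start, t) - _cost(s, sq, t, end)
            if gain > best_gain:
                best_gain, best_split = gain, t
        if best_split is not None and best_gain > penalty:
            change_points.append(best_split)
            stack.append((start, best_split))
            stack.append((best_split, end))
    return sorted(change_points)
```

On a toy sequence of ten 0s, ten 5s, and ten 0s with `penalty=1.0`, this sketch returns `[10, 20]`. Because the number of splits is decided by the threshold, the number of change points need not be fixed in advance, which mirrors the property described in the abstract. A dynamic programming variant would instead minimize the total penalized loss over all segmentations at once, at a higher computational cost.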