Two-stage stratified sampling is a complex design that involves nested sampling units and stratification. This complexity increases when the strata have too few sampled units for variance estimation, necessitating the use of collapsed strata, where multiple strata are combined to ensure an adequate sample size. When collapsing strata, two cases can be distinguished depending on whether a size variable associated with the variable of interest is available at the stratum level.
• We present computer-implementable formulas for total, mean, and ratio estimators, along with their corresponding sampling variance estimators, for stratified two-stage simple random sampling without replacement, and we provide ready-to-use algorithms.
• We introduce two methods for grouping strata: (1) a deterministic approach that uses stratum codes to define an ordinal variable, which orders the strata, and (2) a stochastic method that aims to minimize within-group inertia, which measures the heterogeneity within the newly formed groups of strata.
• We emphasize that, unlike the correlation between a size variable and the variable of interest at the stratum level, the bias of the sampling variance estimator for the collapsed strata technique is not invariant to linear transformations. It follows that a high correlation does not ensure a low-bias estimator of the sampling variance.
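The last point can be illustrated numerically. The sketch below is not taken from the paper: it assumes the size-adjusted collapsed strata form commonly attributed to Hansen, Hurwitz and Madow, v = L/(L-1) · Σ_h (T̂_h − (A_h/A_g) T̂_g)², applied to a single group of L strata, and the function names `pearson` and `collapsed_variance` as well as all numbers are hypothetical. It shows that adding a constant to the size variable leaves the correlation with the stratum totals unchanged while changing the value of the variance estimator.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def collapsed_variance(totals, sizes):
    """Size-adjusted collapsed strata variance estimator for ONE group.

    Assumed form (not necessarily the paper's exact formula):
    L/(L-1) * sum_h (T_h - (A_h/A_g) * T_g)^2, where T_h are the
    estimated stratum totals and A_h the stratum size measures.
    """
    L = len(totals)
    t_g = sum(totals)  # estimated group total
    a_g = sum(sizes)   # group size measure
    return L / (L - 1) * sum(
        (t - a * t_g / a_g) ** 2 for t, a in zip(totals, sizes)
    )

# Hypothetical stratum data: totals exactly proportional to the sizes.
sizes = [10.0, 20.0, 30.0]
totals = [2.0 * a for a in sizes]

# Linear transformation of the size variable: add a constant.
shifted = [a + 100.0 for a in sizes]

r_original = pearson(sizes, totals)
r_shifted = pearson(shifted, totals)
v_original = collapsed_variance(totals, sizes)
v_shifted = collapsed_variance(totals, shifted)

print(r_original, r_shifted)  # the two correlations coincide
print(v_original, v_shifted)  # the two estimator values differ
```

In this constructed case the totals are proportional to the original sizes, so every deviation term vanishes. After the shift the correlation is still perfect, but the shares A_h/A_g no longer match the shares of the totals, and the estimator becomes strictly positive.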
Keywords: Collapsed strata; Combinatorial optimization; Computation in stratified two-stage simple random sampling without replacement, with possible collapsed strata; Expansion estimator; Grouping strata; Size variable; Stratified two-stage sampling; Woodruff’s method.
© 2024 The Author(s).