Insights on bias and information in group-level studies

Biostatistics. 2003 Apr;4(2):265-78. doi: 10.1093/biostatistics/4.2.265.

Abstract

Ecological and aggregate data studies are examples of group-level studies. Even though the link between the predictors and outcomes is not preserved in these studies, inference about individual-level exposure effects is often a goal. The disconnection between the level of inference and the level of analysis expands the array of potential biases that can invalidate the inference from group-level studies. While several sources of bias, specifically those due to measurement error and confounding, may be more complex in group-level studies, two sources of bias, cross-level bias and model specification bias, are a direct consequence of the disconnection. With the goal of aligning inference from individual- versus group-level studies, I discuss the interplay between exposure and study design. I specify the additional assumptions necessary for valid inference, specifically that the between- and within-group exposure effects are equal; only then is cross-level inference possible. However, all the information in the group-level analysis comes from between-group comparisons. Models where the group-level analysis provides even a small percentage of information about the within-group exposure effect are most susceptible to model specification bias. Model specification bias can be even more serious when the group-level model is not derived from an individual-level model.
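
The distinction between between-group and within-group exposure effects can be illustrated with a small simulation. The sketch below is not the model used in the article: it assumes a simple linear individual-level outcome model with normally distributed exposure and noise, and the names `simulate`, `ols_slope`, `beta_within`, and `beta_between` are hypothetical, chosen only for this illustration. It regresses group mean outcomes on group mean exposures (the group-level analysis) and, separately, regresses group-centered outcomes on group-centered exposures (a within-group analysis that requires individual-level data).

```python
import random


def ols_slope(xs, ys):
    """Slope of the simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(beta_within, beta_between, n_groups=50, n_per_group=200, seed=0):
    """Return (group-level slope, within-group slope) for one simulated study.

    Assumed (illustrative) individual-level model:
        y = beta_between * mu_g + beta_within * (x - mu_g) + noise
    where mu_g is the true mean exposure of group g.
    """
    rng = random.Random(seed)
    group_x, group_y = [], []      # group means: all an ecological study sees
    centered_x, centered_y = [], []  # deviations from group means

    for _ in range(n_groups):
        mu = rng.gauss(0, 1)  # true group mean exposure
        xs = [mu + rng.gauss(0, 1) for _ in range(n_per_group)]
        ys = [beta_between * mu + beta_within * (x - mu) + rng.gauss(0, 1)
              for x in xs]
        x_bar = sum(xs) / n_per_group
        y_bar = sum(ys) / n_per_group
        group_x.append(x_bar)
        group_y.append(y_bar)
        centered_x.extend(x - x_bar for x in xs)
        centered_y.extend(y - y_bar for y in ys)

    return ols_slope(group_x, group_y), ols_slope(centered_x, centered_y)


if __name__ == "__main__":
    # Unequal effects: the group-level slope tracks the between-group effect.
    print("unequal effects:", simulate(beta_within=0.5, beta_between=2.0))
    # Equal effects: both analyses target the same quantity.
    print("equal effects:  ", simulate(beta_within=1.0, beta_between=1.0))
```

Under these assumptions, the group-level slope stays close to `beta_between` whatever the value of `beta_within`, so it supports inference about the individual-level effect only when the two are equal. With large groups, the sample group means carry almost no within-group variation, which is one way to see why the group-level analysis draws its information from between-group comparisons.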

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Analysis of Variance
  • Bias
  • Causality
  • Computer Simulation
  • Data Collection / statistics & numerical data
  • Dietary Fats / adverse effects
  • Environmental Exposure
  • Epidemiologic Research Design*
  • Humans
  • Models, Statistical*
  • Neoplasms / epidemiology
  • Risk
  • Statistical Distributions

Substances

  • Dietary Fats