A central issue in genome-wide association (GWA) studies is assessing statistical significance while adjusting for multiple hypothesis testing. An equally important question is the statistical efficiency of the GWA design compared with the traditional sequential approach, in which genome-wide linkage analysis is followed by region-wise association mapping. Nevertheless, GWA is becoming more popular, due in part to cost efficiency: commercially available 1M chips are nearly as inexpensive as a custom-designed 10K chip. It is becoming apparent, however, that most of the ongoing GWA studies with 2,000-5,000 samples are in fact underpowered. As a means to improve power, we emphasize the importance of utilizing prior information, such as the results of previous linkage studies, via stratified false discovery rate (FDR) control. The essence of stratified FDR control is to prioritize regions of the genome and maintain power to interrogate candidate regions within the GWA study. These candidate regions can be defined as, but are by no means limited to, linkage-peak regions. Furthermore, we theoretically unify the stratified FDR approach and the weighted P-value method, and we show that stratified FDR can be formulated as a robust version of weighted FDR. Finally, we demonstrate the utility of the methods in two GWA datasets: Type 2 diabetes (FUSION) and an ongoing study of long-term diabetic complications (DCCT/EDIC). The methods are implemented as a user-friendly software package, SFDR. The same stratification framework can be readily applied to other types of studies, for example, using GWA results to improve the power of sequencing data analyses.
© 2009 Wiley-Liss, Inc.
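To illustrate the stratification idea described in the abstract, the following is a minimal sketch, not the SFDR package itself. It assumes that stratified FDR control amounts to applying the Benjamini-Hochberg procedure separately within each stratum (here a hypothetical "candidate" stratum, such as markers under a linkage peak, and a "rest" stratum) and comparing the result with a pooled analysis. The function names and the P-values are invented for illustration only.

```python
from collections import defaultdict


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P-values (q-values) for one list of P-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def stratified_qvalues(pvalues, strata):
    """Apply BH separately within each stratum; q-values keep the input order."""
    groups = defaultdict(list)
    for idx, label in enumerate(strata):
        groups[label].append(idx)
    qvalues = [None] * len(pvalues)
    for indices in groups.values():
        within = bh_qvalues([pvalues[i] for i in indices])
        for i, q in zip(indices, within):
            qvalues[i] = q
    return qvalues


# Hypothetical P-values: two markers in a candidate region, six elsewhere.
pvalues = [0.004, 0.02, 0.004, 0.3, 0.5, 0.7, 0.9, 0.6]
strata = ["candidate", "candidate", "rest", "rest", "rest", "rest", "rest", "rest"]

pooled = bh_qvalues(pvalues)
stratified = stratified_qvalues(pvalues, strata)

for p, s, q_pool, q_strat in zip(pvalues, strata, pooled, stratified):
    print(f"{s:9s}  p={p:<6}  pooled q={q_pool:.4f}  stratified q={q_strat:.4f}")
```

In this toy input, the second candidate-region marker (P = 0.02) has a pooled q-value above 0.05 but a stratified q-value below it, while the strongest marker outside the candidate region receives a somewhat larger q-value under stratification. This is the kind of trade-off that prioritizing candidate regions implies; how strata should be defined for real data is a study-specific choice that this sketch does not address.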