Multi-block Analysis of Genomic Data Using Generalized Canonical Correlation Analysis

Genomics Inform. 2018 Dec;16(4):e33. doi: 10.5808/GI.2018.16.4.e33. Epub 2018 Dec 28.

Abstract

Recently, there have been many studies in medicine related to genetic analysis, and many of them have sought genes associated with complex diseases. To find out how genes are related to disease, we need to understand not only the simple relationship of genotypes to disease but also the way genotypes are related to phenotypes. Multi-block data, in which several sets of variables (blocks) are combined, are used to enhance the analysis of relationships between different blocks. Identifying relationships through the multi-block form helps us understand the associations and correlations between the blocks. Several statistical methods have been developed to understand the relationships within multi-block data. In this paper, we use generalized canonical correlation analysis to analyze multi-block data from the Korean Association Resource (KARE) project, which combine SNP blocks, phenotype blocks, and a disease block.

Keywords: Multi-block analysis; generalized canonical correlation analysis; genome-wide association study.