Model-based approaches to synthesize microarray data: a unifying review using mixture of SEMs

Stat Methods Med Res. 2013 Dec;22(6):567-82. doi: 10.1177/0962280211419482. Epub 2011 Sep 25.

Abstract

Several statistical methods are now available for the analysis of gene expression data recorded through microarray technology. In this article, we take a closer look at several Gaussian mixture models that have recently been proposed to model gene expression data. It can be shown that these are special cases of a more general model, called the mixture of structural equation models (mixture of SEMs), which was developed in psychometrics. This model combines mixture modelling and SEMs by assuming that component-specific means and variances are subject to an SEM. The connection with SEM is useful for at least two reasons: (1) it makes the basic assumptions of existing methods more explicit and (2) it facilitates the straightforward development of alternative mixture models for gene expression data with alternative mean/covariance structures. Different specifications of mixture of SEMs for clustering gene expression data are illustrated using two benchmark datasets.
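
For orientation, the general structure described above can be sketched in a common textbook form. This is a sketch under assumed notation, not necessarily the parameterization used in the article: the symbols K, \pi_k, \nu_k, \Lambda_k, B_k, \alpha_k, \Psi_k and \Theta_k are introduced here for illustration only. Each mixture component is Gaussian, and its mean vector and covariance matrix are those implied by a component-specific SEM.

```latex
% Assumed notation (illustrative, not taken from the article):
% K components, mixing weights \pi_k, observed expression vector y_i,
% latent variables \eta_i, Cov(\zeta_i) = \Psi_k, Cov(\varepsilon_i) = \Theta_k.
\begin{align}
  f(\mathbf{y}_i) &= \sum_{k=1}^{K} \pi_k \,
      \mathcal{N}\!\left(\mathbf{y}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
      \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1, \\
  % Measurement and structural equations within component k
  \mathbf{y}_i &= \boldsymbol{\nu}_k + \boldsymbol{\Lambda}_k \boldsymbol{\eta}_i
      + \boldsymbol{\varepsilon}_i,
      \qquad
      \boldsymbol{\eta}_i = \boldsymbol{\alpha}_k + \mathbf{B}_k \boldsymbol{\eta}_i
      + \boldsymbol{\zeta}_i, \\
  % Implied component mean and covariance
  \boldsymbol{\mu}_k &= \boldsymbol{\nu}_k
      + \boldsymbol{\Lambda}_k (\mathbf{I} - \mathbf{B}_k)^{-1} \boldsymbol{\alpha}_k, \\
  \boldsymbol{\Sigma}_k &= \boldsymbol{\Lambda}_k (\mathbf{I} - \mathbf{B}_k)^{-1}
      \boldsymbol{\Psi}_k (\mathbf{I} - \mathbf{B}_k)^{-\top}
      \boldsymbol{\Lambda}_k^{\top} + \boldsymbol{\Theta}_k .
\end{align}
```

Under this sketch, setting \mathbf{B}_k = \mathbf{0} with diagonal \boldsymbol{\Theta}_k gives a mixture of factor analyzers, which illustrates how restricting the SEM parameters yields more specific Gaussian mixture models.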

Keywords: biclustering; correlated data; microarray data; mixture of SEMs; simultaneous clustering and dimensional reduction.

Publication types

  • Review

MeSH terms

  • Cluster Analysis
  • Models, Theoretical*
  • Multigene Family
  • Oligonucleotide Array Sequence Analysis*