High-density single-nucleotide polymorphism (SNP) microarrays provide a useful tool for the detection of copy number variants (CNVs). The analysis of such large amounts of data is complicated, especially with regard to determining where copy numbers change and their corresponding values. In this article, we propose a Bayesian multiple change-point model (BMCP) for segmentation and estimation of SNP microarray data. Segmentation concerns separating a chromosome into regions of equal copy number differences between the sample of interest and some reference, and involves the detection of locations of copy number difference changes. Estimation concerns determining true copy number for each segment. Our approach not only gives posterior estimates for the parameters of interest, namely locations for copy number difference changes and true copy number estimates, but also useful confidence measures. In addition, our algorithm can segment multiple samples simultaneously, and infer both common and rare CNVs across individuals. Finally, for studies of CNVs in tumors, we incorporate an adjustment factor for signal attenuation due to tumor heterogeneity or normal contamination that can improve copy number estimates.
© 2009, The International Biometric Society.