Pattern-selection based power analysis and discrimination of low- and high-grade myelodysplastic syndromes study using SNP arrays

PLoS One. 2009;4(4):e5054. doi: 10.1371/journal.pone.0005054. Epub 2009 Apr 8.

Abstract

Copy Number Aberration (CNA) analysis in myelodysplastic syndromes (MDS) using single nucleotide polymorphism (SNP) arrays has received increasing attention in recent years. In the current study, a new Constraint Moving Average (CMA) algorithm is first adopted to determine the regions of CNA. In addition to large regions of CNA, the proposed CMA algorithm can also detect small regions of CNA. Real-time Polymerase Chain Reaction (qPCR) results confirm that the CMA algorithm identifies both large and subtle regions. Based on the results of CMA, two independent applications are studied. The first is power analysis for sample size estimation. An accurate estimate of the sample size needed for the desired purpose of an experiment is important for efficiency and cost-effectiveness. The power analysis is performed to determine the minimum sample size required to ensure that at least (0<lambda <or=) detected regions are statistically different from normal references. As expected, power increases with increasing sample size for a fixed significance level. The second application is the discrimination of high-grade MDS patients from low-grade ones. We propose the General Variant Level (GVL) score, which integrates the general information of each patient at the genotype level, as a unified measurement for classification. Traditional MDS classifications usually refer to cell morphology and the International Prognostic Scoring System (IPSS), which classify at the phenotype level. The proposed GVL score integrates the CNA regions, the number of abnormal chromosomes, and the total number of altered SNPs at the genotype level. Statistical tests indicate that high- and low-grade MDS patients can be well separated by the GVL score, which appears to correlate better with clinical outcome than the traditional classification approaches using morphology and IPSS score at the phenotype level.
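
The relationship between power and sample size at a fixed significance level can be illustrated with a minimal sketch. This is not the pattern-selection power analysis of the study, whose details are not given in the abstract. It assumes a generic one-sided, one-sample z-test with a known standardized effect size. The function names `z_test_power` and `min_sample_size` and all parameter values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_STD_NORMAL = NormalDist()


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided one-sample z-test (illustrative assumption).

    effect_size is the standardized mean shift (difference / standard
    deviation); n is the number of samples; alpha is the significance level.
    """
    if effect_size <= 0 or n <= 0:
        raise ValueError("effect_size and n must be positive")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return _STD_NORMAL.cdf(effect_size * sqrt(n) - z_crit)


def min_sample_size(effect_size, target_power=0.8, alpha=0.05):
    """Smallest n whose power reaches target_power under the same z-test."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(target_power)
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Power rises with sample size while alpha stays fixed at 0.05.
    for n in (5, 10, 20, 40):
        print(n, round(z_test_power(0.5, n), 3))
    print("minimum n:", min_sample_size(0.5))
```

Under these assumptions, the computed power grows monotonically with the sample size, and `min_sample_size` returns the first sample size at which the target power is reached.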

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Gene Dosage*
  • Humans
  • Myelodysplastic Syndromes / genetics
  • Myelodysplastic Syndromes / pathology*
  • Polymerase Chain Reaction
  • Polymorphism, Single Nucleotide*