Background: Chronic diseases are becoming a critical challenge for the aging Chinese population. Biobanks with extensive genomic and environmental data offer opportunities to elucidate the complex gene-environment interactions underlying their aetiology. Genome-wide genotyping arrays remain an efficient approach for large-scale genomic data collection. However, most commercial arrays show reduced performance when used for biobanking in the Chinese population.
Materials and methods: Deep whole-genome sequencing data from 2 641 Chinese individuals were used as a reference to develop the CAS array, a custom-designed genotyping array for precision medicine. The array was evaluated by comparing data from 384 individuals assayed by both the array and whole-genome sequencing. Its capacity to estimate mitochondrial copy number was validated by examining the association of its estimates with established covariates among 10 162 elderly Chinese individuals.
Results: The CAS array adopts the proven Axiom technology and is restricted to 652 429 single-nucleotide polymorphism (SNP) markers. Its call rate of 99.79% and concordance rate of 99.89% are both higher than those of commercial arrays. Its imputation-based genome coverage reached 98.3% for common SNPs and 63.0% for low-frequency SNPs, both comparable to commercial arrays with larger SNP capacity. After validating its mitochondrial copy number estimates, we developed a publicly available software tool to facilitate use of the array.
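For readers unfamiliar with the two quality metrics reported above, the following is a minimal sketch of how a call rate and a concordance rate are commonly computed when the same samples are genotyped on an array and by whole-genome sequencing. The abstract does not give the authors' exact definitions, so the genotype encoding (alternate-allele dosage 0, 1 or 2, with `None` for a no-call), the function names, and the toy data are all illustrative assumptions rather than the study's actual pipeline.

```python
from typing import Optional, Sequence

# Assumed encoding: 0, 1, 2 = alternate-allele dosage; None = no call.
Genotype = Optional[int]


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of attempted genotypes that produced a call."""
    if not calls:
        raise ValueError("no genotypes supplied")
    return sum(g is not None for g in calls) / len(calls)


def concordance_rate(array_calls: Sequence[Genotype],
                     wgs_calls: Sequence[Genotype]) -> float:
    """Fraction of identical genotypes among sites called on both platforms."""
    if len(array_calls) != len(wgs_calls):
        raise ValueError("genotype vectors must have the same length")
    # Only sites with a call on both platforms are comparable.
    pairs = [(a, w) for a, w in zip(array_calls, wgs_calls)
             if a is not None and w is not None]
    if not pairs:
        raise ValueError("no sites were called on both platforms")
    return sum(a == w for a, w in pairs) / len(pairs)


# Toy data for one hypothetical sample at eight SNPs (not from the study).
array_calls = [0, 1, 2, None, 1, 0, 2, 1]
wgs_calls = [0, 1, 2, 1, 1, 0, 1, 1]

print(f"call rate:   {call_rate(array_calls):.3f}")
print(f"concordance: {concordance_rate(array_calls, wgs_calls):.3f}")
```

In this sketch, no-calls lower the call rate but are excluded from the concordance denominator, which is one common convention; other definitions are possible.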
Conclusion: Based on recent advances in genomic science, we designed and implemented a high-throughput and low-cost genotyping array. It is more cost-effective than commercial arrays for large-scale Chinese biobanking.
Keywords: SNP array; chronic disease; genotyping; mitochondrial copy number; precision medicine; single-nucleotide polymorphism (SNP).
© The Author(s) 2023. Published by Oxford University Press on behalf of the West China School of Medicine & West China Hospital of Sichuan University.