Detection of genotyping errors is a necessary step to minimize false results in genetic analysis. This is especially important when the rate of genotyping errors is high, as has been reported for high-throughput sequence data. To detect genotyping errors in pedigrees, Mendelian inconsistent (MI) error checks exist, as do multi-point methods that flag Mendelian consistent (MC) errors for sparse multi-allelic markers. However, few methods exist for detecting MC genotyping errors, particularly for dense variants on large pedigrees. Here, we introduce an efficient method to detect MC errors even for very dense variants (e.g., SNPs and sequencing data) on pedigrees that may be large. Our method first samples inheritance vectors (IVs) using a moderately sparse but informative set of markers using a Markov chain Monte Carlo-based sampler. Using sampled IVs, we considered two test statistics to detect MC genotyping errors: the percentage of IVs inconsistent with observed genotypes (A1) or the posterior probability of error configurations (A2). Using simulations, we show that this method, even with the simpler A1 statistic, is effective for detecting MC genotyping errors in dense variants, with sensitivity almost as high as the theoretical best sensitivity possible. We also evaluate the effectiveness of this method as a function of parameters, when including the observed pattern for genotype, density of framework markers, error rate, allele frequencies, and number of sampled inheritance vectors. Our approach provides a line of defense against false findings based on the use of dense variants in pedigrees.
Keywords: computer program; data cleaning; exome; genome scan; high-throughput.
© 2014 WILEY PERIODICALS, INC.