A highly polymorphic panel of 40-plex microhaplotypes for the Chinese Han population and its application in estimating the number of contributors in DNA mixtures

Forensic Sci Int Genet. 2022 Jan:56:102600. doi: 10.1016/j.fsigen.2021.102600. Epub 2021 Oct 8.

Abstract

Microhaplotypes (MHs) have great potential in multiple forensic applications and have proven to be promising markers in complex DNA mixture analysis. In this study, we developed a multiplex panel of 40 highly polymorphic MHs for the Chinese Han population, evaluated its forensic value, and explored its application in predicting the number of contributors (NOC) in DNA mixtures. The panel consisted of 20 newly proposed loci and 20 previously reported loci with lengths spanning less than 120 bp. The average effective number of alleles (Ae) was 3.77, and the cumulative matching probability (CMP) and the cumulative power of exclusion (CPE) reached 1.2E-37 and 1-2.1E-12, respectively, in the Chinese Han population from the 1000 Genomes Project. Further validation on 150 Chinese Han individuals showed that Ae ranged from 2.62 to 4.41 with a mean value of 3.61, and that CMP and CPE were 3.61E-36 and 1-1.84E-12, respectively, indicating that this panel is informative for personal identification and paternity testing in the studied population. To estimate NOC in DNA mixtures, we developed a machine learning model based on this panel. Its accuracy on artificial DNA mixtures reached 95.24% for 2- to 4-person mixtures and 83.33% for 2- to 6-person mixtures. Furthermore, NOC estimation on simulated profiles with allele dropout showed that the panel remained robust under slight dropout. In conclusion, this panel has value for forensic identification and for NOC estimation in DNA mixtures.
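
The statistics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions: Ae = 1 / sum(p_i^2) per locus, a per-locus match probability under an assumed Hardy-Weinberg equilibrium, and CMP as the product over loci, which assumes the loci are independent. The abstract does not describe the machine learning model, so the last function is not the authors' method; it is the simple maximum allele count rule, which gives only a lower bound on NOC. All function names and allele frequencies below are hypothetical and chosen for illustration.

```python
from math import ceil, prod


def effective_number_of_alleles(freqs):
    """Ae = 1 / sum(p_i^2) for one locus (standard definition)."""
    return 1.0 / sum(p * p for p in freqs)


def match_probability(freqs):
    """Probability that two random individuals share a genotype at one locus.

    Assumes Hardy-Weinberg equilibrium: sum of squared genotype
    frequencies, which simplifies to 2 * (sum p^2)^2 - sum p^4.
    """
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 2.0 * s2 ** 2 - s4


def cumulative_match_probability(loci):
    """CMP as the product of per-locus values (assumes independent loci)."""
    return prod(match_probability(freqs) for freqs in loci)


def min_contributors_by_allele_count(allele_counts):
    """Maximum allele count rule: a lower bound on NOC.

    Each diploid contributor carries at most two alleles per locus, so a
    locus showing n distinct alleles needs at least ceil(n / 2) people.
    This is a baseline, NOT the machine learning model from the study.
    """
    return ceil(max(allele_counts) / 2)


if __name__ == "__main__":
    # Hypothetical allele frequencies for two toy loci (not study data).
    toy_loci = [
        [0.25, 0.25, 0.25, 0.25],
        [0.40, 0.30, 0.20, 0.10],
    ]
    for freqs in toy_loci:
        print("Ae:", effective_number_of_alleles(freqs))
    print("CMP:", cumulative_match_probability(toy_loci))

    # Hypothetical distinct-allele counts observed per locus in a mixture.
    print("NOC lower bound:", min_contributors_by_allele_count([3, 5, 4]))
```

A locus with four equally frequent alleles gives Ae = 4, which is why a panel mean of about 3.6 to 3.8 indicates loci that each behave roughly like four balanced alleles. The allele count rule tends to underestimate NOC when contributors share alleles or when dropout occurs, which is one motivation for model-based estimators.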

Keywords: Forensic; Machine learning; Microhaplotype; Number of contributors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • China
  • DNA / genetics
  • DNA Fingerprinting*
  • Gene Frequency
  • Haplotypes
  • Humans
  • Microsatellite Repeats
  • Polymorphism, Single Nucleotide*
  • Sequence Analysis, DNA

Substances

  • DNA