Could routine forensic STR genotyping data leak personal phenotypic information?

Forensic Sci Int. 2022 Jun:335:111311. doi: 10.1016/j.forsciint.2022.111311. Epub 2022 Apr 18.

Abstract

The application of forensic genetic markers must comply with privacy rights and legal policies, on the premise that the markers do not expose phenotypic information. The most widely used markers, short tandem repeats (STRs), are generally viewed as 'junk' DNA: most STRs are located in non-coding regions and are therefore assumed not to reveal phenotypic traits. But with a deepening understanding of phenotypes and their underlying genetic structure, whether STRs could reflect any phenotypic information may need re-examining. We therefore performed the following analyses. First, we analyzed the association between 15 STRs and three facial characteristics (single or double eyelid, with or without epicanthus, unattached or attached earlobe) in 721 unrelated Han Chinese individuals. Then, we collected STR and geographic data for 27199 individuals from the literature to investigate the association between STRs and bio-geographic information, and used STRs to predict geographic information for an additional 1993 unrelated individuals. We found scarcely any association between the STRs and the studied facial characteristics. Although allele 19 in D2S1338 and allele 18 in FGA showed statistical significance (P = 0.0032 and P = 0.0030, respectively, after Bonferroni correction), the prediction effectiveness was very low. For the STRs and bio-geographic information, principal component analysis showed that the first three components could explain 87.7% of the variance, but the prediction accuracy reached only 25.2%. We conclude that, because forensic phenotypes are usually complex traits, it is hardly possible to uncover phenotypic information by testing only dozens of STR loci.
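
As an illustration of the kind of per-allele association test with Bonferroni correction mentioned above, here is a minimal sketch. The abstract does not state which test was used, so the Pearson chi-square test on a 2x2 carrier-by-trait table is an assumption, and the function names, allele labels, and all counts are hypothetical placeholders rather than data from the study.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows = allele carriers / non-carriers,
    columns = trait present / trait absent.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts, NOT taken from the study.
tables = {
    "locusA_allele_x": (30, 70, 90, 110),
    "locusB_allele_y": (50, 50, 100, 100),
    "locusC_allele_z": (45, 55, 80, 120),
}

raw = {name: chi2_2x2(*counts)[1] for name, counts in tables.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

for name in tables:
    print(f"{name}: raw p = {raw[name]:.4f}, Bonferroni p = {adjusted[name]:.4f}")
```

A significant adjusted p-value only indicates an association; as the abstract notes, it does not by itself imply useful predictive power for the trait.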

Keywords: Association analysis; Forensic phenotype; Phenotype prediction; Short tandem repeat.

MeSH terms

  • Asian People
  • DNA Fingerprinting
  • Forensic Genetics*
  • Gene Frequency
  • Genetics, Population
  • Genotype
  • Humans
  • Microsatellite Repeats*
  • Phenotype