This study was designed to reveal the genome-wide distribution of presence/absence variation (PAV) and to establish a database of polymorphic PAV markers in soybean. Thirty-three soybean whole-genome sequences were compared with one another, using that of Williams 82 as the reference genome. A total of 33,127 PAVs were detected, and 28,912 PAV markers with their primer sequences were designed to form the database NJAUSoyPAV_1.0. The PAVs were scattered across the whole genome, and only 518 (1.8%) overlapped with simple sequence repeats (SSRs) in the BARCSOYSSR_1.0 database. In a random sample of 800 PAVs, 713 (89.13%) showed polymorphism among the 12 differential genotypes. When 126 PAVs and 108 SSRs were used to test a Chinese soybean germplasm collection composed of 828 Glycine soja Sieb. et Zucc. and Glycine max (L.) Merr. accessions, the per-locus allele number and its variation were lower for PAVs than for SSRs. The alleles/bands of polymerase chain reaction (PCR) products were more distinct for PAVs than for SSRs, which suggests potential for accurate marker-assisted allele selection. The association mapping results showed that the combination SSR + PAV was more powerful than either marker system alone. The NJAUSoyPAV_1.0 database enriches the pool of PCR markers and, if used jointly with SSR markers, may suit materials with a range of per-locus allele numbers.
Keywords: Polymorphism; presence/absence variation; simple sequence repeat; soybean; whole-genome sequence.
© 2014 Institute of Botany, Chinese Academy of Sciences.