Massively parallel homogeneous amplification of chip-scale DNA for DNA information storage (MPHAC-DIS)

Nat Commun. 2025 Jan 14;16(1):667. doi: 10.1038/s41467-025-55986-9.

Abstract

Chip-scale DNA synthesis offers a high-throughput and cost-effective method for large-scale DNA-based information storage. Nevertheless, unbiased information retrieval from low-copy-number sequences remains a barrier that arises largely from the indispensable DNA amplification step. Here, we devise a simulation-guided quantitative primer-template hybridization strategy to realize massively parallel homogeneous amplification of chip-scale DNA for DNA information storage (MPHAC-DIS). Using a fixed-energy primer design, we demonstrate the unbiasedness of MPHAC for amplifying 100,000-plex sequences. Simulations reveal that MPHAC achieves a fold-80 value of 1.0, compared to 3.2 with conventional fixed-length primers, lowering costs by up to four orders of magnitude through reduced over-sequencing. The MPHAC-DIS system, using 35,406 encoded oligonucleotides, allows simultaneous access to multimedia files including text, images, and videos with high decoding accuracy at very low sequencing depths. Specifically, even at ~1× sequencing depth, combining MPHAC-DIS with machine learning yields an acceptable decoding accuracy of ~80%. The programmable and predictable MPHAC-DIS method thus opens a new door for DNA-based large-scale data storage with potential industrial applications.
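
The abstract does not define the fold-80 metric. As a rough illustration, the sketch below assumes the convention common in sequencing quality control: mean read count divided by the 20th-percentile read count, so that 1.0 indicates perfectly uniform coverage and larger values indicate that more over-sequencing is needed to recover the rare sequences. The function name fold_80, the nearest-rank percentile, and the example counts are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def fold_80(read_counts):
    """Assumed definition: mean read count / 20th-percentile read count.

    This follows a common sequencing-QC convention; the paper's exact
    definition may differ.
    """
    if not read_counts:
        raise ValueError("read_counts must not be empty")

    ordered = sorted(read_counts)
    n = len(ordered)

    # Nearest-rank 20th percentile: rank = ceil(0.20 * n), in integer arithmetic.
    rank = max(1, (n * 20 + 99) // 100)
    p20 = ordered[rank - 1]

    if p20 == 0:
        # At least 20% of sequences were never observed (dropout).
        return float("inf")
    return mean(ordered) / p20


# Hypothetical read counts per oligonucleotide after amplification.
uniform_pool = [10] * 100             # every sequence equally represented
skewed_pool = list(range(1, 11))      # counts 1..10, visibly biased

print(fold_80(uniform_pool))
print(fold_80(skewed_pool))
```

Under this assumed definition, a value of 1.0 means no additional sequencing is required to bring the bottom 20% of sequences up to the mean, which is the sense in which a lower fold-80 reduces over-sequencing cost.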

MeSH terms

  • DNA Primers / genetics
  • DNA* / chemistry
  • DNA* / genetics
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Information Storage and Retrieval / methods
  • Machine Learning
  • Nucleic Acid Amplification Techniques / methods
  • Oligonucleotide Array Sequence Analysis / methods
  • Sequence Analysis, DNA / methods

Substances

  • DNA
  • DNA Primers