Chip-scale DNA synthesis offers a high-throughput and cost-effective method for large-scale DNA-based information storage. Nevertheless, unbiased information retrieval from low-copy-number sequences remains a barrier, one that arises largely from the indispensable DNA amplification step. Here, we devise a simulation-guided quantitative primer-template hybridization strategy to realize massively parallel homogeneous amplification of chip-scale DNA for DNA information storage (MPHAC-DIS). Using a fixed-energy primer design, we demonstrate that MPHAC amplifies 100,000-plex sequences without bias. Simulations reveal that MPHAC achieves a fold-80 value of 1.0, compared to 3.2 with conventional fixed-length primers, lowering costs by up to four orders of magnitude through reduced over-sequencing. The MPHAC-DIS system, using 35,406 encoded oligonucleotides, allows simultaneous access to multimedia files including text, images, and videos with high decoding accuracy at very low sequencing depths. Specifically, even at ~1× sequencing depth, decoding combined with machine learning reaches an acceptable accuracy of ~80%. The programmable and predictable MPHAC-DIS method thus opens a new door for DNA-based large-scale data storage with potential industrial applications.
© 2025. The Author(s).
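The abstract reports uniformity as a fold-80 value without defining it. As a minimal sketch, assuming the convention common in sequencing quality control (mean coverage divided by the 20th-percentile coverage, so that 1.0 means perfectly even representation), the metric can be computed as below. The function name `fold_80`, the nearest-rank percentile, and the synthetic read counts are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def fold_80(read_counts):
    """Return mean coverage divided by the 20th-percentile coverage.

    Assumed definition (common sequencing-QC convention); the paper's exact
    formula may differ. A value of 1.0 indicates perfectly uniform coverage.
    """
    counts = sorted(read_counts)
    if not counts:
        raise ValueError("read_counts must not be empty")
    # Nearest-rank 20th percentile: the smallest value with at least 20%
    # of the sequences at or below it.
    rank = max(1, math.ceil(0.2 * len(counts)))
    p20 = counts[rank - 1]
    if p20 <= 0:
        raise ValueError("20th-percentile coverage is zero; fold-80 is undefined")
    mean = sum(counts) / len(counts)
    return mean / p20


# Synthetic, illustrative data only (not results from the paper).
uniform_counts = [50] * 1000                      # every sequence read 50 times
rng = random.Random(0)
skewed_counts = [rng.lognormvariate(3.0, 1.0) for _ in range(1000)]

print(fold_80(uniform_counts))   # 1.0 for perfectly even coverage
print(fold_80(skewed_counts))    # greater than 1 for a skewed distribution
```

Under this definition, a larger fold-80 value means more over-sequencing is needed to bring the under-represented sequences up to the mean coverage, which is the link between uniformity and sequencing cost that the abstract draws.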