Composite DNA letters, formed by mixing all four DNA nucleotides in specified ratios, offer a way to substantially increase the logical density of DNA digital storage (DDS) systems. However, these letters are susceptible to nucleotide errors and sampling bias, which lead to a high letter error rate that complicates accurate data retrieval and raises reading costs. To address this, Derrick-cp is introduced as a soft-decision decoding algorithm tailored to DDS with composite letters. Derrick-cp exploits the differing error sensitivities of the letters to predict and correct letter errors, extending the error-correcting performance of Reed-Solomon codes beyond the limits of traditional hard-decision decoding. Comparative analyses on an existing dataset and in simulated experiments show that, relative to conventional hard-decision strategies, Derrick-cp halves the required sequencing depth and reduces costs by up to 22%. These results indicate that Derrick-cp can improve both the accuracy and the cost-efficiency of composite letter-based DDS.
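As a rough illustration of why sampling bias makes composite letters error-prone at low sequencing depth, the following Python sketch simulates reads of a single composite position and calls the nearest letter. It is not the Derrick-cp algorithm: the six-letter alphabet, the function names (`sample_counts`, `call_letter`, `letter_error_rate`), the substitution rate, and the squared-distance rule are all assumptions made for illustration. The margin returned by `call_letter` is one simple example of the kind of per-letter reliability information a soft-decision decoder could use.

```python
import random

# Hypothetical six-letter composite alphabet (an assumption, not the paper's):
# each letter is a mixing ratio over the nucleotides A, C, G, T.
ALPHABET = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "T": (0.0, 0.0, 0.0, 1.0),
    "M": (0.5, 0.5, 0.0, 0.0),  # equal mix of A and C
    "K": (0.0, 0.0, 0.5, 0.5),  # equal mix of G and T
}


def sample_counts(letter, depth, sub_rate, rng):
    """Draw `depth` reads of one composite position and count each nucleotide."""
    counts = [0, 0, 0, 0]
    for _ in range(depth):
        base = rng.choices(range(4), weights=ALPHABET[letter])[0]
        if rng.random() < sub_rate:
            # Substitution error: replace the base with one of the other three.
            base = rng.choice([b for b in range(4) if b != base])
        counts[base] += 1
    return counts


def call_letter(counts):
    """Return the nearest letter and a reliability margin.

    The margin is the gap between the two smallest squared distances;
    a small margin marks an ambiguous, less reliable call.
    Assumes at least one read (sum(counts) > 0).
    """
    total = sum(counts)
    freqs = [c / total for c in counts]
    dists = sorted(
        (sum((f - r) ** 2 for f, r in zip(freqs, ratio)), name)
        for name, ratio in ALPHABET.items()
    )
    (best_d, best), (second_d, _) = dists[0], dists[1]
    return best, second_d - best_d


def letter_error_rate(letter, depth, sub_rate=0.01, trials=2000, seed=0):
    """Fraction of simulated trials in which the hard decision is wrong."""
    rng = random.Random(seed)
    errors = sum(
        call_letter(sample_counts(letter, depth, sub_rate, rng))[0] != letter
        for _ in range(trials)
    )
    return errors / trials
```

Calling `letter_error_rate("M", depth)` for a few depths (for example 5, 10, and 50) gives a feel for how the error rate of a mixed letter depends on the number of reads, while a pure letter such as `"A"` is affected mainly by substitution errors.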
Keywords: DNA digital storage (DDS); composite DNA letter; error‐correcting code (ECC); soft‐decision decoding.
© 2024 The Author(s). Advanced Science published by Wiley‐VCH GmbH.